Simply weeks after unveiling Gemini 2.5 Professional, Google is on to its subsequent top-performing mannequin.
On Thursday, the corporate launched an “early model” of Gemini 2.5 Flash in preview within the Gemini API, AI Studio, and Vertex AI. The mannequin has a data cutoff of January 2025. It could take textual content, photographs, video, and audio prompts, and has a one-million-token context window.
Google says the brand new model expands on Flash 2.0 with improved reasoning, however “with out compromising its famend velocity or value.” Reasoning fashions spend extra time “pondering” — or decoding a question — earlier than responding, which ends up in extra thorough and direct output that, ideally, aligns higher with a person’s wants, in comparison with earlier fashions that prioritize velocity. Fashions that cause are additionally higher outfitted to precisely ship on multi-step issues or duties.
“Gemini 2.5 Flash performs strongly on Exhausting Prompts in ChatBot Area, second solely to 2.5 Professional,” Google notes within the announcement.
Referring to the brand new mannequin as its most cost-efficient, Google notes that 2.5 Flash “permits builders to configure the quantity of pondering it does to maximise efficiency.” This provides builders a “pondering price range,” or the ability to pay for reasoning solely after they want it most. With reasoning on, the output value jumps from 60 cents per a million tokens to $3.50.
If builders do not give the mannequin a price range, it determines the question’s pondering wants itself by evaluating the request for complexity. For instance, it should establish prompts with minimal reasoning wants — like “What number of states are there within the US?” — individually from multi-step math issues. Google notes that to duplicate Flash 2.0 latency and price, builders ought to set the price range to 0.
Gemini 2.5 Flash scored 12% on Humanity’s Final Examination (HLE), a brand new, various benchmark to trade assessments which have develop into too straightforward for quickly evolving fashions. This rating outperformed competitor fashions, together with Claude 3.7 Sonnet and DeepSeek R1, however not OpenAI’s just-launched o4-mini, which got here in at 14% on the take a look at.
You’ll be able to strive Gemini 2.5 Flash in preview by way of the Gemini API in Google AI Studio and Vertex AI.
Need extra tales about AI? Join Innovation, our weekly e-newsletter.