Google just launched Gemini 3.5. The first model in the family, Gemini 3.5 Flash, went live on May 19, 2026 at Google I/O 2026, and it does something Google has never done before. The cheap, fast “Flash” tier now beats the previous flagship. Gemini 3.5 Flash outscores Gemini 3.1 Pro on Terminal-Bench 2.1 (76.2%), MCP Atlas (83.6%) y GDPval-AA (1,656 Elo), runs roughly 4x faster in output tokens per second, and costs about 25% less. It is already the default model in the Gemini app and AI Mode in Search for billions of people worldwide.
One thing did not ship today. Gemini 3.5 Pro was pushed to next month, and the I/O audience reportedly groaned when Sundar Pichai broke the news. This article covers exactly what launched today, the full benchmark numbers against Gemini 3.1 Pro, what Gemini 3.5 Flash costs, where it still loses, how to use it right now, and when the Pro model arrives.
The Key Takeaways
- Gemini 3.5 Flash launched May 19, 2026 at Google I/O 2026, the first model in the Gemini 3.5 family.
- It beats the bigger Gemini 3.1 Pro on coding and agentic benchmarks while running 4x faster in output tokens per second.
- API pricing is $1.50 per million input tokens and $9.00 per million output tokens, about 25% cheaper than Gemini 3.1 Pro.
- It is free in the Gemini app and AI Mode in Search, with a 1M-token context window.
- Gemini 3.5 Pro is delayed to June 2026; Google is already using it internally.
What Is Gemini 3.5 Flash?
Gemini 3.5 Flash is Google’s new speed-and-cost model, and it is the headline release of Google I/O 2026. In its official announcement, Google calls the family “frontier intelligence with action,” and the framing is deliberate. Flash models used to be the budget option you reached for when you did not need the smart model. Gemini 3.5 Flash flips that, delivering performance Google says “rivals large flagship models on multiple dimensions” while keeping Flash-tier speed and price.
The model is tuned for coding and agentic tasks, the long-horizon work where an AI plans, calls tools, and iterates instead of answering a single question. It also leads on multimodal understanding, scoring 84.2% on CharXiv Reasoning. The API model ID is gemini-3.5-flash, the context window is 1,048,576 input tokens with 64K output tokens, and the knowledge cutoff is January 2026. If you want the background on how Google got here, our breakdown of the jump from Gemini 2.5 to Gemini 3 traces the version history.
Just off stage at #GoogleIO, some highlights from this morning 🧵
— Sundar Pichai (@sundarpichai) May 19, 2026
Gemini 3.5 Flash is available today for everyone in @antigravity and across our products and APIs.
Compared to 3.1 Pro, 3.5 Flash is better across almost all benchmarks with huge progress in coding. It’s also… pic.twitter.com/zqTbCCZL9D
Gemini 3.5 Flash vs Gemini 3.1 Pro: The Benchmarks
Here is the part that matters. A Flash model is not supposed to beat last generation’s Pro model, and Gemini 3.5 Flash does exactly that on the benchmarks Google cares about most. The table below compares it against Gemini 3.1 Pro, the model it effectively replaces for most users.
| Benchmark | Gemini 3.5 Flash | Gemini 3.1 Pro | Winner | What it measures |
|---|---|---|---|---|
| Terminal-Bench 2.1 | 76.2% | 70.3% | 3.5 Flash | Real coding in a terminal |
| GDPval-AA | 1,656 Elo | 1,314 Elo | 3.5 Flash | Real-world agentic tasks |
| MCP Atlas | 83.6% | 78.2% | 3.5 Flash | Scaled tool use |
| CharXiv Reasoning | 84.2% | N/A | 3.5 Flash | Multimodal understanding |
| Humanity’s Last Exam | 40.2% | 44.4% | 3.1 Pro | Hardest expert reasoning |
| ARC-AGI-2 | 72.1% | 77.1% | 3.1 Pro | Abstract reasoning |
| Output speed | ~4x faster | baseline | 3.5 Flash | Output tokens per second |
On GDPval-AA, the jump is dramatic. Gemini 3.1 Pro scores 1,314 Elo; Gemini 3.5 Flash scores 1,656 Elo, a 342-point swing on real-world agentic work. Google says Gemini 3.5 Flash is 4 times faster than other frontier models when measured in output tokens per second. For agents that loop through dozens of steps, that speed gap compounds fast.
Where Gemini 3.1 Pro Still Wins
Gemini 3.5 Flash is not a clean sweep, and Google is honest about it. On the hardest pure-reasoning tests, the older Gemini 3.1 Pro still has the edge. It leads Humanity’s Last Exam (44.4% vs 40.2%) y ARC-AGI-2 (77.1% vs 72.1%), and it holds up better on very long-context retrieval, scoring 84.9% on MRCR v2 at 128k tokens against 77.3% for Flash, per independent benchmark and pricing data.
The practical read is simple. For coding, tool use, and agent workflows, Gemini 3.5 Flash is the better and cheaper pick today. For the deepest reasoning problems or huge-context document work, Gemini 3.1 Pro is still worth the extra cost until Gemini 3.5 Pro lands. Our full Gemini 3.1 Pro review covers where that model still shines.
Gemini 3.5 Flash Pricing
For most people, Gemini 3.5 Flash is free. It is the default model in the Gemini app and in AI Mode in Google Search at no cost. The pricing below matters if you are a developer calling it through the API.
API token pricing
| Token type | Gemini 3.5 Flash | Gemini 3.1 Pro |
|---|---|---|
| Input (per 1M) | $1.50 | $2.00 |
| Output (per 1M) | $9.00 | $12.00 |
| Cached input (per 1M) | $0.15 | N/A |
That makes Gemini 3.5 Flash about 25% cheaper on both input and output than Gemini 3.1 Pro at Google’s official API rates ($2.00 in / $12.00 out for 3.1 Pro), while scoring higher on the coding and agentic benchmarks. It is roughly 3x the price of the old Gemini 3 Flash ($0.50 in / $3.00 out), but far more capable, so it sits in a new sweet spot. Non-global regions are priced slightly higher at $1.65 input / $9.90 output. For the wider context, see our full Gemini pricing breakdown and our AI pricing comparison across the major models.
How to Use Gemini 3.5 Right Now
Gemini 3.5 Flash is already live everywhere Google ships Gemini. You do not have to wait or join a list. There are four ways to use it today.
- Gemini app: open the Gemini app on web, Android, or iOS. Gemini 3.5 Flash is the new default model, so you are already using it.
- AI Mode in Search: Gemini 3.5 Flash now powers AI Mode in Google Search worldwide, free for everyone.
- Gemini API: developers can call gemini-3.5-flash through Google AI Studio, the Gemini API, and Android Studio.
- Enterprise: it is available through Google Antigravity, the Gemini Enterprise Agent Platform, and Gemini Enterprise.
If you mostly use Gemini on a Mac, our Gemini Mac app review walks through the desktop experience. Prefer one subscription that gives you several models instead of juggling apps? The Fello AI app for Mac puts Gemini, ChatGPT, Claude, Grok and DeepSeek behind a single $9.99/month plan, so you can switch models per task without paying for each one separately.
Gemini Spark and the Rest of Google I/O 2026
Gemini 3.5 Flash powers more than chat. Google used it to launch Gemini Spark, a 24/7 personal AI agent that works in the background to handle recurring tasks across your digital life. Spark is rolling out to trusted testers today, with a Beta for Google AI Ultra subscribers in the US next week, as reported by Engadget.
The keynote also brought a Neural Expressive redesign of the Gemini app, a personalized Daily Brief morning digest for AI Plus, Pro, and Ultra subscribers, and updates to Gemini Omni. Google AI Ultra now starts at $99.99/month, with a higher-limit $200 plan on top. On the safety side, Google says Gemini 3.5 ships with strengthened cyber and CBRN safeguards, fewer incorrect refusals on safe prompts, and new interpretability tooling that inspects the model’s internal reasoning before a response is returned.
When Is Gemini 3.5 Pro Coming Out?
Gemini 3.5 Pro is expected in June 2026. Google did not ship it at I/O and gave no exact date, only “next month.” Pichai told the audience, “I know you can’t wait to get your hands on it. Give us until next month to get it to you.” Google says the Pro model is already being used internally and shares the same focus on coding and agents.
Until then, Gemini 3.5 Flash is the model to use for almost everything, and Gemini 3.1 Pro remains the option for the deepest reasoning tasks. We will update this article the moment Gemini 3.5 Pro goes live.
How Gemini 3.5 Stacks Up Against ChatGPT and Claude
Against the wider field, the picture is competitive rather than dominant. As of May 2026, GPT-5.5 leads many agentic workflow benchmarks, and Claude Opus 4.7 leads several coding benchmarks like SWE-bench Verified with the lowest hallucination rate of the three. Gemini’s edge is speed and price at near-flagship quality. Gemini 3.5 Flash does not claim the overall crown, but it changes the value equation by delivering this tier of performance at Flash cost and speed.
For a head-to-head on everyday use, see how Gemini compares to ChatGPT, or our three-way ChatGPT vs Claude vs Gemini breakdown for Mac users. The honest takeaway is that no single model wins everything, which is exactly why model choice per task beats model loyalty.
Should You Use Gemini 3.5?
Yes, and you probably already are. If you use the Gemini app or AI Mode in Search, Gemini 3.5 Flash is your default as of today, at no cost, with a 1M-token context window. For developers, it is the obvious move for coding and agent workloads given it beats Gemini 3.1 Pro on those benchmarks at 25% lower cost and 4x the speed. The one reason to hold off is the deepest reasoning work, where waiting a few weeks for Gemini 3.5 Pro is the smarter call. Open the Gemini app, ask it something hard, and watch how fast it answers.
FAQ
When did Gemini 3.5 release?
Gemini 3.5 Flash launched on May 19, 2026 at Google I/O 2026. It is the first model in the Gemini 3.5 family and is already the default in the Gemini app and AI Mode in Search.
Is Gemini 3.5 free?
Yes. Gemini 3.5 Flash is free in the Gemini app and in AI Mode in Google Search. Developers pay $1.50 per million input tokens and $9.00 per million output tokens through the API.
Is Gemini 3.5 Flash better than Gemini 3.1 Pro?
On coding and agentic benchmarks, yes. It beats Gemini 3.1 Pro on Terminal-Bench 2.1, MCP Atlas, and GDPval-AA while running about 4x faster. Gemini 3.1 Pro still leads on the hardest pure-reasoning tests.
When is Gemini 3.5 Pro coming out?
Google said Gemini 3.5 Pro will arrive “next month,” which points to June 2026. No exact date was given, and Google is already using it internally.
What is Gemini Spark?
Gemini Spark is a 24/7 personal AI agent powered by Gemini 3.5 Flash. It handles recurring tasks in the background and is rolling out to trusted testers now, with a Beta for Google AI Ultra subscribers in the US next week.




