Gemini 3.5 Pro Release Date: What We Know (+ Flash Review)

Google split Gemini 3.5 into two launches, and only one has actually arrived. Gemini 3.5 Flash went live on May 19, 2026 at Google I/O 2026 and did something Google had never done before, with the cheap, fast “Flash” tier beating the previous flagship on coding. The other half, Gemini 3.5 Pro, is the model everyone is waiting on, and its release date has now slipped by months. Google’s own position, published on July 21, 2026, is that Pro is “currently testing with partners”, with no date attached.

This article covers both sides. You get what Gemini 3.5 Flash delivers today, the full benchmarks against Gemini 3.1 Pro, and pricing. You also get everything Google has actually confirmed about Gemini 3.5 Pro, and how to tell those facts from the rumors, because most of the specs circulating for Pro never came from Google. Google has also confirmed a custom Gemini model will power Apple’s rebuilt Siri, which raises the stakes further. One thing to know up front, the Flash tier has already turned over a generation, and our Gemini 3.6 Flash review covers the July 21, 2026 replacement.

Índice hide

What Is Gemini 3.5 Flash?

Gemini 3.5 Flash vs Gemini 3.1 Pro: The Benchmarks

Where Gemini 3.1 Pro Still Wins

Gemini 3.5 Flash Pricing

API token pricing

How to Use Gemini 3.5 Right Now

Gemini Spark and the Rest of Google I/O 2026

When Is Gemini 3.5 Pro Coming Out?

Why Gemini 3.5 Pro Is Late

Gemini Is About to Power Apple’s New Siri

How Gemini 3.5 Pro and Flash Stack Up Against ChatGPT and Claude

Should You Use Gemini 3.5?

FAQ

The Key Takeaways

Gemini 3.5 Flash launched May 19, 2026 at Google I/O 2026, the first model in the Gemini 3.5 family.

It beats the bigger Gemini 3.1 Pro on coding and agentic benchmarks while running 4x faster in output tokens per second.

API pricing is $1.50 per million input tokens and $9.00 per million output tokens, about 25% cheaper than Gemini 3.1 Pro. Its replacement, Gemini 3.6 Flash, holds input at $1.50 and cuts output to $7.50.

It has a 1M-token context window. It was the free-tier model in the Gemini app from May 2026, but Google’s plans page now reads “Access to 3.6 Flash” for the free tier.

Gemini 3.5 Pro is still unreleased. Google said on July 21, 2026 that it is “currently testing with partners”. Bloomberg reported it is months behind schedule. The 2M-token context window y Deep Think specs circulating online are not confirmed by Google.

What Is Gemini 3.5 Flash?

Gemini 3.5 Flash is Google’s new speed-and-cost model, and it is the headline release of Google I/O 2026. In its official announcement, Google calls the family “frontier intelligence with action,” and the framing is deliberate. Flash models used to be the budget option you reached for when you did not need the smart model. Gemini 3.5 Flash flips that, delivering performance Google says “rivals large flagship models on multiple dimensions” while keeping Flash-tier speed and price. Alibaba shipped its own frontier reasoner the next day; see our Qwen3.7-Max review for the head-to-head with Flash on pricing and benchmarks.

The model is tuned for coding and agentic tasks, the long-horizon work where an AI plans, calls tools, and iterates instead of answering a single question. It also leads on multimodal understanding, scoring 84.2% on CharXiv Reasoning. The API model ID is gemini-3.5-flash, the context window is 1,048,576 input tokens with 64K output tokens, and the knowledge cutoff is January 2026. If you want the background on how Google got here, our breakdown of the jump from Gemini 2.5 to Gemini 3 traces the version history. Note the generation has since turned over. The gemini-3.5-flash ID is still stable and still callable, but gemini-3.6-flash is now the current Flash model in Google’s catalog.

Just off stage at #GoogleIO, some highlights from this morning 🧵

Gemini 3.5 Flash is available today for everyone in @antigravity and across our products and APIs.

Compared to 3.1 Pro, 3.5 Flash is better across almost all benchmarks with huge progress in coding. It’s also… pic.twitter.com/zqTbCCZL9D
— Sundar Pichai (@sundarpichai) May 19, 2026

Gemini 3.5 Flash vs Gemini 3.1 Pro: The Benchmarks

Here is the part that matters. A Flash model is not supposed to beat last generation’s Pro model, and Gemini 3.5 Flash does exactly that on the benchmarks Google cares about most. The table below compares it against Gemini 3.1 Pro, the model it effectively replaces for most users. For how every version from 2.5 to 3.6 stacks up, see our full Gemini model comparison.

Benchmark	Gemini 3.5 Flash	Gemini 3.1 Pro	Winner	What it measures
Terminal-Bench 2.1	76.2%	70.3%	3.5 Flash	Real coding in a terminal
GDPval-AA	1,656 Elo	1,314 Elo	3.5 Flash	Real-world agentic tasks
MCP Atlas	83.6%	78.2%	3.5 Flash	Scaled tool use
CharXiv Reasoning	84.2%	N/A	3.5 Flash	Multimodal understanding
Humanity’s Last Exam	40.2%	44.4%	3.1 Pro	Hardest expert reasoning
ARC-AGI-2	72.1%	77.1%	3.1 Pro	Abstract reasoning
Output speed	~4x faster	baseline	3.5 Flash	Output tokens per second

On GDPval-AA, the jump is dramatic. Gemini 3.1 Pro scores 1,314 Elo; Gemini 3.5 Flash scores 1,656 Elo, a 342-point swing on real-world agentic work. Google says Gemini 3.5 Flash is 4 times faster than other frontier models when measured in output tokens per second. For agents that loop through dozens of steps, that speed gap compounds fast.

Where Gemini 3.1 Pro Still Wins

Gemini 3.5 Flash is not a clean sweep, and Google is honest about it. On the hardest pure-reasoning tests, the older Gemini 3.1 Pro still has the edge. It leads Humanity’s Last Exam (44.4% vs 40.2%) y ARC-AGI-2 (77.1% vs 72.1%), and it holds up better on very long-context retrieval, scoring 84.9% on MRCR v2 at 128k tokens against 77.3% for Flash, per independent benchmark and pricing data.

The practical read is simple. For coding, tool use, and agent workflows, Gemini 3.5 Flash is the better and cheaper pick today. For the deepest reasoning problems or huge-context document work, Gemini 3.1 Pro is still worth the extra cost until Gemini 3.5 Pro lands. Our full Gemini 3.1 Pro review covers where that model still shines.

Gemini 3.5 Flash Pricing

For most people the Gemini app costs nothing, and that has not changed. What changed is which model you get. Gemini 3.5 Flash was the free-tier model from May 2026, and Google’s plans page now lists “Access to 3.6 Flash” for the free tier, so a free user today reaches the newer generation. The pricing below matters if you are a developer calling gemini-3.5-flash directly through the API. If you just want the short answer, see whether Gemini is free.

API token pricing

Token type	Gemini 3.5 Flash	Gemini 3.1 Pro
Input (per 1M)	$1.50	$2.00
Output (per 1M)	$9.00	$12.00
Cached input (per 1M)	$0.15	N/A

That makes Gemini 3.5 Flash about 25% cheaper on both input and output than Gemini 3.1 Pro at Google’s official API rates, which run $2.00 in / $12.00 out up to 200k tokens and $4.00 in / $18.00 out beyond that, while scoring higher on the coding and agentic benchmarks. It is roughly 3x the price of the old Gemini 3 Flash ($0.50 in / $3.00 out), but far more capable, so it sits in a new sweet spot. The successor moved exactly one number, Gemini 3.6 Flash holds input at $1.50 and cuts output to $7.50, so the saving is output-only. For the wider context, see our full Gemini pricing breakdown and our AI pricing comparison across the major models.

How to Use Gemini 3.5 Right Now

Gemini 3.5 Flash is already live everywhere Google ships Gemini. You do not have to wait or join a list. There are four ways to use it today.

Gemini app: open the Gemini app on web, Android, or iOS (the Gemini iOS app now ranks among the best AI apps for iPhone). Google has not restated an app default model since Gemini 3 Flash, but the free tier now reaches 3.6 Flash.
AI Mode in Search: Gemini 3.5 Flash rolled out to AI Mode in Google Search worldwide in May 2026, free for everyone. Google has not published a model change for AI Mode since.
Gemini API: developers can call gemini-3.5-flash through Google AI Studio, the Gemini API, and Android Studio.
Enterprise: it is available through Google Antigravity, the Gemini Enterprise Agent Platform, and Gemini Enterprise.

If you mostly use Gemini on a Mac, our Gemini Mac app review walks through the desktop experience. Prefer one subscription that gives you several models instead of juggling apps? The Fello AI app for Mac puts Gemini, ChatGPT, Claude, Grok and DeepSeek behind a single $9.99/month plan, so you can switch models per task without paying for each one separately.

Gemini Spark and the Rest of Google I/O 2026

Gemini 3.5 Flash powers more than chat. Google used it to launch Gemini Spark, a 24/7 personal AI agent that works in the background to handle recurring tasks across your digital life. Spark rolled out to trusted testers first, followed by a Beta for Google AI Ultra subscribers in the US, as reported by Engadget. The fact that Spark runs on Gemini 3.5 Flash rather than the heavier Pro tier is what lets it stay online 24/7 at agent-tier pricing.

The keynote also brought a Neural Expressive redesign of the Gemini app, a personalized Daily Brief morning digest for AI Plus, Pro, and Ultra subscribers, and updates to Gemini Omni. Google AI Ultra starts at $99.99/month for 5x Google AI Pro limits, with a $199.99/month tier on top for 20x. On the safety side, Google says Gemini 3.5 ships with strengthened cyber and CBRN safeguards, fewer incorrect refusals on safe prompts, and new interpretability tooling that inspects the model’s internal reasoning before a response is returned. Google also previewed Android XR smart glasses with Gemini 3.5 Flash built in, putting Google directly against Apple in the 2027 race.

When Is Gemini 3.5 Pro Coming Out?

Gemini 3.5 Pro has not launched, and Google has never given a second date. Sundar Pichai announced it at I/O on May 19, 2026 and said it would arrive the following month. June passed. On July 21, 2026, in the Gemini 3.6 Flash announcement, Google’s only public statement was that “Gemini 3.5 Pro is currently testing with partners and we plan to make it broadly available as soon as it’s ready”. The model does not appear in the Gemini API model catalog or in the Vertex AI model list, in any form, preview included.

The reporting behind the delay is thinner than the internet suggests. Bloomberg reported on July 16, 2026 that Gemini 3.5 Pro is months behind schedule because Google has been working to improve its capabilities, particularly in coding. Google updated the data used to train the model in late June, and the results fell short of expectations. Speaking to 10 current and former employees, Bloomberg described frustration among Google engineers, researchers and managers who worry the company is losing its edge while Anthropic and OpenAI ship models that beat Gemini. That is the entire verified account.

Be careful with the specs you read elsewhere. The 2-million-token context window and the built-in Deep Think reasoning mode are the two features attached to Gemini 3.5 Pro in almost every roundup, and Google has confirmed neither for this model. Deep Think is real and already ships inside Gemini for Google AI Ultra subscribers, but that is a product feature you can buy today, not a published 3.5 Pro spec. Pricing is not official either, so treat every circulating figure as a placeholder until Google publishes rates.

Feature	Gemini 3.5 Flash	Gemini 3.5 Pro
Status	Live since May 19, 2026	Unreleased; “testing with partners”
Context window	1M tokens	2M rumored, not confirmed
Deep Think mode	No	Rumored, not confirmed
Best for	Speed, coding, agents, everyday chat	Hardest reasoning, huge-context work
API price (per 1M)	$1.50 in / $9.00 out	Not published

Why Gemini 3.5 Pro Is Late

You will find a detailed story online about Google scrapping and rebuilding the base model after it broke down on recursive tool-calling and complex SVG layouts. None of that appears in any primary source. It traces back to AI commentary channels and SEO aggregators, and it has been repeated often enough to look like reporting. What Bloomberg actually described is narrower and duller, a training-data update in late June aimed at coding that did not deliver the gains Google wanted, on a model already running months late.

The context that is documented is people. TechCrunch reported on June 24, 2026 that Google DeepMind lost four senior researchers inside a fortnight. Noam Shazeer, a Gemini co-lead and co-author of the paper that introduced the Transformer, left for OpenAI. Days later John Jumper, the DeepMind director who shared the 2024 Nobel Prize in Chemistry for AlphaFold, went to Anthropic, and Gemini contributors Jonas Adler y Alexander Pritzel followed him there. Nobody has tied those exits to the Pro delay directly, but they line up with the frustration Bloomberg’s sources described.

Until Pro is live for everyone, Gemini 3.6 Flash covers almost everything and gemini-3.1-pro-preview remains the pick for the deepest reasoning tasks, though note Google still ships that one as a preview rather than a stable model. We will update this article the moment Gemini 3.5 Pro goes generally available.

Gemini Is About to Power Apple’s New Siri

Gemini’s reach now extends well beyond Google’s own apps. Under a multi-year deal reportedly worth around $1 billion a year, a custom Gemini model, said to run roughly 1.2 trillion parameters inside Apple’s own data centers, powers Apple’s rebuilt Siri, the biggest overhaul the assistant has ever had. Apple unveiled the new Siri AI at WWDC 2026 as a standalone, chatbot-style app for iPhone, iPad, and Mac that can pull context from your messages, emails, and photos and run multi-step actions across apps.

The iOS 27 public beta went live in mid-July 2026, so anyone can now test the Gemini-powered Siri AI ahead of a general release in the autumn alongside the next iPhone. That means the same model family reviewed here already sits behind the assistant for early testers, and soon hundreds of millions more Apple devices. For the full picture of what is changing, see our breakdown of whether Siri is now real AI.

How Gemini 3.5 Pro and Flash Stack Up Against ChatGPT and Claude

Against the wider field the picture is competitive rather than dominant, and the numbers are unkind. On the Artificial Analysis Intelligence Index v4.1, Claude Fable 5 leads at 60, ahead of GPT-5.6 Sol at 59, Kimi K3 at 57 y Claude Opus 4.8 at 56. Google’s best-scoring entry on that board is Gemini 3.6 Flash at 50, y Gemini 3.1 Pro Preview sits lower still at 46. Google’s Flash tier outscoring its own Pro tier is the clearest measure of why 3.5 Pro matters, and why the delay stings. Gemini’s edge today is speed and price at near-flagship quality. Gemini 3.5 Flash does not claim the overall crown, but it changes the value equation by delivering this tier of performance at Flash cost and speed, and Gemini 3.5 Pro is built to challenge the top tier on the hardest reasoning tasks.

For a head-to-head on everyday use, see how Gemini compares to ChatGPT, or our three-way ChatGPT vs Claude vs Gemini breakdown for Mac users. The honest takeaway is that no single model wins everything, which is exactly why model choice per task beats model loyalty.

Should You Use Gemini 3.5?

For app users the question mostly answers itself, the free tier reaches Gemini 3.6 Flash and you get the 1M-token context window at no cost. For developers the honest recommendation is to skip gemini-3.5-flash and call gemini-3.6-flash instead, since it carries the same $1.50 input price, produces about 17% fewer output tokens, and bills output at $7.50 rather than $9.00. Reach for gemini-3.1-pro-preview only when you need the deepest reasoning. And do not build a plan around Gemini 3.5 Pro arriving on a given date, because Google has not attached one to it.

FAQ

When did Gemini 3.5 release?

Gemini 3.5 Flash launched on May 19, 2026 at Google I/O 2026. It was the first model in the Gemini 3.5 family and became the free-tier model in the Gemini app. Gemini 3.5 Pro, announced the same day, has still not shipped.

Is Gemini 3.5 free?

Gemini 3.5 Flash was the free-tier model in the Gemini app from May 2026. Google’s plans page now reads “Access to 3.6 Flash” for the free tier, so free users reach the newer model. Developers calling gemini-3.5-flash pay $1.50 per million input tokens and $9.00 per million output tokens.

Is Gemini 3.5 Flash better than Gemini 3.1 Pro?

On coding and agentic benchmarks, yes. It beats Gemini 3.1 Pro on Terminal-Bench 2.1, MCP Atlas, and GDPval-AA while running about 4x faster. Gemini 3.1 Pro still leads on the hardest pure-reasoning tests.

When is Gemini 3.5 Pro coming out?

Google has not given a date. Pichai announced it at I/O on May 19, 2026 for the following month, and on July 21, 2026 Google said only that it is “currently testing with partners”. Bloomberg reported on July 16 that it is months behind schedule after a late-June training-data update fell short on coding. The 2M-token context window and Deep Think specs circulating online are unconfirmed.

Did Gemini 3.5 Pro miss a July 17 launch?

No. Google never set a July 17 date. That claim, along with the reports of a Vertex AI preview and a scrapped base model, comes from AI commentary channels and aggregator blogs rather than from Google or Bloomberg. Google’s only stated position is that Gemini 3.5 Pro is testing with partners.

What is Gemini Spark?

Gemini Spark is a 24/7 personal AI agent powered by Gemini 3.5 Flash. It handles recurring tasks in the background and rolled out to trusted testers first, with a Beta for Google AI Ultra subscribers in the US.