The city government of Rio de Janeiro just released a frontier-class open AI model. It is built to go head to head with the best open systems on the planet. Rio 3.5 Open 397B comes from IplanRIO, the municipal IT company that runs Rio de Janeiro’s digital services. On its own benchmarks it edges out Alibaba’s Qwen 3.7 Plus on several tests. That is not a typo. A city hall shipped a 397-billion-parameter model under a fully open license.
Here is the honest version of the story. Rio 3.5 Open 397B is not built from scratch. It is fine-tuned from Qwen 3.5-397B-A17B, Alibaba’s open base model, with a new reasoning layer added on top. The headline scores are striking, but they are first-party numbers that have not been independently audited yet. The model wins on some benchmarks while losing on others. This article breaks down what the model actually is, how it really compares to Qwen, and why a municipal government building AI matters for everyone.
The Key Takeaways
- Rio 3.5 Open 397B was released by IplanRIO, the IT company of Rio de Janeiro’s city government, making it one of the first frontier-class models from a municipal authority.
- It is a fine-tune of Alibaba’s Qwen 3.5-397B-A17B, with 397B total parameters and 17B active in a Mixture-of-Experts design.
- On its own model card it beats Qwen 3.7 Plus on four of five benchmarks, including Terminal-Bench 2.1 (70.8 vs 70.3) and SWE-Bench Multilingual (77.0 vs 75.8), trailing only on MMLU-Pro (88.0 vs 88.5).
- The gains lean heavily on SwiReasoning, a training-free method that switches between visible and hidden reasoning; the scores are not yet independently verified.
- The weights ship under an MIT license and weigh roughly 807 GB, so almost nobody will run this at home.
What is Rio 3.5 Open 397B?
Rio 3.5 Open 397B is a frontier-class open AI model released by IplanRIO, the IT company of Rio de Janeiro’s city government. It is fine-tuned from Alibaba’s Qwen 3.5-397B-A17B, uses a Mixture-of-Experts design with 397 billion total parameters and 17 billion active per token, and ships under a permissive MIT license for commercial and research use.
The key word is open. The full weights are published on Hugging Face, so anyone with the hardware can download, inspect, and run the model. That openness is the whole point: it lets a city government compete in a space normally dominated by labs like OpenAI, Google, and Anthropic. Rio did not have to train a model from zero; it stood on top of a strong open base and added its own improvements.
That last detail matters for context. Calling Rio 3.5 a “Brazilian model that beats Qwen” is half the story, because the model is literally built on Qwen. The more accurate framing is that Rio took Alibaba’s open foundation and tuned it well enough to nudge past Alibaba’s own newer release on a handful of tests. If you want the wider picture of where these systems sit, our running guide to the best AI models tracks the frontier month by month.
Brazil just cooked up a model – Rio 3.5 397B, which is better than Alibaba's Qwen 3.7 Plus.
Made by the city of Rio de Janerio.
This is exactly what I mean by global acceleration.
Glad to see AI progress in Brazil, we need more from all over the world. pic.twitter.com/fmmhNBB1pO
Dr Singularity (@Dr_Singularity), June 13, 2026
Rio 3.5 vs Qwen 3.7 Plus: the benchmark numbers
This is the claim that turned heads. On its own model card, Rio 3.5 beats Qwen 3.7 Plus on four of five listed benchmarks, leading on coding, software engineering, and hard math while trailing only on broad knowledge. The wins are real benchmarks, but most margins are thin and the scores come from Rio itself.
| Benchmark | What it tests | Rio 3.5 Open 397B | Qwen 3.7 Plus | Winner |
|---|---|---|---|---|
| Terminal-Bench 2.1 | Agentic terminal and coding tasks | 70.8 | 70.3 | Rio 3.5 |
| SWE-Bench Multilingual | Real code fixes across languages | 77.0 | 75.8 | Rio 3.5 |
| SWE-Bench Pro | Hard software engineering tasks | 58.1 | 57.6 | Rio 3.5 |
| IMOAnswerBench | Olympiad-level math reasoning | 89.5 | 86.0 | Rio 3.5 |
| MMLU-Pro | Advanced multi-subject knowledge | 88.0 | 88.5 | Qwen 3.7 Plus |
The margins tell the real story. Rio leads by half a point on Terminal-Bench 2.1 and SWE-Bench Pro, and by wider gaps on SWE-Bench Multilingual and the IMOAnswerBench math test, while Qwen keeps a slim edge on MMLU-Pro. Read every figure as Rio’s own reported number rather than a settled ranking. The honest takeaway is that “beats Qwen” is real but narrow and self-reported. For the base of comparison, see our Qwen 3.7 Max review.
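To make the margins concrete, here is a minimal sketch that recomputes them from the table. The scores are copied from Rio’s model card as reported in this article; the function names `margins` and `rio_wins` are just illustrative helpers, not part of any official tooling.

```python
# Scores copied from the table above: (Rio 3.5 Open 397B, Qwen 3.7 Plus).
# These are first-party numbers, not independently verified results.
SCORES = {
    "Terminal-Bench 2.1": (70.8, 70.3),
    "SWE-Bench Multilingual": (77.0, 75.8),
    "SWE-Bench Pro": (58.1, 57.6),
    "IMOAnswerBench": (89.5, 86.0),
    "MMLU-Pro": (88.0, 88.5),
}


def margins(scores):
    """Return {benchmark: Rio minus Qwen}, rounded to one decimal place."""
    return {name: round(rio - qwen, 1) for name, (rio, qwen) in scores.items()}


def rio_wins(scores):
    """Count the benchmarks where Rio's reported score is strictly higher."""
    return sum(1 for rio, qwen in scores.values() if rio > qwen)


if __name__ == "__main__":
    for name, delta in margins(SCORES).items():
        print(f"{name:24s} {delta:+.1f}")
    print("Rio wins:", rio_wins(SCORES), "of", len(SCORES))
```

Three of the five gaps are exactly half a point in one direction or the other, which is well within the range where a different scaffold or sampling setting could flip the result.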
Are these benchmark scores actually real?
Short answer, they are unverified for now. Every Rio-specific number lives on the model card, published by the same team that built the model. The benchmarks themselves are legitimate, but a real benchmark is not the same thing as an independently confirmed score, and vendors often run tests with different scaffolds and settings that make head-to-head numbers shaky.
There is a second wrinkle. Analysts who dug into the release found that much of Rio’s edge appears to come from SwiReasoning, yet the repository does not include the code that implements it. That gap makes the results hard to reproduce, which is exactly what independent verification needs. The fair way to treat Rio 3.5 today is as a promising open-weight release with aggressive first-party claims, not as a proven new champion.
SwiReasoning: the trick behind the gains
SwiReasoning is a training-free inference framework that lets a model switch between explicit chain-of-thought and silent latent-space reasoning, guided by entropy-based confidence signals. In plain terms, the model only “thinks out loud” when it is unsure and otherwise reasons quietly in hidden space, which can improve both accuracy and token efficiency.
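The “only think out loud when unsure” idea can be sketched with a toy example. This is not the SwiReasoning implementation, which is not in Rio’s repository and is more involved than this; it only illustrates how an entropy signal could act as a confidence gate. The fixed threshold, the mode labels, and the function names are all assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def choose_modes(distributions, threshold=1.0):
    """Label each step 'explicit' when entropy is above the threshold.

    High entropy means the model is spreading probability over many
    tokens (unsure), so this toy rule switches to visible reasoning.
    Low entropy means a confident prediction, so it stays 'latent'.
    The threshold value is a hypothetical choice, not a published one.
    """
    return ["explicit" if entropy(p) > threshold else "latent" for p in distributions]


if __name__ == "__main__":
    unsure = [0.25, 0.25, 0.25, 0.25]     # flat distribution, high entropy
    confident = [0.97, 0.01, 0.01, 0.01]  # peaked distribution, low entropy
    print(choose_modes([unsure, confident]))
```

A real system would read these distributions from the model’s logits at each decoding step; here they are hard-coded so the gating rule is easy to see.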
The technique comes from a research paper by Shi and colleagues published in late 2025, not from Rio itself. That is part of what makes the release interesting; IplanRIO combined an existing open base model with a published academic method to squeeze out extra performance. It is a smart, low-cost recipe, and it is one more sign that frontier gains no longer require a giant lab budget.
Why a city government building AI matters
Forget the benchmark spat for a second, because the bigger story is who shipped this. A municipal IT department in Brazil released a model in the same weight class as releases from Alibaba, DeepSeek, and Moonshot. That is global acceleration in action, and it shows how open base models plus public research are spreading serious AI capability far beyond Silicon Valley and Hangzhou.
It also points to a healthier AI ecosystem. When a city hall in the Global South can fine-tune a top-tier model for its own languages and needs, the technology stops being the private property of a few companies. We have already seen open releases like Nex-N2-Pro, Kimi K2.6, and DeepSeek V4 chip away at the lead of closed labs. Rio 3.5 adds a new kind of player to that list, a public institution rather than a startup.
Specs, license, and can you run it?
The headline specs are serious. Rio 3.5 Open 397B is a Mixture-of-Experts transformer with 397B total parameters and 17B active per token. It is built on 60 layers with 512 experts, 10 of which are selected per token. The model supports a native context of about 262,000 tokens, stretched to a claimed 1 million with scaling techniques. It also handles multimodal image and text input across English, Portuguese, Chinese, and several other languages.
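For readers new to Mixture-of-Experts, here is a rough sketch of what “512 experts, 10 selected per token” means in practice. It shows generic top-k routing, not Rio’s or Qwen’s actual router. The softmax over only the selected experts is a common convention assumed here, and the random router scores stand in for what a trained gating layer would produce.

```python
import math
import random

NUM_EXPERTS = 512  # experts per MoE layer, from the article
TOP_K = 10         # experts selected per token, from the article


def route(router_logits, k=TOP_K):
    """Pick the k highest-scoring experts and softmax-normalize their weights.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract max for stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical router scores for one token.
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    chosen = route(logits)
    print(len(chosen), "of", NUM_EXPERTS, "experts run for this token")
    print("active parameter share:", round(17 / 397 * 100, 1), "%")
```

Only the chosen experts do any work for a given token, which is why a 397B-parameter model can run with roughly 17B parameters active at a time.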
The catch is size. The weights come in at roughly 807 GB, which puts local hosting out of reach for normal laptops and even most workstations. The MIT license means companies and researchers can build on it freely, but realistically you will access it through a cloud provider rather than your own machine. Open weights are great for transparency and trust; they are not the same as being easy to run.
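A quick back-of-the-envelope calculation shows why 807 GB is so demanding. The sketch below assumes decimal gigabytes (10^9 bytes) and a hypothetical accelerator with 80 GB of memory. It counts the weights only and ignores the KV cache, activations, and framework overhead, so real deployments would need more.

```python
import math

WEIGHTS_GB = 807      # approximate download size, from the article
TOTAL_PARAMS = 397e9  # total parameters, from the article


def bytes_per_param(weights_gb=WEIGHTS_GB, params=TOTAL_PARAMS):
    """Average storage per parameter, assuming 1 GB = 1e9 bytes."""
    return weights_gb * 1e9 / params


def min_devices(weights_gb=WEIGHTS_GB, device_gb=80):
    """Fewest devices whose combined memory can hold the weights alone.

    device_gb=80 is an illustrative assumption, not a stated requirement.
    """
    return math.ceil(weights_gb / device_gb)


if __name__ == "__main__":
    print(f"{bytes_per_param():.2f} bytes per parameter")
    print(min_devices(), "devices of 80 GB just to hold the weights")
```

About two bytes per parameter would be consistent with 16-bit weights, though the article’s sources do not state the storage precision.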
What this means for everyday AI users
For most people, the practical lesson of Rio 3.5 is that you do not need to chase or host any single model. New frontier-class systems now arrive almost weekly from labs, startups, and even city governments, and no one model wins at everything. The smart move is access to many models, not loyalty to one.
That is the idea behind Fello AI on Mac. Instead of juggling separate subscriptions, you get Claude, ChatGPT, Gemini, Grok, and DeepSeek in one native app for $9.99 a month. The lineup updates as new models prove themselves. You skip the 807 GB downloads and the per-provider bills, and you switch to whichever model is best for the task in front of you. New to it? Start with the getting started guide.
Conclusion
Rio 3.5 Open 397B is an exciting release, and it deserves both the applause and a dose of caution. A city government fine-tuning a top-tier open model and posting numbers that nudge past Alibaba’s latest is a real milestone for AI outside the usual hubs. Just remember the scores are first-party, the wins are benchmark-specific, and the model is built on Qwen rather than from scratch.
Watch for independent benchmarks over the coming weeks to see whether Rio 3.5 holds up under outside testing. Either way, the trend is clear. Serious AI is going global, and that is good news for everyone who uses it.
FAQ
Who made Rio 3.5 Open 397B?
It was released by IplanRIO, the municipal IT company of Rio de Janeiro’s city government in Brazil, and published openly on Hugging Face.
Is Rio 3.5 really better than Qwen?
On its own model card it beats Qwen 3.7 Plus on four of five benchmarks, trailing only on MMLU-Pro. The numbers are first-party and not yet independently audited, so “better” is real but narrow and self-reported.
What is Rio 3.5 based on?
It is fine-tuned from Alibaba’s open Qwen 3.5-397B-A17B base model, with a SwiReasoning inference layer added to boost reasoning and efficiency.
Can I download and run Rio 3.5?
Yes, the weights are open under an MIT license, but at around 807 GB they need serious server hardware. Most users will reach it through a cloud host rather than a personal computer.
What is SwiReasoning?
It is a training-free method that switches a model between visible chain-of-thought and hidden latent reasoning based on confidence, aiming to raise accuracy while using fewer tokens.




