MiniMax M3: Release Date, Sparse Attention, and What to Expect from China’s Next AI Flagship
In late May 2026, MiniMax’s Head of Engineering, Skyler Miao, broke cover with the first technical preview of MiniMax M3, the Shanghai lab’s next flagship language model. The pitch is sharp: a claimed 9.7× faster prefill and 15.6× faster decoding at 1-million-token context versus the current MiniMax-M2.7. The trick is reintroducing sparse attention, an approach MiniMax explicitly killed in its own M2 generation a year earlier. That reversal is the story. In this preview we cover what’s confirmed, what’s still teased, when M3 is likely to ship, and how the speed claims hold up. We also look at what it means for the race against Claude, GPT, Gemini, and DeepSeek, and what […]
