MiniMax M3: Release Date, Sparse Attention, and What to Expect from China’s Next AI Flagship
In late May 2026, MiniMax’s Head of Engineering, Skyler Miao, broke cover with the first technical preview of MiniMax M3, the Shanghai lab’s next flagship language model. The pitch is sharp: 9.7× faster prefill and 15.6× faster decoding at a 1-million-token context compared with the current MiniMax-M2.7. The trick is reintroducing sparse attention, an architecture MiniMax explicitly […]
