2026-06-17

MiniMax M3: Open-Weight Model Hits 1M-Token Context and Outperforms GPT-5.5 on Coding

Summary: MiniMax shipped M3 on June 1, built on a new architecture called MiniMax Sparse Attention (MSA). The company claims M3 is the first open-weight model to combine a 1M-token context, native multimodal input, and frontier-level coding performance.

Key Facts

MSA architecture: 9× faster prefill and 15× faster decoding at 1M-token context vs M2; 1/20th the per-token compute
Benchmark: SWE-Bench Pro score of 59.0% — above GPT-5.5 and Gemini 3.1 Pro; priced at an estimated 5–10% of GPT-5.5 API cost
Multimodal: natively handles image and video input, plus desktop computer operation
API available immediately; model weights and technical report to be released within 10 days of launch
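
To make the pricing claim concrete, the arithmetic behind "5–10% of GPT-5.5 API cost" can be sketched as below. This is a minimal illustration only: the function name and the baseline spend of 1,000 are hypothetical, and the 5 and 10 percent bounds are simply the estimate quoted above, not published price-list figures.

```python
def estimated_cost_range(baseline_cost: float,
                         low_pct: int = 5,
                         high_pct: int = 10) -> tuple[float, float]:
    """Return the (low, high) estimated M3 spend for a given GPT-5.5 spend.

    The default percentages are the 5-10% estimate quoted in this brief;
    they are an assumption, not an official price list.
    """
    if baseline_cost < 0:
        raise ValueError("baseline_cost must be non-negative")
    return (baseline_cost * low_pct / 100, baseline_cost * high_pct / 100)


# Hypothetical workload: 1,000 (in any currency) of GPT-5.5 API spend.
low, high = estimated_cost_range(1000.0)
print(f"Estimated M3 cost: {low:.2f} to {high:.2f}")
```

Under that estimate, a workload costing 1,000 on GPT-5.5 would cost roughly 50 to 100 on M3. Actual savings depend on real token prices and on how many tokens each model uses for the same task.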

Why It Matters

MiniMax claims M3 is the first model to combine open weights, a 1M-token context, and native multimodality. Delivering frontier coding performance at a fraction of closed-API costs puts real pressure on OpenAI's and Anthropic's enterprise pricing, especially for agentic workflows and long-document processing, where context length is the bottleneck.

Read More

MiniMax Releases MiniMax M3 with MSA Architecture — MarkTechPost
MiniMax-M3 debuts, eclipsing GPT-5.5 and Gemini 3.1 Pro — VentureBeat
