MiniMax M3: Open-Weight Model Hits 1M-Token Context and Outperforms GPT-5.5 on Coding

Summary: MiniMax shipped M3 on June 1, built on a new architecture called MiniMax Sparse Attention (MSA). The company claims M3 is the first open-weight model to combine a 1M-token context, native multimodal input, and frontier-level coding performance.

Key Facts

  • MSA architecture: 9× faster prefill and 15× faster decoding at 1M-token context vs M2; 1/20th the per-token compute
  • Benchmark: SWE-Bench Pro score of 59.0%, above GPT-5.5 and Gemini 3.1 Pro; priced at an estimated 5–10% of GPT-5.5 API cost
  • Multimodal: natively handles image and video input, plus desktop computer operation
  • API available immediately; model weights and technical report to be released within 10 days of launch
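
To make the ratios above concrete, here is a minimal sketch that applies the claimed 9× prefill speedup, 15× decoding speedup, and 5–10% price range to a set of baseline numbers. The function name `project_m3` and every baseline value are hypothetical placeholders chosen for illustration; they are not measured M2 latencies or published GPT-5.5 prices.

```python
def project_m3(m2_prefill_s: float, m2_decode_ms: float, gpt_cost_usd: float) -> dict:
    """Apply the ratios claimed in the announcement to caller-supplied baselines."""
    return {
        # Claimed: 9x faster prefill at 1M-token context vs M2.
        "prefill_s": m2_prefill_s / 9,
        # Claimed: 15x faster decoding at 1M-token context vs M2.
        "decode_ms_per_token": m2_decode_ms / 15,
        # Estimated: 5-10% of GPT-5.5 API cost for the same workload.
        "cost_low_usd": gpt_cost_usd * 5 / 100,
        "cost_high_usd": gpt_cost_usd * 10 / 100,
    }


# Hypothetical baselines (placeholders, not real measurements or prices).
projection = project_m3(m2_prefill_s=90.0, m2_decode_ms=30.0, gpt_cost_usd=100.0)
for name, value in projection.items():
    print(f"{name}: {value}")
```

Swapping in your own measured baselines shows what the claims would mean for a specific workload, if they hold up once the weights and technical report are released.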

Why It Matters

MiniMax claims M3 is the first model to combine open weights, a 1M-token context, and native multimodality. Delivering frontier coding performance at a fraction of closed-API costs puts real pressure on OpenAI's and Anthropic's enterprise pricing, especially for agentic workflows and long-document processing, where context length is the bottleneck.