Z.ai Ships GLM-5.2: Usable 1M-Token Context, No Benchmarks
One-liner: Z.ai launched GLM-5.2 on June 13 with a functional 1-million-token context window, five times the size of its predecessor's, but made the unusual choice to ship without any public benchmark numbers.
Key Facts
- 744B-parameter Mixture-of-Experts architecture; 40B active parameters per token
- 1M-token context (≈5× GLM-5.1's ~200K limit); output up to 131,072 tokens
- Two reasoning modes: High and Max thinking effort
- API pricing: $1.40 input / $4.40 output per million tokens
- No benchmark scores at launch — no SWE-bench, LiveCodeBench, or HumanEval published
- MIT-licensed open weights promised via Hugging Face in the week following launch
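To put the pricing and architecture figures above in perspective, here is a rough back-of-the-envelope sketch. It assumes flat per-token billing with no caching discounts or long-context surcharges, and it treats the 1M window as 1,000,000 input tokens billed separately from output; neither assumption is confirmed by the launch details. The function and constant names are illustrative only.

```python
# Figures taken from the Key Facts list; flat per-token billing is an assumption.
INPUT_USD_PER_MTOK = 1.40    # USD per million input tokens
OUTPUT_USD_PER_MTOK = 4.40   # USD per million output tokens
MAX_INPUT_TOKENS = 1_000_000  # assumes "1M" means 1,000,000 input tokens
MAX_OUTPUT_TOKENS = 131_072
TOTAL_PARAMS_B = 744
ACTIVE_PARAMS_B = 40


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one API call under flat per-token rates."""
    input_cost = input_tokens * INPUT_USD_PER_MTOK
    output_cost = output_tokens * OUTPUT_USD_PER_MTOK
    return (input_cost + output_cost) / 1_000_000


# Worst case: a full context window plus the maximum output length.
full_cost = request_cost_usd(MAX_INPUT_TOKENS, MAX_OUTPUT_TOKENS)

# Share of the model's parameters that are active for any single token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

print(f"Full-window request: ${full_cost:.2f}")        # Full-window request: $1.98
print(f"Active parameters:   {active_fraction:.1%}")   # Active parameters:   5.4%
```

Under these assumptions, a single maxed-out request costs about two dollars, so an agentic loop that resends a full window on every step would scale linearly from that figure.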
Why It Matters
The benchmark-free launch is the real story. Rather than competing on leaderboard rankings, Z.ai led with raw context capacity for coding and agentic workflows. If the 1M-token window holds up in production, it shifts competitive pressure from eval metrics toward real-world throughput. That pattern is worth watching as Chinese labs increasingly skip the benchmark arms race in favor of capability-first releases.
Read More
- Z.ai Launches GLM-5.2 — MarkTechPost