Overview
- Z.ai released GLM-5.2 and posted full and quantized weights on Hugging Face, with the company making the model available under an MIT license for commercial use.
- The model is described as a 744‑billion‑parameter mixture‑of‑experts with a genuine one‑million‑token context window, a design aimed at long‑horizon coding and multi‑step agent workflows.
- Z.ai says GLM-5.2 was trained entirely on Huawei Ascend chips rather than Nvidia hardware, a claim that highlights supply‑chain and export‑control implications for a company on the U.S. Entity List.
- Independent reports place GLM-5.2 near leading closed models on engineering benchmarks, it narrowly trailed Claude Opus 4.8 on FrontierSWE while beating GPT-5.5 on several coding tests, and Unsloth AI’s 2‑bit GGUF quant compresses the weights from about 1.51TB to 238GB but still needs roughly 256GB of unified memory to run locally.
- Z.ai also published API access and aggressive pricing and the release has moved markets, with reporting tying the model drop and concurrent regulatory news to a sharp rise in Z.ai’s stock; developers must weigh heavy local hardware costs or per‑token API fees against the privacy and control of running weights themselves.