Chinese Team Open‑Sources 1.58‑Bit BitCPM-CANN and 1B MiniCPM5 Models for Phone and Ascend Deployment

The releases promise up to sixfold savings in inference memory, which would let larger models run on phones and in browsers, but the claims still await independent validation by the community.

Overview

Facewall AI, together with Tsinghua and OpenBMB, open‑sourced the BitCPM-CANN family (0.5B, 1B, 3B and 8B) and the MiniCPM5-1B base model on public platforms on May 25–26.
The team says BitCPM-CANN uses a 1.58‑bit ternary weight format (each weight is −1, 0 or 1, which carries log2 3 ≈ 1.58 bits of information) to cut inference memory roughly sixfold compared with BF16 while retaining 90% to 97% of model capability on 11 core tasks.
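To make the ternary format concrete, here is a minimal sketch of one common way to map full-precision weights to −1, 0 and 1. It assumes an absmean scaling rule (divide by the mean absolute weight, round, then clip), which the source does not confirm as the BitCPM-CANN method. The function names `ternary_quantize` and `dequantize` and the sample weights are hypothetical.

```python
import math


def ternary_quantize(weights, eps=1e-8):
    """Map floats to codes in {-1, 0, 1} plus one shared scale.

    Assumption: absmean scaling. The source does not state which
    rule BitCPM-CANN actually uses.
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from ternary codes."""
    return [c * scale for c in codes]


# Three possible values per weight carry log2(3) bits of information.
BITS_PER_TERNARY = math.log2(3)

weights = [0.9, -0.05, -1.2, 0.4, 0.0, -0.7]  # illustrative values only
codes, scale = ternary_quantize(weights)
approx = dequantize(codes, scale)

print(codes)                       # [1, 0, -1, 1, 0, -1]
print(round(BITS_PER_TERNARY, 3))  # 1.585
```

Small weights collapse to 0 and large ones saturate at ±1, so only the sign and a coarse magnitude survive. That is why the team pairs the format with dedicated training rather than converting a finished model directly.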
MiniCPM5-1B is a 1‑billion‑parameter model whose weights shrink to about 0.5 GB under INT4 quantization, small enough to run on phones and in mobile browsers; the developers report that it tops the AA-Index results among models under 2B parameters.
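The 0.5 GB figure follows from simple arithmetic, which the short sketch below reproduces. It assumes exactly 1,000,000,000 parameters, decimal gigabytes, and weights-only storage; the helper `weight_bytes` is a hypothetical name.

```python
def weight_bytes(num_params, bits_per_weight):
    """Raw storage for the weights alone, in bytes."""
    return num_params * bits_per_weight / 8


GB = 1e9                 # decimal gigabyte (assumption)
params = 1_000_000_000   # assumed exact count for a "1B" model

int4_gb = weight_bytes(params, 4) / GB
bf16_gb = weight_bytes(params, 16) / GB

print(int4_gb)  # 0.5
print(bf16_gb)  # 2.0
```

Real checkpoints are usually somewhat larger than this estimate, because quantized formats also store scale factors and may keep some tensors at higher precision.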
All training and operators were implemented natively on Huawei Ascend hardware. The team reports roughly a month of software‑stack adaptation and a training recipe that applies quantization‑aware training first and then distills from a full‑precision teacher.
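The distillation step in that recipe can be illustrated with a generic loss: the quantized student is penalized for diverging from the full-precision teacher's output distribution. The sketch below assumes a temperature-softened KL divergence, a standard choice that the source does not confirm for these models. The function names, logits and temperature are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions.

    Assumption: a generic distillation loss, not the team's
    published objective.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


teacher = [2.0, 0.5, -1.0]   # illustrative full-precision logits
student = [1.5, 0.7, -0.6]   # illustrative quantized-model logits

loss = distillation_loss(teacher, student)
same = distillation_loss(teacher, teacher)
```

The loss is zero when the student matches the teacher exactly and positive otherwise, so minimizing it pulls the low-bit model's predictions toward those of the full-precision one.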
Reporters place the releases in the context of rising memory costs for devices and servers, and emphasize that the performance and deployment claims rest on the developers' own data and still require independent benchmarks and community replication.