Overview
- Tencent said Thursday that it had released HY3‑Preview and integrated it into Yuanbao, CodeBuddy, WorkBuddy, QQ, and Tencent Docs.
- The model uses a mixture‑of‑experts design with about 295 billion total parameters, of which only about 21 billion are active per request; the design steers easy questions to lighter experts, cutting cost and speeding up replies.
- Reports conflict on access: Forbes and Decrypt describe open‑source releases on GitHub and Hugging Face plus paid APIs on Tencent Cloud, while the South China Morning Post calls the model closed‑source.
- Early product metrics cited by Tencent show big gains in its assistants, including a 54% drop in first‑token latency, a 47% cut in end‑to‑end response time, and stable agent runs hundreds of steps long.
- Tencent rebuilt its training and reinforcement‑learning stack in February and trained HY3 in about 90 days, choosing a ~295B‑parameter ceiling after HY 2.0’s 400B‑plus size to favor reliability, lower cost, and faster deployment.
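
To make the mixture‑of‑experts point above concrete, the sketch below computes the share of parameters that is active per request from the reported figures (about 295 billion total, about 21 billion active) and shows a generic top‑k gating step. The coverage does not describe how HY3 actually routes requests, so the gating function, the four‑expert setup, the gate scores, and k=2 are all illustrative assumptions, not Tencent's implementation.

```python
import math

# Reported figures from the coverage above, in billions of parameters.
HY3_TOTAL_B = 295.0
HY3_ACTIVE_B = 21.0


def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of the model's parameters used for a single request."""
    return active_params_b / total_params_b


def softmax(scores):
    """Numerically stable softmax over a list of gate scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Generic top-k gating (hypothetical; not HY3's actual router).

    Returns {expert_index: weight} for the k highest-scoring experts,
    with the weights renormalised so they sum to 1.
    """
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


fraction = active_fraction(HY3_TOTAL_B, HY3_ACTIVE_B)
print(f"active share: {fraction:.1%}")  # active share: 7.1%

# Illustrative gate scores for four hypothetical experts.
weights = route_top_k([0.1, 2.0, -1.0, 1.5], k=2)
print("selected experts:", sorted(weights))
```

With the reported figures, roughly 7% of the parameters are active per request, which is where the cost and latency savings of a sparse design come from: the remaining experts are skipped for that input.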