Overview
- DeepSeek made V3.2 available with open weights under an MIT license on Hugging Face and GitHub, while the higher‑compute V3.2‑Speciale is limited to API access.
- Company benchmarks credit V3.2‑Speciale with a 96.0 score on AIME 2025 and gold‑medal performance on IMO and IOI‑style tests, and claim parity with GPT‑5 and Gemini 3 Pro on multiple reasoning tasks.
- The models are framed as reasoning‑first for agent workflows, integrating “thinking in tool‑use” and supporting external tools such as search, calculators, and code execution (a minimal tool‑call sketch follows this list).
- DeepSeek attributes the gains to DeepSeek Sparse Attention for long‑context efficiency, a scaled‑up reinforcement‑learning post‑training effort, and a large agentic task‑synthesis pipeline (the sparse‑attention idea is sketched below).
- The architecture uses a Mixture‑of‑Experts design with roughly 671 billion total parameters, of which about 37 billion are activated per token at inference (see the toy example below), and coverage highlights a cost‑efficient approach compared with leading proprietary APIs.
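In practice, "supporting external tools" means the model can emit structured tool calls that a client program executes. The sketch below assumes an OpenAI-compatible chat endpoint; the base URL, model name, and calculator tool are illustrative placeholders, not DeepSeek's documented V3.2 interface.

```python
# Sketch: exposing a calculator tool to a chat model via an OpenAI-compatible
# client. Endpoint, model name, and tool schema are illustrative assumptions.
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_KEY")  # assumed endpoint

tools = [{
    "type": "function",
    "function": {
        "name": "calculator",  # hypothetical tool for illustration
        "description": "Evaluate a basic arithmetic expression.",
        "parameters": {
            "type": "object",
            "properties": {"expression": {"type": "string"}},
            "required": ["expression"],
        },
    },
}]

response = client.chat.completions.create(
    model="deepseek-chat",  # placeholder model name
    messages=[{"role": "user", "content": "What is 37 * 671?"}],
    tools=tools,
)
# If the model chooses to call the tool, the client runs it and returns the
# result in a follow-up message (omitted here for brevity).
print(response.choices[0].message.tool_calls)
```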
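DeepSeek Sparse Attention is described as improving long-context efficiency by having each query attend to a selected subset of past tokens. The sketch below shows the generic idea of per-query top-k key selection only; it is not DeepSeek's indexer or kernel, and a real implementation would avoid materializing the full score matrix.

```python
# Illustrative top-k sparse attention: each query keeps only its k
# highest-scoring keys instead of attending over the full sequence.
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k=64):
    # q, k, v: (seq_len, d_model). For clarity this computes dense scores
    # first; an efficient kernel would select keys without doing so.
    d = q.size(-1)
    scores = q @ k.T / d ** 0.5                        # (seq, seq) similarity scores
    top_k = min(top_k, k.size(0))
    idx = scores.topk(top_k, dim=-1).indices           # best k keys per query
    mask = torch.full_like(scores, float("-inf"))
    mask.scatter_(-1, idx, 0.0)                        # 0 where kept, -inf elsewhere
    weights = F.softmax(scores + mask, dim=-1)         # softmax over selected keys only
    return weights @ v                                 # (seq, d_model)

q = k = v = torch.randn(1024, 128)
out = topk_sparse_attention(q, k, v, top_k=64)
print(out.shape)  # torch.Size([1024, 128])
```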
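The 671-billion-total versus roughly 37-billion-active figures reflect Mixture-of-Experts routing, where a gate sends each token to only a few experts, so most parameters sit idle for any given token. The toy layer below illustrates that ratio; its sizes and routing are invented for illustration and are vastly smaller than the reported configuration.

```python
# Toy Mixture-of-Experts layer: a router picks top-k experts per token, so
# only a small fraction of the layer's parameters are active for any token.
import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=16, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(n_experts))
        self.top_k = top_k

    def forward(self, x):                                   # x: (tokens, d_model)
        gates = self.router(x).softmax(dim=-1)              # routing probabilities
        weights, idx = gates.topk(self.top_k, dim=-1)       # top-k experts per token
        weights = weights / weights.sum(-1, keepdim=True)   # renormalize over selected experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                sel = idx[:, slot] == e                      # tokens routed to expert e
                if sel.any():
                    out[sel] += weights[sel, slot, None] * self.experts[e](x[sel])
        return out

moe = ToyMoE()
tokens = torch.randn(8, 64)
print(moe(tokens).shape)  # torch.Size([8, 64]); only 2 of 16 experts ran per token
```

In this toy case only 2 of 16 experts run per token, which is the same mechanism behind quoting a large total parameter count alongside a much smaller active count.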