Overview
- NVIDIA said its Blackwell platform runs both DeepSeek‑V4‑Pro and DeepSeek‑V4‑Flash, and that it measured more than 150 tokens per second per user on a GB200 NVL72 system.
- Alibaba Cloud launched API access for both DeepSeek‑V4 models on its Bailian platform with prices that match DeepSeek’s official rates.
- Moore Threads and FlagOS completed day‑zero support for DeepSeek‑V4‑Flash on the MTT S5000 GPU, using its FP8 Tensor Cores to narrow the data width and reduce memory pressure.
- DeepSeek said the Pro and Flash variants both support one‑million‑token context windows and are released under the MIT license, which allows broad commercial integration.
- Inside NVIDIA, more than 10,000 employees now use a GPT‑5.5‑based version of Codex that runs on GB200 NVL72 systems.
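
To illustrate why a narrower data width eases memory load, as noted for the FP8 support above, here is a rough sketch of the arithmetic. The parameter count is a hypothetical round number chosen only for illustration, not a published DeepSeek‑V4 figure, and the helper `weight_memory_gib` is our own name. It counts raw weight storage only and ignores activations, caches, and any scaling metadata a real FP8 scheme may carry.

```python
def weight_memory_gib(num_params: int, bits_per_value: int) -> float:
    """Return the raw storage for num_params values, in GiB."""
    bytes_needed = num_params * bits_per_value / 8
    return bytes_needed / 2**30


# Hypothetical parameter count for illustration only (assumption, not a model spec).
EXAMPLE_PARAMS = 10_000_000_000

fp16_gib = weight_memory_gib(EXAMPLE_PARAMS, 16)  # 16-bit baseline
fp8_gib = weight_memory_gib(EXAMPLE_PARAMS, 8)    # 8-bit FP8 storage

print(f"16-bit weights: {fp16_gib:.1f} GiB")
print(f"FP8 weights:    {fp8_gib:.1f} GiB")
```

Under these assumptions, storing each value in 8 bits instead of 16 halves the weight footprint, and the same ratio applies to the bytes moved per value read from memory.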