Particle News: NVIDIA Confirms Blackwell Support for DeepSeek‑V4 at More Than 150 Tokens per Second per User

Overview

NVIDIA said its Blackwell platform runs both DeepSeek‑V4‑Pro and DeepSeek‑V4‑Flash, and that it measured more than 150 tokens per second per user on a GB200 NVL72.
Alibaba Cloud launched API access for both DeepSeek‑V4 models on its Bailian platform with prices that match DeepSeek’s official rates.
Moore Threads and FlagOS completed day‑zero support for DeepSeek‑V4‑Flash on the MTT S5000 GPU, using its FP8 Tensor Cores to narrow the data width to 8 bits and reduce memory load.
DeepSeek said the Pro and Flash variants both support one‑million‑token context windows and are released under the MIT license, which allows broad commercial integration.
Inside NVIDIA, more than 10,000 employees now use a GPT‑5.5‑based version of Codex that runs on GB200 NVL72 systems.