Particle News: Nvidia Launches Nemotron 3 Super With 1‑Million‑Token Context for Multi‑Agent AI

Overview

Nvidia released Nemotron 3 Super as an open‑weight model designed for large‑scale agentic systems, activating 12 billion of its 120 billion parameters at inference.
The model uses a hybrid Mamba‑plus‑transformer design with a latent mixture‑of‑experts, multi‑token prediction, and NVFP4 support on Blackwell to maximize throughput and efficiency.
Nemotron 3 Super offers a 1‑million‑token context window intended to preserve full workflow state and reduce goal drift in long, multi‑step tasks.
The model is available now on build.nvidia.com, Perplexity, OpenRouter and Hugging Face, with enterprise access through Google Vertex AI and Oracle OCI. Support is expected soon on AWS Bedrock and Microsoft Azure, and a NIM microservice enables on‑prem and cloud deployment.
Nvidia is publishing its methodology, datasets exceeding 10 trillion tokens, and 15 reinforcement‑learning environments. Early reports point to high speed: Artificial Analysis reports 478 tokens per second, and Nvidia claims up to 5x higher throughput and up to 2x higher accuracy than the prior Nemotron Super. PinchBench agent testing reported a score of 85.6%.
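
To put two of the figures above in perspective, here is a minimal back-of-the-envelope sketch in Python. It uses only numbers quoted in this overview (12 billion active of 120 billion total parameters, and the 478 tokens per second reported by Artificial Analysis). The function names and the 10,000-token response length are illustrative assumptions, and the timing estimate assumes the reported rate is sustained and ignores time to first token, so real-world latency will likely differ.

```python
# Back-of-the-envelope arithmetic on figures quoted in the article.
# The helper names and the 10,000-token example are illustrative assumptions.

TOTAL_PARAMS = 120e9        # total parameters, per the article
ACTIVE_PARAMS = 12e9        # parameters active at inference, per the article
TOKENS_PER_SECOND = 478.0   # output speed reported by Artificial Analysis


def active_fraction(active: float, total: float) -> float:
    """Share of parameters active for each token."""
    return active / total


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Naive time to emit num_tokens at a constant output rate.

    Assumes the rate is sustained and ignores time to first token.
    """
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"Active share of parameters: {share:.0%}")

    hypothetical_tokens = 10_000  # assumed response length, not from the article
    seconds = generation_seconds(hypothetical_tokens, TOKENS_PER_SECOND)
    print(f"{hypothetical_tokens} tokens at {TOKENS_PER_SECOND:.0f} tok/s: {seconds:.1f} s")
```

Under these assumptions, about one tenth of the model's parameters are active per token, and a 10,000-token response would take roughly 21 seconds of pure generation time.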