Particle.news

Nvidia Launches Nemotron 3 Super, an Open 120B Model for Agentic, Long‑Context AI

Independent tests show top‑tier throughput and competitive accuracy, and access is already live on major model hubs and select clouds.

Overview

  • Nemotron 3 Super runs a hybrid Mamba‑Transformer Mixture‑of‑Experts design with 120 billion parameters (12 billion active) and a native 1‑million‑token context window for long, multi‑agent workflows.
  • Nvidia is releasing open weights plus training artifacts, including over 10 trillion tokens of datasets and 15 reinforcement‑learning environments, with deployment as an NIM microservice on workstations, in data centers, or in the cloud.
  • Nvidia claims up to 5× higher throughput and up to 2× higher accuracy versus the previous Nemotron Super, with NVFP4 on Blackwell cutting memory use and enabling up to 4× faster inference than FP8 on Hopper.
  • Artificial Analysis reports an overall intelligence score around 36 and about 478 output tokens per second—faster than prior models—placing it below leading proprietary systems on raw intelligence but strong on speed and efficiency.
  • Availability spans build.nvidia.com, Perplexity, OpenRouter and Hugging Face, with enterprise access on Google Cloud Vertex AI and Oracle Cloud Infrastructure and integrations by partners such as Perplexity, Palantir, Amdocs, Cadence and Siemens.
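Since the model is listed on OpenRouter, which exposes an OpenAI‑compatible chat‑completions API, a minimal request can be assembled as below. This is a sketch only: the model identifier `nvidia/nemotron-3-super` is an assumption (check the hub's listing for the exact slug), and the request is built but not sent.

```python
import json

# Assumed model slug -- verify against the OpenRouter or build.nvidia.com listing.
MODEL_ID = "nvidia/nemotron-3-super"

def build_chat_request(prompt: str, model: str = MODEL_ID, max_tokens: int = 256) -> dict:
    """Assemble a JSON body in the OpenAI-compatible /chat/completions format."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Summarize long-context agentic workflows in one sentence.")
# The body would be POSTed to an OpenAI-compatible endpoint
# (e.g. https://openrouter.ai/api/v1/chat/completions) with a bearer token.
print(json.dumps(payload, indent=2))
```

The same payload shape works against any of the OpenAI‑compatible hosts mentioned above; only the base URL and API key change.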