Particle.news

Nvidia Unveils Cosmos 3, an Open World Model for Robots and Vehicles

The model aims to cut the cost and risk of training physical AI through physics-aware action data generation, including synthetic rare scenarios.

Overview

  • Cosmos 3, which Nvidia unveiled Monday, June 1, 2026, is an open multimodal world model built to let robots and autonomous vehicles predict, simulate and generate action trajectories before they move.
  • Nvidia says the model was trained on about 20 trillion multimodal tokens that include nearly a billion images, 400 million real and synthetic videos, ambient audio, text and action data from humans and robots.
  • The system uses a paired mixture-of-transformers design to process vision, sound, text and action inputs and to output predictive video of future states and explicit robot actions such as joint angles and gripper trajectories.
  • Nvidia has released two sizes so far: Cosmos 3 Super for high-fidelity physics tasks and Cosmos 3 Nano for fast responses. The company said an Edge variant for local inference on devices will arrive soon.
  • By offering open models and forming an industry coalition with initial partners such as Agile Robots and Black Forest Labs, Nvidia hopes to lower data and safety barriers for developers and make its platform central to physical AI development.