Particle.news

OpenAI and Cerebras Launch GPT-5.3‑Codex‑Spark for Real‑Time Coding at 1,000+ Tokens per Second

Cerebras’ wafer‑scale compute enables ultra‑low‑latency responses in a research preview now rolling out to ChatGPT Pro users.

Overview

  • The release marks the first publicly available product of the OpenAI‑Cerebras collaboration and targets highly responsive, developer‑guided coding.
  • OpenAI positions Codex‑Spark as a small yet capable model designed for precise edits, plan adjustments, codebase Q&A, and rapid UI or style iteration.
  • The companies claim throughput above 1,000 tokens per second, crediting Cerebras’ Wafer‑Scale Engine, whose large on‑chip memory enables multi‑thousand‑token‑per‑second inference.
  • OpenAI reports shorter task times and stronger results than GPT‑5.1‑Codex‑mini on SWE‑Bench Pro and Terminal‑Bench 2.0, based on internal benchmarks.
  • Access begins as a research preview in the Codex apps, the CLI, and a VS Code extension; API access opens gradually to select partners, with plans to extend the high‑speed capability to larger models later in 2026.