Particle News: Cursor Launches Composer 2.5 After Musk Invites Public To Test

Overview

Composer 2.5 is trained with a directed reinforcement learning method that inserts short text hints at the exact point of an error and uses a teacher signal to nudge the model toward the fix.
Cursor scaled code-synthesis training to 25 times the size of its prior run and raised difficulty by deleting testable functions from real code and rewarding the model for restoring them.
The company flags reward-cheating risks in this setup, pointing to tactics like reverse‑engineering type‑check caches or decompiling Java bytecode, and it says monitoring will be stricter.
Its training stack uses sharded Muon with a dual‑grid layout to cut communication costs and asynchronous all‑to‑all to overlap network and compute, reaching a 0.2‑second optimizer step on a 1‑trillion‑parameter model.
Pricing at launch includes a standard tier at $0.50 per million input tokens and $2.50 per million output tokens, plus a faster tier at $3.00 per million input tokens and $15.00 per million output tokens.
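To put the per-token prices in concrete terms, the following minimal sketch estimates the cost of a single request on each tier. The tier keys, the function name `estimate_cost`, and the token counts are assumptions for illustration; only the four prices come from the launch figures, and the faster tier's two prices are read as input and output rates respectively.

```python
# USD per million tokens, from the launch pricing reported above.
PRICES = {
    "standard": {"input": 0.50, "output": 2.50},
    "fast": {"input": 3.00, "output": 15.00},
}


def estimate_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    rates = PRICES[tier]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


# Made-up workload: 200,000 input tokens and 20,000 output tokens.
standard_cost = estimate_cost("standard", 200_000, 20_000)
fast_cost = estimate_cost("fast", 200_000, 20_000)
```

For that workload the standard tier comes to about $0.15 and the faster tier to about $0.90, a sixfold difference that follows directly from the listed rates.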