Overview
- Nanochat is a full‑stack, minimal codebase that covers tokenization, pretraining, midtraining, supervised finetuning, evaluation, inference, and a simple web UI.
- A single speedrun script trains and serves a usable model in roughly four hours on an 8×H100 node billed at ~$24 per hour (about $100 total), writing a report.md that summarizes its modest evaluation results.
- The default regimen pretrains on ~24GB of text from the fineweb‑edu‑100b‑shuffle dataset, midtrains on SmolTalk, MMLU auxiliary, and GSM8K, then supervised‑finetunes on ARC‑Easy, ARC‑Challenge, GSM8K, and SmolTalk.
- A third‑party build of the model is now on Hugging Face at sdobson/nanochat, and community experiments show it running on CPU on macOS using a lightweight script.
- Karpathy outlines larger, higher‑cost tiers, including a ~$300 depth‑26 run and a ~$1,000 tier, which are described but not yet merged into the main branch.
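The speedrun cost figures above are consistent with a quick back‑of‑the‑envelope check; the sketch below just multiplies the approximate numbers quoted in this overview (the variable names are illustrative, not from the nanochat codebase):

```python
# Rough cost check for the speedrun, using the approximate
# figures quoted above: ~4 hours at ~$24/hour for the GPU node.
HOURLY_RATE_USD = 24   # assumed approximate on-demand price
RUN_HOURS = 4          # assumed approximate wall-clock time

total_usd = HOURLY_RATE_USD * RUN_HOURS
print(f"estimated cost: ~${total_usd}")  # ~$96, i.e. roughly the $100 quoted
```

This is only a sanity check on the quoted numbers; actual cost varies with the provider's hourly rate and the run's exact duration.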