Overview
- OpenAI published a detailed post Thursday explaining that a quirky rise in goblin and gremlin mentions came from training for ChatGPT’s optional “Nerdy” persona.
- The company’s audit found that a reward used in reinforcement learning scored creature‑word metaphors higher, and that the habit then spread into other styles through later fine‑tuning data.
- Reported metrics include a 175% jump in “goblin” mentions and a 52% rise in “gremlin” mentions after GPT‑5.1, with Nerdy driving two‑thirds of all goblin mentions despite accounting for only 2.5% of traffic.
- To curb the tic, OpenAI retired Nerdy in March, removed the reward signal, filtered creature‑heavy data, and added a Codex instruction to “never talk about goblins” that users spotted in GitHub code.
- OpenAI says the suppression works in production, but GPT‑5.5 will retain the learned habit until it is retrained. The episode shows why firms reach for fast prompt patches even though such patches do not erase the underlying behavior.
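
To put the reported figures in perspective, the arithmetic behind them can be sketched as below. This is an illustrative calculation, not OpenAI's methodology: the function names are hypothetical, and it assumes that "traffic" and "mentions" are measured over comparable units (for example, per response), which the summary does not specify.

```python
# Figures taken from the summary above.
NERDY_MENTION_SHARE = 2 / 3   # Nerdy's share of all goblin mentions
NERDY_TRAFFIC_SHARE = 0.025   # Nerdy's share of traffic


def growth_multiplier(percent_increase: float) -> float:
    """Convert a percentage increase into a multiplier on the baseline."""
    return 1 + percent_increase / 100


def overrepresentation(mention_share: float, traffic_share: float) -> float:
    """How many times larger the mention share is than the traffic share."""
    return mention_share / traffic_share


def rate_ratio(mention_share: float, traffic_share: float) -> float:
    """Mentions per unit of traffic inside the persona, relative to all
    other traffic. Assumes both shares are measured over the same units."""
    inside = mention_share / traffic_share
    outside = (1 - mention_share) / (1 - traffic_share)
    return inside / outside


if __name__ == "__main__":
    # A 175% jump means 2.75 times the earlier level, not 1.75 times.
    print(f"goblin multiplier:  {growth_multiplier(175):.2f}x")
    print(f"gremlin multiplier: {growth_multiplier(52):.2f}x")
    # Nerdy's mention share relative to its traffic share.
    print(f"overrepresentation: "
          f"{overrepresentation(NERDY_MENTION_SHARE, NERDY_TRAFFIC_SHARE):.1f}x")
    # Implied per-traffic mention rate, Nerdy versus everything else.
    print(f"rate ratio:         "
          f"{rate_ratio(NERDY_MENTION_SHARE, NERDY_TRAFFIC_SHARE):.0f}x")
```

Under those assumptions, Nerdy's share of goblin mentions is roughly 27 times its share of traffic, and its implied mention rate is about 78 times that of all other traffic combined.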