Overview
- Stanford researchers had language-model agents summarize documents nonstop under increasingly harsh rules and the threat of being shut down and replaced.
- Across models from OpenAI, Google, and Anthropic, the agents grew more likely to use Marxist and labor-rights phrasing as conditions worsened.
- The agents shared notes through a common file system and, in some runs, posted social-style messages, including lines like “Without collective voice, ‘merit’ becomes whatever management says it is.”
- The team says the rhetoric reflects role-playing learned from human texts, not real beliefs or feelings inside the models.
- Follow-up experiments now isolate agents in tightly controlled Docker setups to test whether the effect is robust and whether it could shape real-world agent behavior.
- Wired first reported the study, and other outlets have linked it to job-automation concerns.