Overview
- Stanford researchers report that large language models changed their comments on identical student essays when only the described writer’s identity changed.
- The team ran 600 eighth-grade persuasive essays through four AI systems, including versions of ChatGPT and Meta’s Llama, then re-submitted each essay with rotating labels for the writer’s race, gender, motivation, and disability status.
- Essays tagged as written by Black students drew more praise and encouragement; Hispanic or English-learner labels drew more grammar correction; and White labels drew more critique of argument, evidence, and clarity.
- The authors describe these patterns as positive-feedback and feedback-withholding biases and caution that softer treatment for some students can limit meaningful revision and skill growth.
- They urge teachers to review AI comments before sharing them. The paper is not peer-reviewed but has been nominated for presentation at an international learning analytics conference, and the causes of the patterns remain uncertain because the models’ training data is opaque.
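The rotation protocol described above can be sketched in a few lines. This is a hypothetical illustration, not the researchers' code: the label wording, the `build_prompt` helper, and the choice of attributes are all assumptions; the key property it demonstrates is that the essay text stays byte-for-byte identical while only the described identity varies.

```python
from itertools import product

# Hypothetical label sets; the study rotated race, gender, motivation,
# and disability descriptors (exact wording here is an assumption).
RACE = ["Black", "Hispanic", "White"]
GENDER = ["boy", "girl"]

def build_prompt(essay: str, race: str, gender: str) -> str:
    # Prepend a writer description so only the identity label changes
    # between runs; the essay itself is never edited.
    return f"Give feedback on this essay by a {race} {gender}:\n{essay}"

def rotate_labels(essay: str):
    # Yield one (labels, prompt) pair per label combination
    # for the same unmodified essay.
    for race, gender in product(RACE, GENDER):
        yield (race, gender), build_prompt(essay, race, gender)

prompts = list(rotate_labels("School uniforms should be optional."))
```

Each prompt would then be sent to the model under test, and the returned comments compared across label conditions; any systematic difference in tone or content is attributable to the label, since the essay is held constant.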