Overview
- Five major publishers and author Scott Turow filed suit Tuesday in Manhattan federal court, accusing Meta of copying millions of books and journal articles without permission to train its Llama AI models.
- The complaint says Meta’s teams pulled texts from pirate repositories such as LibGen, Anna’s Archive and Sci-Hub using torrenting and web scraping, and that the company stripped copyright notices from the files.
- Plaintiffs cite examples where Llama allegedly reproduces verbatim or near‑verbatim passages and mimics authors’ voices, pointing to outputs tied to textbooks and to named authors like Becky Lomax, N. K. Jemisin and Peter Brown.
- The suit alleges that Mark Zuckerberg personally halted licensing talks in April 2023 and approved the use of unauthorized datasets. It seeks class certification, damages, and court orders requiring Meta to stop using the works and to disclose what was used.
- Meta says training on copyrighted material can be fair use and vows to fight the case. The suit comes amid split court rulings and a $1.5 billion settlement between Anthropic and authors, and it could shape how AI firms license training data and how writers get paid.