Particle News: Publishers Curb Internet Archive Access to Thwart AI Scraping

Overview

The Financial Times says it blocks bots operated by OpenAI, Anthropic, Perplexity, and the Internet Archive from scraping its paywalled site.
Because most FT stories are paywalled, typically only unpaywalled FT articles appear in the Wayback Machine, according to Matt Rogerson.
The New York Times confirms it is blocking the Internet Archive’s bot, citing unauthorized, unfettered access that could be used by AI companies.
Industry executives say the Internet Archive’s API offers structured data that makes it an attractive target for AI scrapers, while the Wayback Machine is considered less risky.
Reddit previously cut off the Internet Archive over API concerns. Shifting bot rules and archival policies are already limiting what the public can find in web archives.
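
For readers curious how bot blocking of this kind is commonly expressed, the sketch below uses Python's standard `urllib.robotparser` to evaluate a hypothetical robots.txt. It is illustrative only: the source does not say which mechanism any publisher uses, the crawler names (`GPTBot`, `ClaudeBot`, `PerplexityBot`, `archive.org_bot`) are assumed user-agent tokens for the companies named above, and `example.com` is a placeholder, not any publisher's real configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the crawler tokens below are assumptions chosen
# for illustration, not rules taken from any publisher's actual site.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: archive.org_bot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the sample rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/news/some-article"  # placeholder URL
    for agent in ("GPTBot", "ClaudeBot", "PerplexityBot",
                  "archive.org_bot", "SomeOtherCrawler"):
        verdict = "allowed" if is_allowed(agent, url) else "blocked"
        print(f"{agent}: {verdict}")
```

Rules like these are advisory: a crawler has to choose to honor them, so sites that want enforcement may also filter requests at the server or network level.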