Headline Roundup • August 13th, 2025
Reddit Blocks Wayback Machine From Archiving Site
Summary from the AllSides News Team
Reddit severely limited the Internet Archive Wayback Machine’s access on Monday after accusing AI companies of violating site policies by scraping Reddit data from the archive.
The Details: Going forward, Reddit will only allow the Wayback Machine to archive its homepage.
For Context: The Wayback Machine is an online tool that allows users to view websites as they’ve previously appeared on certain days. Reddit has recently been restricting access to its data, particularly for AI training purposes, and requiring companies to pay for it instead. The company previously made deals with Google and OpenAI and took legal action against Anthropic, alleging the company continued to scrape its data for AI models despite stating it had stopped.
Key Quotes: Reddit spokesperson Tim Rathschmidt said, “Internet Archive provides a service to the open web, but we’ve been made aware of instances where AI companies violate platform policies, including ours, and scrape data from the Wayback Machine.” Wayback Machine director Mark Graham said, “We have a longstanding relationship with Reddit and continue to have ongoing discussions about this matter.”
How The Media Covered It: The story was not widely covered by mainstream media outlets. Blaze Media (Right bias) credited The Verge (Lean Left) for the report that Reddit’s restrictions on the Wayback Machine would ramp up on Monday. Fast Company (Lean Left) framed the move as part of broader “AI data wars” in which AI companies are trying to improve the sourcing of their models and Reddit is becoming more protective of its on-site content.
Written by the AllSides staff (of humans).
Featured Coverage of this Story
Reddit says that it has caught AI companies scraping its data from the Internet Archive’s Wayback Machine, so it’s going to start blocking the Internet Archive from indexing the vast majority of Reddit. The Wayback Machine will no longer be able to crawl post detail pages, comments, or profiles; instead, it will only be able to index the Reddit.com homepage, which effectively means Internet Archive will only be able to archive insights into which news headlines and posts were most popular on a given day.

As the battle to train artificial intelligence models becomes more intense and Reddit’s rich content library becomes more valuable, the social media giant has taken steps to block the Internet Archive from indexing its pages.
While the Wayback Machine has historically recorded all Reddit pages, comments and user profiles, the company has put limits on what the system can scrape. Moving forward, it will only be permitted to archive the site’s homepage, which shows popular posts and news headlines of the day, but not user comments or post history.
As artificial intelligence models continue to grow and develop, their demand for more and more data also increases rapidly. Now, some companies are making it tougher for AI scrapes to happen, unless companies pay a price.
Reddit has announced that it will severely limit the access of the Internet Archive's Wayback Machine to the platform, following its accusation that AI companies have been scraping the website for Reddit data. The platform will allow the Internet Archive to save only the home page of its website.