Headline Roundup • August 13th, 2025

Reddit Blocks Wayback Machine From Archiving Site

Technology, AI, Artificial Intelligence, Reddit, Internet, Data

Summary from the AllSides News Team

Reddit severely limited the access of the Internet Archive’s Wayback Machine on Monday after accusing AI companies of violating platform policies by scraping Reddit data from the archive.

The Details: Going forward, Reddit will only allow the Wayback Machine to archive its homepage.

For Context: The Wayback Machine is an online tool that allows users to view websites as they’ve previously appeared on certain days. Reddit has recently been restricting access to its data, particularly for AI training purposes, and requiring companies to pay for it instead. The company previously made deals with Google and OpenAI and took legal action against Anthropic, alleging the company continued to scrape its data for AI models despite stating it had stopped.

Key Quotes: Reddit spokesperson Tim Rathschmidt said, “Internet Archive provides a service to the open web, but we’ve been made aware of instances where AI companies violate platform policies, including ours, and scrape data from the Wayback Machine.” Wayback Machine director Mark Graham said, “We have a longstanding relationship with Reddit and continue to have ongoing discussions about this matter.”

How The Media Covered It: The story was not widely covered by mainstream media outlets. Blaze Media (Right bias) credited The Verge (Lean Left) for reporting that Reddit’s limits on the Wayback Machine would ramp up on Monday. Fast Company (Lean Left) framed the move as part of broader “AI data wars” in which AI companies are trying to improve the sourcing of their models and Reddit is becoming more protective of its on-site content.

Written by the AllSides staff (of humans).

Featured Coverage of this Story

Reddit will block the Internet Archive

The Verge

News

Reddit says that it has caught AI companies scraping its data from the Internet Archive’s Wayback Machine, so it’s going to start blocking the Internet Archive from indexing the vast majority of Reddit. The Wayback Machine will no longer be able to crawl post detail pages, comments, or profiles; instead, it will only be able to index the Reddit.com homepage, which effectively means Internet Archive will only be able to archive insights into which news headlines and posts were most popular on a given day.

AI data wars push Reddit to block the Wayback Machine

Fast Company

News

As the battle to train artificial intelligence models becomes more intense and Reddit’s rich content library becomes more valuable, the social media giant has taken steps to block the Internet Archive from indexing its pages.

While the Wayback Machine has historically recorded all Reddit pages, comments and user profiles, the company has put limits on what the system can scrape. Moving forward, it will only be permitted to archive the site’s homepage, which shows popular posts and news headlines of the day, but not user comments or post history.

Reddit bars Internet Archive from its website, sparking access concerns

Blaze Media

News

As artificial intelligence models continue to grow and develop, their demand for more and more data also increases rapidly. Now, some companies are making it tougher for AI scrapes to happen, unless companies pay a price.

Reddit has announced that it will be severely limiting the Internet Archive's Wayback Machine's access to the communication platform following its accusation that AI companies have been scraping the website for Reddit data. The platform will only be allowing the Internet Archive to save the home page of its website.
