Internet Archive Study Reveals 35% of New Websites Are AI-Generated Since 2022
Science

Source: 404Media Original Author: Matthew Gault 2 min read Intelligence Analysis by Gemini

Signal Summary

A study found that by mid-2025, 35% of newly published websites were AI-generated or AI-assisted, up from zero before ChatGPT's launch in late 2022.

Explain Like I'm Five

"Smart computer programs now make lots of new websites. By 2025, about one out of every three new websites was made with a computer's help, and before late 2022 almost none were. Scientists found that these computer-made websites tend to sound happier and more alike, but they aren't necessarily full of lies."

Original Reporting
404Media

Deep Intelligence Analysis

A collaborative study by researchers from Stanford, Imperial College London, and the Internet Archive provides quantitative evidence of how quickly generative AI has reshaped online content. By mid-2025, 35% of newly published websites were classified as either AI-generated or AI-assisted, up from essentially zero before ChatGPT's public launch in late 2022. A shift of that size in under three years marks a fundamental change in how online information is produced, from predominantly human-authored to a hybrid or AI-first model.

The research drew on the Internet Archive's Wayback Machine, sampling websites from August 2022 to May 2025, and used the Pangram v3 AI-detection software. The results put the "Dead Internet Theory" in a new light: before ChatGPT's release in late 2022, the proportion of AI-generated websites was negligible. The study also tested six common critiques of AI-generated text and confirmed only two. Contrary to widespread fears, the researchers did not find that AI-generated content led to more factual inaccuracies or to a failure to cite sources. The confirmed effects were reduced semantic diversity and a more positive tone.

The implications of this rapid AI integration are multifaceted. While concerns about disinformation may be partially alleviated by these findings, the homogenization of online discourse and the potential for a less semantically rich internet present new challenges. The sheer volume of AI-generated content could fundamentally alter search engine optimization, content discovery, and the perceived authenticity of online information. This transformation necessitates a re-evaluation of content strategies for publishers, a focus on AI literacy for consumers, and continued research into the long-term effects on human creativity and critical thinking in an increasingly AI-permeated digital environment.
AI-assisted intelligence report · EU AI Act Art. 50 compliant

Impact Assessment

This research provides quantitative evidence of AI's rapid and significant impact on the internet's content landscape, confirming a substantial shift in how digital information is created. It challenges some prevailing assumptions about AI-generated text, particularly regarding disinformation and source citation, while highlighting new concerns about content homogenization.

Key Details

  • Researchers from Stanford, Imperial College London, and the Internet Archive conducted the study.
  • 35% of newly published websites by mid-2025 were classified as AI-generated or AI-assisted.
  • This figure is up from zero before ChatGPT's launch in late 2022.
  • The study sampled websites from August 2022 to May 2025 using the Wayback Machine.
  • The study used Pangram v3 AI-detection software, which it reports had the highest detection rate.
  • Only two of six common critiques of AI text were confirmed: less semantic diversity and a more positive tone.
  • The study did not find that AI-generated text spread more falsehoods or omitted source citations.
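
The headline figure is a share: of the websites sampled for a given period, what fraction did the detector classify as AI-generated or AI-assisted? The sketch below shows that arithmetic only. It is not the researchers' pipeline; the function name `ai_share_by_month`, the `(month, is_ai)` input shape, and the toy labels are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def ai_share_by_month(samples):
    """Return {month: fraction of sampled sites labeled AI}.

    `samples` is an iterable of (month, is_ai) pairs, where `month` is a
    "YYYY-MM" string and `is_ai` is a detector's boolean verdict.
    (Hypothetical input shape, not the study's actual data format.)
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for month, is_ai in samples:
        totals[month] += 1
        if is_ai:
            flagged[month] += 1
    # Sort months so the result reads chronologically.
    return {month: flagged[month] / totals[month] for month in sorted(totals)}


# Toy labels invented for this example; they are NOT results from the study.
toy_samples = [
    ("2022-08", False), ("2022-08", False), ("2022-08", False), ("2022-08", False),
    ("2025-05", True), ("2025-05", False), ("2025-05", False), ("2025-05", True),
]
shares = ai_share_by_month(toy_samples)
print(shares)
```

Any such share depends on both the sampling and the detector's error rates, which is why the choice of detection software matters to the reported number.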

Optimistic Outlook

The rapid adoption of AI for website generation could democratize content creation, enabling more individuals and small businesses to establish an online presence efficiently. If AI tools improve in diversity and factual accuracy, they could significantly boost productivity and content volume without necessarily degrading overall quality.

Pessimistic Outlook

A web dominated by AI-generated content risks a homogenization of voice and style, potentially leading to a less diverse and engaging internet experience. While the study found no increase in lies, the sheer volume of AI-generated text could still make it harder to discern authoritative human-created content, impacting trust and information discovery.
