Web crawlers are draining Wikipedia’s resources and infrastructure capacity

The news: The Wikimedia Foundation, which hosts Wikipedia, is seeing unprecedented traffic surges from web crawlers scraping data for AI training.

  • Bandwidth for downloading multimedia files has grown 50% since January 2024, threatening infrastructure sustainability and stretching data center resources thin.
  • Bots account for 65% of its most expensive traffic. Wikimedia said in a blog post that this is disproportionate, since bots account for only 35% of overall page views.
