Snap Inc. Faces Class Action Lawsuit over Allegedly Scraping Millions of YouTube Videos to Train its AI

Aashir Ashfaq

Snap Inc., the parent company of Snapchat, is facing a proposed class action lawsuit filed on January 23, 2026, accusing the social media giant of illegally downloading millions of YouTube videos to train its generative AI model, bypassing the platform’s protective measures and violating creators’ intellectual property rights. The case, captioned Ted Entertainment Inc. et al. v. Snap Inc. (Case No. 2:26-cv-00754), was filed in California under the Digital Millennium Copyright Act (DMCA).

How the Scraping Allegedly Worked

According to the 22-page lawsuit, Snap used backend automated tools to fraudulently download the visual and audio files of millions of YouTube videos through a process known as scraping. The complaint alleges Snap specifically accessed two large AI training datasets, HD-VILA-100M and Panda-70M, which contain detailed information about millions of YouTube videos.

To avoid detection, Snap allegedly used video-downloading programs and virtual machines that rotated IP addresses to circumvent YouTube’s technological protection measures. The lawsuit argues this goes far beyond the “ordinary” use of a site visitor, constituting a deliberate and systematic effort to extract data at scale.

YouTube’s Terms Were Allegedly Ignored

YouTube’s Terms of Service explicitly prohibit scraping, unauthorized downloading, bulk extraction, and other forms of data mining of audiovisual content, except through expressly permitted features or licensed application programming interfaces. The complaint argues that these contractual restrictions operate alongside the platform’s technical safeguards to prevent exactly the kind of access Snap is alleged to have carried out, regardless of whether the videos in question are formally registered for copyright.

“[Snap]’s actions were not only unlawful, but an unconscionable attack on the community of content creators whose content is used to fuel the multi-trillion-dollar generative AI industry without any compensation,” the lawsuit states.

Who Is Behind the Lawsuit

The plaintiffs are a group of popular YouTube content creators, including the teams behind h3h3 Productions, H3 Podcast Highlights, Mr. ShortGame Golf, and Golfholics, who claim that hundreds of their videos were scraped without authorization and are included in the HD-VILA-100M and Panda-70M datasets. The proposed class seeks to represent all individuals in the United States who uploaded original videos to YouTube that were partially or entirely included in those datasets.

The Commercial Motive

The lawsuit does not treat this as an accidental oversight. It argues Snap intentionally sought vast amounts of training data to develop a well-trained generative AI product that would give the company a competitive advantage over rivals in the technology and social media space, and that it could leverage for commercial gain. The complaint notes that for any AI model, the more data consumed during training, “the better the AI product” becomes.

Lessons for Brands to Avoid

This case is part of a fast-growing wave of AI training data litigation and carries clear warnings for any brand or technology company working with generative AI:

  • Do not assume public content is free to use. Content posted online is not automatically available for commercial AI training. Platform terms of service and intellectual property law apply regardless of public visibility.
  • Respect platform terms and technical protections. Circumventing a platform’s technological safeguards, even to access data that appears publicly available, can constitute a DMCA violation and expose a company to significant legal liability.
  • Secure proper licenses for training data. Brands building or partnering with AI tools should verify that all training datasets were obtained through licensed, consented, or legally permissible channels.
  • Be transparent with creators. The absence of compensation or disclosure when using creator content for commercial AI development is increasingly viewed by courts and the public as a fundamental rights violation.
  • Audit third-party AI partnerships. If your brand relies on AI tools built by vendors, ask how those models were trained. Downstream liability risk is real and growing as litigation in this space accelerates.