Reddit filed a lawsuit against generative AI company Anthropic in California court -- formally beginning the "copyright war over AI data training." The essence: beyond simple web crawling, the billions of conversation data that Reddit community has created over time have supported AI company commercial models, but were collected and used without proper compensation or permission. June 4, 2025: Reddit filed against Anthropic in San Francisco Superior Court on 5 counts (breach of contract, unjust enrichment, computer asset violation, contract interference, and copyright infringement). The allegations: Anthropic crawled Reddit content using web scrapers that violated Reddit terms of service (which prohibit automated data collection without permission); this crawled data was used to train Claude models; Reddit had established a Data API program specifically for AI company licensing (Reddit signed a 60M USD/year deal with Google for training data); Anthropic used the data without entering a licensing agreement; the unjust enrichment claim quantifies the value Anthropic received from Reddit data relative to the compensation Reddit received (zero). The broader AI training data legal landscape: Reddit-Anthropic is one of multiple parallel legal battles over AI training data -- similar cases involve New York Times vs. OpenAI and Microsoft; Getty Images vs. Stability AI; authors vs. Meta; Universal Music Group vs. AI music generation companies. The legal theory being tested: copyright law protects expression but not facts; AI training involves processing expression to extract statistical patterns; whether this constitutes infringing copying or permissible transformation is genuinely unsettled law; the Reddit lawsuit adds the contract dimension (terms of service violation) alongside copyright -- potentially a stronger claim. The industry implications: if Reddit succeeds, every AI company that crawled web content without ToS-compliant access faces similar liability; the practical outcome may be that AI companies retroactively license training data or face ongoing legal exposure.
Reddit Files First Lawsuit Against Anthropic — 'We Are Not Free Data'
Unauthorized use of data for AI model training — now from 'tech ethics' to 'courtroom battle.' Reddit filed a lawsuit against generative AI company Anthropic in California.

Source: META-X metax.kr
AI Model Training Data Unauthorized Use -- Now a "Court Battle" Not Just "Tech Ethics"
ⓒ META-X metax.kr
All rights reserved.
Free to share with attribution.
All rights reserved.
Free to share with attribution.