Bartz v. Anthropic: All you need to know about the largest copyright settlement in history
The clash between AI's feeding on data and the Constitution's promise to reward authors is playing out in a US court, and its resolution will influence how machine learning models are built and governed.
Harsh Gour
Published on: 27 September 2025, 11:29 am

ARTIFICIAL INTELLIGENCE (‘AI’) MODEL TRAINING and anthropocentric authorship have collided in Bartz v. Anthropic, the recently settled class action in which authors sued the AI firm Anthropic for training its Claude model on pirated books.
The stakes could not be higher: this $1.5 billion deal, the largest U.S. copyright settlement ever, signals to the booming AI industry that taking free literary "raw material" from shadow libraries comes at immense cost. The deal sends a message that taking copyrighted works from pirate websites is against the statutory rights granted under copyright law. Authors Guild CEO Mary Rasenberger hailed the settlement as "a vital step" and warned that AI companies "cannot simply steal authors' creative work…just because they need books to develop quality LLMs".
This clash between AI's feeding on data and the Constitution's promise to reward authors is playing out in court, and its resolution will influence how machine learning models are built and governed.
Chronology of Bartz v. Anthropic (2024–2025): In August 2024, nonfiction authors Andrea Bartz, Charles Graeber and Kirk Johnson sued Anthropic, alleging it had copied their books to train its AI without permission. In June 2025, Judge William Haskell Alsup held on summary judgment that training Claude on legally acquired books was "transformative" fair use, but he refused to excuse Anthropic's mass downloading of pirated books. In July, the court certified a class of all registered rightsholders whose works were taken from known pirate sites (LibGen and PiLiMi). The parties announced a $1.5 billion settlement on September 5, 2025, which Judge Alsup preliminarily approved at a September 25 hearing.
Systemic unpacking
Copyright law and fair use: Under the U.S. Copyright Act, authors have exclusive rights to copy and distribute their books, subject only to narrow exceptions. Section 107's "fair use" doctrine permits some unlicensed uses if they are socially valuable and don't unfairly supplant the market for the original. In AI model training cases, courts have focused on whether the model's use is transformative, that is, whether it turns the text into something qualitatively new.
Judge Alsup viewed text-based training as highly transformative: he wrote that the technology was "among the most transformative many of us will see in our lifetimes" and compared AI learning to how humans learn by reading. He held that destructively digitising lawfully purchased books and using them for model training was "quintessentially transformative" and thus protected by fair use.
In contrast, he rejected Anthropic's bid for blanket immunity for its central library of pirated books, finding that "downloading millions of pirated books to build a permanent digital library" was not justified by fair use. In other words, the court drew a line: the training process itself (on authorised inputs) is permissible, but Anthropic's underlying acquisition of those inputs via piracy remains infringement. Importantly, Judge Alsup left open the question of whether training on pirated copies would ever be fair use; he expressed skepticism that copying first from a pirate site could later be "subsumed" by training use. Skadden LLP notes that these rulings are highly fact-specific, so one cannot simply generalise that all AI training is fair use, and Judge Vince Chhabria (Kadrey v. Meta) has warned that even a strong finding of transformation "is not the end of the analysis".
Class certification: Judge Alsup certified a nationwide class of copyright owners in July 2025. The class includes all beneficial and legal copyright owners of any book versions downloaded by Anthropic from the LibGen or PiLiMi databases (so long as the work was properly registered with the U.S. Copyright Office and has an ISBN/ASIN). In practice, only about 500,000 of the roughly 7 million downloaded titles qualified under these criteria.