Zuckerberg Himself Said "Stop the Licensing"

On May 5, a complaint was filed in the U.S. District Court for the Southern District of New York. The defendants: Meta and Mark Zuckerberg. The plaintiffs: five publishers — Hachette, Macmillan, McGraw Hill, Elsevier, and Cengage — along with author Scott Turow. The charge: "one of the largest copyright infringements in history."

What sets this lawsuit apart from earlier AI copyright cases is that the plaintiffs bring concrete allegations that the CEO himself directed the infringement. Anyone in publishing should read this complaint closely. No case shows more vividly how copyright protection works in the AI era, and how it gets broken.

What's New Here

This isn't the first time authors have taken Meta to court over copyright. Thirteen writers, including Sarah Silverman and Junot Díaz, brought a similar suit against Meta and lost in June 2025, when Judge Vince Chhabria ruled that Meta's training of its Llama models qualified as fair use.

The new lawsuit takes direct aim at the foundation of that ruling. The plaintiffs are advancing two claims. First, that Meta deliberately circumvented copyright protection mechanisms. Second, that Meta was actively considering licensing agreements until Zuckerberg personally shut the effort down. If both are established as fact, the fair-use shield collapses.

The complaint pulls no punches: "Meta and Zuckerberg followed their well-known motto: move fast and break things. First they illegally torrented millions of copyrighted books and academic articles from notorious pirate sites and obtained unauthorized web scrapes of nearly the entire internet. Then they copied those stolen fruits over and over to train Llama, Meta's multibillion-dollar generative AI system."

The Smoking Gun: Meta Considered Licensing — Then Walked Away

This is the strongest card in the plaintiffs' hand. After releasing Llama 1, Meta briefly explored licensing agreements with major publishers. Between January and April 2023, the company discussed expanding its "dataset licensing" budget to as much as $200 million.

Then, in early April 2023, the licensing strategy abruptly stopped. According to the complaint, "the buy-versus-pirate question was escalated to Zuckerberg, and after it reached him, Meta's business development team received verbal instructions to halt the licensing effort."

One Meta employee explained the reasoning, as quoted in the complaint: "If we license even a single book, we can no longer lean on the fair-use strategy."

That single sentence is the heart of the case. It suggests Meta could have purchased licenses but deliberately chose not to — precisely because it understood that paying for even one would undercut its fair-use defense. Where intent is this explicit, a simple fair-use argument doesn't hold.

Three Things Publishers Should Watch

The implications for Korean publishers are significant.

First, content used to train AI leaves traces. One of the complaint's most striking claims is that Llama "rapidly generates substitutes for the plaintiffs' and class members' works at scale": verbatim or near-verbatim copies, replacement chapters for academic textbooks, summaries and alternative versions of well-known novels and scholarly articles, and inferior imitations that copy the creative elements of the originals. The complaint even alleges that "Llama tailors its output to mimic the expressive elements and creative choices of specific authors."

The upshot: training data leaves fingerprints in a model's output. The technical and legal groundwork is accumulating for publishers to trace which AI systems their books were used to train.

Second, stripping copyright management information (CMI) is a separate violation. The complaint states that Meta "removed copyright management information from the stolen copyrighted works" in order to conceal the training sources and facilitate unauthorized use. Under the U.S. Digital Millennium Copyright Act (DMCA), removing CMI gives rise to a claim of its own, distinct from the infringement claim itself.

This matters for Korean publishers too. The mere fact that copyright notices, metadata, or watermarks embedded in a digital publication were stripped during AI training can itself be grounds for damages. How you manage your publications' metadata is now a frontline copyright defense.

Third, the logic of "skip the license, claim fair use" is likely to crumble. The Meta employee's own words, as quoted in the complaint, make the point: choosing not to buy a license you could have bought is hard to shelter under fair use. However this case is decided, the public disclosure that AI companies acted deliberately to avoid licensing deals is consequential in itself.

What This Signals for the Korean Market

This is an American case, but its impact on Korean publishing is direct.

Korean publications were likely swept into the training data. The complaint states that Meta obtained "unauthorized web scrapes of nearly the entire internet." A scrape of that scope would very likely have picked up books and academic papers published in Korean. How Korean publishers respond to this will be one of the defining questions of the next few years.

The moment for licensing negotiations is approaching. While American publishers apply pressure through litigation, some AI companies are pivoting toward licensing deals. OpenAI has already signed content licenses with multiple news organizations, and Anthropic has announced content partnerships of its own. Korean publishers, too, should be assessing the licensing value of their catalogs and preparing to come to the table.

Even small publishers can build leverage through collective action. It matters that this suit is filed as a proposed class action — representing not just the five major publishers but every publisher in the same position. Individually, Korean publishers have little bargaining power; moving as an industry association or coalition changes the game entirely.

The Bill for "Move Fast and Break Things"

Meta's motto since its Facebook days — "Move fast and break things" — has come back as an invoice. The slogan was once a badge of honor for tech startups, but when what you break is someone else's rights, it goes by a different name: infringement.

The Meta employee's words quoted in the complaint capture the essence: "If we license even a single book, we can no longer lean on the fair-use strategy." That statement means Meta knew its conduct was the kind that requires a license. It knew — and didn't buy one.

A public record of how AI companies handle content is accumulating fast, and that record will decisively shape the rulings of the next several years. Meanwhile, even before those rulings arrive, negotiating tables between AI companies and content owners are being set up everywhere.

For Korean publishers, what deserves attention isn't the verdict but what surfaces along the way: which books went into the training data, which licensing negotiations were rejected, which metadata was stripped. These facts could become Korean publishers' bargaining chips.

Zuckerberg's decision to "stop the licensing" is now coming back as Meta's heaviest liability. In the AI era, those who hold content rights are not the weak party. They are the ones with a seat at the negotiating table.