Anthropic, a San Francisco-based AI startup, has been sued by a group of writers who allege that it used large numbers of pirated books without authorization to train its popular chatbot, Claude. This is the first lawsuit writers have brought against Anthropic, though similar suits have been filed steadily against its competitor OpenAI over the past year. Anthropic, a smaller company founded by former OpenAI leaders, presents itself as a developer of more responsible, safety-focused generative AI models that can compose emails, summarize documents, and hold natural conversations with people.


The lawsuit, filed Monday in federal court in San Francisco, accuses Anthropic of "making a mockery of its lofty goals" by building its AI products on pirated books. The complaint states: "It is no exaggeration to say that Anthropic's models seek to profit from the human expression and creativity behind every work."

The plaintiffs are three writers: Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, who hope to represent a class of similarly affected fiction and nonfiction authors. Although this is the first suit brought by writers against Anthropic, the company also faces litigation from major music publishers, who accuse Claude of reproducing copyrighted song lyrics.

The case joins a wave of lawsuits against large language model developers, including OpenAI and Microsoft, which are locked in legal battles with prominent writers such as John Grisham, Jodi Picoult, and George R.R. Martin, author of "A Song of Ice and Fire." Media organizations including The New York Times, the Chicago Tribune, and Mother Jones have filed suits as well.

The common thread in all these cases is the claim that tech companies used vast quantities of human-created works without authorization to train AI chatbots that then generate human-like text. The suits come not only from writers but also from visual artists, music companies, and other creators, all of whom contend that generative AI's profits are built on the exploitation of original works.

Although Anthropic and other tech companies argue that training AI models on such material qualifies as "fair use" under U.S. law, the complaint alleges that the dataset Anthropic used, known as "The Pile," contains large numbers of pirated books. It also disputes the notion that AI systems learn the way humans do, noting that humans learn by buying books legally or borrowing them from libraries, which at least provides some compensation to authors and creators.

Key Points:

📚 Writers sue Anthropic, alleging it used pirated books to train AI chatbot Claude.

⚖️ This is the first lawsuit brought by writers against Anthropic, following multiple similar suits against OpenAI.

💡 Anthropic and other companies argue that AI training is "fair use," but face intense copyright disputes.