Anthropic PBC, a Delaware corporation with its principal place of business in San Francisco, California[1], is facing allegations of copyright infringement related to its large language models (LLMs), particularly the "Claude" family[1]. A class action complaint filed in the Northern District of California asserts that Anthropic has built a multibillion-dollar business by "stealing hundreds of thousands of copyrighted books"[1]. The plaintiffs in the case, Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, are authors who claim Anthropic has infringed on their copyrights by downloading pirated versions of their works and using them to train its AI models[1].
Anthropic styles itself as a public benefit company, designed to improve humanity[1]. According to its co-founder Dario Amodei, Anthropic is “a company that’s focused on public benefit”[1]. The plaintiffs argue, however, that the alleged copyright infringement makes a mockery of these lofty goals[1]. They contend that downloading hundreds of thousands of books from a known illegal source is inconsistent with core human values and with the public benefit[1]. In their words, Anthropic has attempted to steal the fire of Prometheus and seeks to profit from strip-mining the human expression and ingenuity behind each one of those works[1].
The complaint states that Anthropic intentionally downloaded known pirated copies of books from the internet, made unlicensed copies of them, and then used those unlicensed copies to digest and analyze the copyrighted expression for its own commercial gain[1]. The plaintiffs claim the end result is a model built on the work of thousands of authors, meant to mimic the syntax, style, and themes of the copyrighted works on which it was trained[1]. This was done without seeking permission or compensating the authors for the use of their material[1].
According to the lawsuit, Anthropic has admitted to using a dataset called The Pile to train its Claude models[1]. The Pile is an 800 GB+ open-source dataset created for large language model training[1]. The complaint alleges that one of The Pile’s architects, Shawn Presser, created a dataset included in The Pile called “Books3,” which is a trove of pirated books[1]. Presser described Books3 as a direct download of all books from a separate pirate website, comprising “all of bibliotik”[1]. Bibliotik is described as a “notorious pirated collection”[1].
The plaintiffs argue that Anthropic’s Claude LLMs compromise authors’ ability to make a living, because the LLMs allow anyone to generate, automatically and freely (or very cheaply), texts that writers would otherwise be paid to create and sell[1]. The Authors Guild, the oldest professional organization representing writers and authors, recently published an earnings study showing a median writing-related income for full-time authors of just over $20,000, and showing that full-time traditional authors earn only half of that from their books[1]. The rest comes from activities like content writing, work that is starting to dry up as a result of generative AI systems that were trained, without compensation, on those writers’ works in the first place[1].
The plaintiffs are bringing this action under the Copyright Act to redress the harm caused by Anthropic’s infringement[1]. They ask the court to certify the matter as a class action, to appoint them as Class Representatives, and to appoint their attorneys as Class Counsel[1]. They also demand judgment against Anthropic awarding statutory or compensatory damages, restitution, disgorgement, and attorneys’ fees and costs, and permanently enjoining Anthropic from engaging in the infringing conduct alleged[1].
In addition to the Bartz case, another action, Concord Music Group, Inc. et al. v. Anthropic PBC, 5:24-cv-03811-EKL (N.D. Cal.) (“Concord”), also alleges copyright infringement claims against Anthropic PBC, based on Anthropic’s use of copyrighted lyrics in the development of Claude[3][4]. The Bartz plaintiffs have submitted an Administrative Motion to Consider Whether Cases Should Be Related, arguing that the Bartz suit may be related to the Concord action because both cases involve copyright infringement claims against Anthropic PBC, related to Anthropic’s development of Claude[3][5].
Judge William Alsup has set forth substantive and timing factors that he will consider in determining whether to grant preliminary and/or final approval to a proposed class settlement, focusing on what is in the best interest of absent class members[2]. These factors include adequacy of representation, due diligence, cost-benefit for absent class members, the release, reversion, claim procedure, attorney’s fees, the right to opt out, incentive payment, and notice to class members[2].
Judge Alsup generally requires plaintiff’s counsel not to engage in any class settlement discussions until after class certification, so that both sides know the specific claims suitable for settlement or trial on a class-wide basis, as well as the scope of the class membership[2]. This timing is consistent with the general principle that a settlement should usually be negotiated only after adequate and reasonable investigation and discovery by class counsel[2].
To address potential conflict-of-interest or other ethical issues that may arise from interviewing absent putative class members regarding the merits of the case, both sides are required to promptly meet and confer and to agree on a protocol for interviewing absent putative class members[2]. No interviews of absent putative class members may take place unless and until the parties’ proposed protocol is approved or permission is otherwise given[2].