Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, along with their loan-out companies, have filed a class action complaint against Anthropic PBC, alleging copyright infringement[6]. The plaintiffs claim that Anthropic built its multibillion-dollar business by illegally copying and using copyrighted books to train its Claude family of large language models (LLMs)[1][6]. The plaintiffs argue that Anthropic's actions compromise authors' ability to make a living, as the LLMs can generate texts that writers would otherwise be paid to create[6]. They contend that Anthropic has profited immensely from this copyright infringement, harming the market for authors' works[6]. Central to the case is the allegation that Anthropic knowingly used pirated materials, specifically the 'Books3' dataset, to train its models[6].
Anthropic, while acknowledging that it offers products based on LLMs, denies the core allegations of copyright infringement[7]. The company asserts that its use of copyrighted works is protected as fair use under 17 U.S.C. § 107[5][7]. It argues that LLMs learn patterns and relationships within data rather than storing the content itself, and that the responses they generate result from a predictive process, not verbatim copying[8]. Anthropic emphasizes that its AI models generate varied responses to similar prompts, which it says reflects the probabilistic nature of the technology[8]. A key point of its argument is that training does not use the works for their expression but instead extracts statistical information from the data[8]. Central to the defense is the claim that the training data is used to 'learn the patterns and connections between words,' similar to how humans learn[1]. Anthropic also disputes the plaintiffs' claim that their copyrighted works were actually used to train the AI models[7].
The plaintiffs assert that the court has subject matter jurisdiction under 28 U.S.C. §§ 1331 and 1338(a) because the action arises under the Copyright Act of 1976[1]. They also assert personal jurisdiction over Anthropic because it has purposely conducted business in the district[1]. Venue is claimed to be proper under 28 U.S.C. § 1400(a) and 28 U.S.C. § 1391(b)(2) due to Anthropic's infringing activities and commercialization of those activities within the district[1].
The court set a number of deadlines in a case management order, including:
Several key legal and factual issues have emerged as points of contention between the parties [1 1]. These include:
These issues also involve technical questions about how LLMs function, the source of the training data, and the nature of the AI's output[8][7]. The court has emphasized the need for accurate briefing and representations from counsel, particularly regarding potential hazards to public health, safety, or well-being[3].
A central aspect of the case involves the discovery of electronically stored information (ESI)[9]. Key points regarding ESI include:
To facilitate the management of ESI, a specific protocol was established that addresses data formats, metadata fields, and redaction[7][9]. A key objective of discovery is to determine whether Anthropic used specific copyrighted materials, such as those in the Books3 dataset, to train its AI models[5]. The court stressed the need for candor in these matters[5].
Several motions and deadlines have been set forth, including a motion to dismiss[7] and a motion for class certification[4]. The court has emphasized that all filings must include the date and time of the hearing or conference[3]. Initially, there was a dispute regarding the order of hearing summary judgment and class certification motions.
Judge Alsup requires plaintiff’s counsel not to engage in any class settlement discussion until after class certification[2].
Judge Alsup also recognizes some form of pre-certification of settlement classes, and acknowledges that there are circumstances in which class members will be better served by class negotiations before certification[2].
In any such circumstances, counsel may apply to be “interim counsel,” and ask for express authorization to negotiate on behalf of a specified putative class[2].
The COVID-19 pandemic is no excuse to waive any local, federal, or court rules[3].
As of August 23, 2024, full settlement discussions with respect to the individual claim are permitted at any time[2]. Full settlement discussions on class claims are permitted once those claims are certified or interim counsel is appointed[2].
The court requires both sides to promptly meet and confer and to agree on a protocol for interviewing absent putative class members[2]. In their joint case management statement due at the outset of the case, the parties shall either describe their agreed-upon protocol or explain why no such protocol is necessary in their particular case[2]. It has become a recurring problem in putative class actions that one or both sides may wish to interview absent putative class members regarding the merits of the case, potentially giving rise to conflict-of-interest or other ethical issues[2]. No interviews of absent putative class members may take place unless and until the parties’ proposed protocol is approved or permission is otherwise given[2].