Can Copyright Law Save Writers from Extinction?
The most successful writers of our time fear they will not be able to compete in the literary marketplace.[1] In what certainly feels like crisis mode, four proposed class action lawsuits have been filed in the past six months against OpenAI by writers including John Grisham, Ta-Nehisi Coates, Junot Diaz, Jodi Picoult, Elin Hilderbrand, Jonathan Franzen, Michael Chabon, George Saunders, and George R.R. Martin.[2]
In one representative complaint, the authors explain their position:
Plaintiffs seek to represent a class of professional fiction writers whose works spring from their own minds and their creative literary expression. These authors’ livelihoods derive from the works they create. But Defendants’ [AI systems] endanger fiction writers’ ability to make a living, in that the [systems] allow anyone to generate—automatically and freely . . . texts that they would otherwise pay writers to create.[3]
In other words, the authors appear to be searching for a legal theory to explain why it’s wrong for generative AI to make their professions obsolete.[4] At least two legal theories emerge from the authors’ complaints.
Theory One: OpenAI Wronged Authors by Copying Their Novels
One legal theory argues that the AI companies infringed plaintiffs’ copyrights by copying their works without authorization in order to train the AI models.[5] This theory is common to each of the proposed class actions and is the most straightforward: the authors never gave OpenAI permission to copy their novels, yet OpenAI copied them and fed them into its system anyway.[6]
Proving that OpenAI did this also seems straightforward: “when ChatGPT is prompted,” the Chabon complaint alleges, “it generates not only summaries, but in-depth analyses of the themes present in Plaintiffs’ copyrighted works, which is only possible if the underlying GPT model was trained using Plaintiffs’ works.”[7]
The problem with this theory is that many of the outputs plaintiffs cite—including summaries and analyses—very likely fall under fair use.[8] In Authors Guild v. Google, the Second Circuit held that copying books in order to upload them into a database where users could freely search for verbatim excerpts fell under fair use because the use was sufficiently transformative.[9] If copying a book in order to provide users with a verbatim excerpt is fair use, it is hard to see how OpenAI’s copying of a book in order to provide a summary or analysis would not be.[10]
Theory Two: ChatGPT’s Outputs Harm Authors
Another legal theory would distinguish between different outputs. For example, this theory might concede that summaries or analyses fall under fair use, while arguing that outputs that use copyrighted characters,[11] create sequels to copyrighted works,[12] or create stories “in the style of” an author do not.[13] Plaintiffs are not conceding that any outputs fall under fair use at this point in the litigation, but one motion to dismiss filed by OpenAI indicates that they will have to narrow their claims,[14] and the Authors Guild’s reference to Warhol v. Goldsmith in an open letter cited in their complaint suggests that they may strategically differentiate between particular outputs.[15]
This theory has problems too, though. For a work to infringe, it must be “substantially similar to” the copyrighted work.[16] Most of ChatGPT’s outputs will not be “substantially similar to” the copyrighted works—at least in the traditional sense. For example, a short story that is unquestionably “in the style of” Junot Diaz is not considered “substantially similar” to Junot Diaz’s work, and thus would not be held to infringe. Indeed, typically an author’s “style” cannot be copyrighted at all.[17]
But by addressing specific outputs, this theory could sidestep the Authors Guild v. Google problem and allow for more flexible, creative arguments. On October 30th, a judge in the Northern District of California gave a visual artist leave to amend her claim that outputs produced by an AI system trained on her works were derivative.[18] While the judge was clearly skeptical that works lacking substantial similarity could be derivative, he specifically left room for the plaintiff to clarify the portion of her claim relating to outputs that “can be so similar to plaintiff’s styles or artistic identities to be misconstrued as ‘fakes.’”[19]
On top of greater flexibility and creativity, focusing on the outputs more accurately addresses authors’ fears. The existential threat that ChatGPT presents does not lie in its ability to summarize the Game of Thrones series—it lies in its ability to effortlessly output a million series stylistically, thematically, and technically similar to Game of Thrones.
This theory could also be more practical in terms of the remedies it affords the authors: whereas AI companies cannot undo what they have already done in training their models on copyrighted material, they seem to have the capacity to prohibit the models from responding to certain prompts.[20]
But even in the impossible scenario where OpenAI not only prohibits models from responding to certain prompts, but also magically erases all copyrighted works from its models, would these authors—the greatest, most prolific authors of our time—be able to compete with a system that can generate in seconds more literature than all these authors combined could possibly compose in a lifetime?
Footnotes