Getty Images Sues Stability AI: A Turning Point in AI Copyright
Getty Images filed suit against artificial intelligence company Stability AI for its “brazen infringement”[1] of more than 12 million of Getty’s photographs used to train Stability’s text-to-image tool Stable Diffusion.[2] The stock image site seeks up to $150,000 per infringement for these 12 million photos, a staggering total of about $1.8 trillion.[3]
The complaint, filed in Delaware federal court on February 3, 2023, follows Getty’s suit against Stability AI in the UK, as well as a class-action lawsuit against Stability AI and two other AI generators filed in the Northern District of California.[4]
As AI generators continue their meteoric rise, the legal landscape surrounding them remains undefined.[5] Online artist communities have expressed their discomfort with the growth of AI-generated art,[6] and IP attorneys have been preparing for a wave of litigation.[7] This suit, however, is the first substantive legal challenge to AI art generators.
Stability AI uses copyrighted images and their associated metadata to train Stable Diffusion to produce photorealistic images when given a text command.[8] Getty argues that Stability AI infringed its copyrights by using Getty’s images to train Stable Diffusion without a license.[9] Additionally, Getty claims that Stable Diffusion often generates low-quality images that contain a modified version of the Getty Images watermark, creating consumer confusion and infringing Getty’s trademarks.[10]
An important, unresolved question at the core of this lawsuit is whether copyright-protected data can be used to train AI models.[11]
AI firms have argued that this practice is covered by fair use, the doctrine that allows certain unlicensed uses of copyrighted works.[12] When determining whether something qualifies as fair use, courts consider four main factors, two of which are particularly significant in this context.[13] First, if the material is significantly changed (a “transformative” use), that factor weighs in favor of fair use.[14] Second, courts consider whether the use threatens the economic viability of the original creator by entering into competition with them.[15]
The prominence of these factors could mean that using vast amounts of copyrighted inputs to train AI is covered by fair use.[16] Training an AI model requires millions of images; the training datasets for AI tools are so large that tools exist to check whether your images are already in one.[17] On this view, the copyrighted training data has been “transformed” into novel pictures and is therefore covered by fair use.
However, AI firms may use their training systems to generate new content that may not be covered by fair use.[18] Daniel Gervais, a Vanderbilt Law professor who writes extensively at the intersection of intellectual property and AI, explains this concept with a hypothetical: “If you give an AI 10 Stephen King novels and say, ‘Produce a Stephen King Novel,’ then you’re directly competing with Stephen. Would that be fair use? Probably not,” says Gervais.[19]
Although Stable Diffusion is trained on millions of images, its retention of the Getty watermark on some of its outputs demonstrates that the training data has not been significantly transformed, despite the occasionally “grotesque” final images.[20] Additionally, Stability AI markets Stable Diffusion’s output to those “seeking creative imagery,” and so competes directly with Getty Images.[21] The extent to which Getty’s data was transformed, as well as the extent to which Stability and Getty compete, will likely be central to this case.[22]
How can AI firms avoid litigation? The most obvious solution, and the one Getty is seeking, is for AI firms to license the data they use for training.[23] Craig Peters, Getty’s CEO, stated that the company was not interested in halting the development of AI art, but in creating a new legal status quo, one that facilitates negotiation between intellectual property rights owners and those who wish to use their data.[24] For Peters, the legal state of AI art is analogous to the genesis of digital music.[25] Peters envisions licensing agreements between rights holders and AI art generators that mirror those of music services like Napster and Spotify.[26]
Others, such as Thomas Magnani, partner and head of the technology transactions practice at Arnold & Porter, argue that licensing agreements modeled after Spotify are not only prohibitively expensive but also not directly applicable to a generator like Stable Diffusion.[27] Spotify’s output is the artist’s own work, but Stable Diffusion’s output is not the underlying image; it is a new work that cannot be traced back to the copyrighted images on which the AI was trained.[28] Tracking down the copyrighted originals for payment becomes increasingly difficult as the AI tool becomes more sophisticated.[29] Magnani points to another licensing model: a flat subscription fee for access to all of Getty’s data.[30] Copyright owners could then opt out of training datasets, drastically reducing the size of those datasets.[31]
For now, these models are hypothetical. Legal experts, AI generators, and artists will be following this case to see whether these models are needed and, if so, which one will dominate the world of AI art.
Footnotes