Fair use or foul play? The legal tug-of-war over AI-created works

Art: The artistic rendering of the scales of justice amid a cyber-like background was created courtesy of Midjourney.

_____________

If 2023 was the year we were surprised by the entrance of “artificial intelligence” into our lexicon, 2024 will be the year we begin to live with AI in earnest. The technology will move from a singular buzzy app in the cultural zeitgeist — think ChatGPT — to a force that undergirds much of the software we already use. Yet even as the technology blends into everyday products and services, society at large will need to wrestle with several fundamental shifts resulting from an increasingly automated world — redefining the nature of knowledge work and how we interact with creative output.

The AI systems capable of handling a multiplicity of tasks, from writing a haiku to creating an itinerary for a weeklong trip to Portugal, are trained on a vast corpus of data — so vast that the state-of-the-art models have essentially already trained on all publicly available data.

As the technology continues to evolve, the question of whether the use of publicly available — and in some cases legally protected — material for training AI models constitutes copyright infringement has been thrust into the spotlight. Artists, authors, and other content creators like Sarah Silverman, John Grisham, and George R.R. Martin have sued developers of AI systems claiming copyright violations. Defenders of the technology, including myself, argue that copyright laws in the United States protect this type of beneficial use under the doctrine of fair use — a legal argument, ironically, that defends Silverman’s ability to create parodies, perform her comedy, and use other people’s material without first obtaining their permission.

Why are the artists upset?

Lawsuits involving AI currently winding their way through the court system mainly fall into three categories: copyright, privacy, and defamation. The copyright cases claim, in essence, that AI systems use a creator’s work without permission. These violations, they argue, occurred in the process of training the AI system and are repeated in every operation of the system (i.e., every droplet in the ocean is represented throughout the ocean, no matter where and at what depth you sample). They go on to argue that the AI model’s mere existence constitutes an infringement because the model itself is what copyright law calls a derivative work. Copyright experts, technology lawyers like myself, and presiding judges have responded skeptically to this argument, and defendants have succeeded in dismissing the bulk of plaintiffs’ claims in two high-profile lawsuits filed this year.

The privacy lawsuits claim that by scraping the internet for training data, the AI system violates plaintiffs’ privacy rights. Historically, claims that using publicly published and freely available information is a violation of privacy have not fared well because the information is, well, public.

And last, the defamation case involves a radio talk show host upset that ChatGPT implied that the host participated in embezzling funds from a nonprofit. That case is currently pending before a court in Georgia.

What is copyright?

Some creatives, like digital artist Greg Rutkowski, have objected that AI models generate work that copies their “style.” The problem is, style alone is generally not protectable. Copyright law in the United States is rooted in the Constitution and protects original works of authorship by granting exclusive rights to creators, such as the right to reproduce, distribute, perform, and display the work. You cannot, however, protect facts, ideas, concepts, or style.

To be ripe for protection, a work must be “fixed in any tangible medium of expression,” meaning it must be written or recorded on something that can be shown to others. The Founders believed those rights were important to recognize and safeguard in order to promote knowledge and learning. But even before they drafted the Constitution, courts had begun to recognize that certain unauthorized reproductions should not amount to an infringement of an author’s rights. That delicate balance, which underpins democratic principles like free speech, was codified in the Copyright Act of 1976. Known as the fair use doctrine, the concept strikes a balance between the rights of the copyright holder and the public interest.

How does fair use affect AI cases?

Generally speaking, AI copyright cases involve two types of claims: “input” and “output.” The arguments that touch on training an AI model on preexisting works are input claims. The other arguments implicate the output of AI models — for example, when a prompt for, say, “a layer cake made out of a stratigraphic cross-section of the Sonoran Desert” yields something too similar to the work of a photographer.

And while generative AI is new, aggrieved artists claiming unfair copying are not. Creators have always drawn inspiration from other creators (which explains why there was more than one impressionist painter). But courts are frequently asked to adjudicate the line between inspiration and infringement. Earlier this year, we saw Ed Sheeran successfully defend a copyright lawsuit filed by the heirs of Ed Townsend, co-writer of Marvin Gaye’s 1973 hit “Let’s Get It On,” who claimed that Sheeran had been a little too inspired by the song.

When you think about copyright infringement, you’re likely thinking about an output claim: Yours looks (or sounds) too much like mine. A court will compare the original work and the accused work to determine whether there is a “substantial similarity” between the two. These are painstaking, case-by-case assessments that are subject to a number of defenses unless plaintiffs can show that the AI models are inherently infringing machines — a near-impossible task given the well-established principles espoused by the US Supreme Court in the Sony “Betamax” case. (That opinion famously determined Sony’s videotape recording technology was capable of “substantial non-infringing uses.”)

Input claims, by contrast, ultimately boil down to one question: Does the process of training AI models with publicly available data amount to copyright infringement, or is that use protected by the fair use doctrine? In evaluating whether fair use applies, courts weigh four factors: the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the portion used, and the effect on the market for or value of the work. Here, the nature of the technology itself favors AI developers, since the goal of training a foundation model is neither to infringe the originals nor to replace the market for them. Instead, the training process is designed to create a computer-based reasoning engine by identifying patterns in our written texts and other digital content. In fact, if operating correctly, machine learning models will not replicate or distribute the copyrighted content in its original form.

The input-claim analysis involves not only what works were used to train the AI systems, but also what elements of the works the AI systems used. While AI training can involve massive datasets (and in some cases, it digests the work in its entirety), the elements it focuses on — known as the factual and functional elements of the work — are not protectable under copyright law. Instead, copyright protects only the creative and unique elements of a work. Historically, courts have found that building systems that conduct semantic analysis, enhance search functionality, or train plagiarism detection software can be a fair use, even when copying entire books and articles in the process.

Critics argue that using copyrighted material without compensation risks diminishing the value of creative work and reduces incentives to create. This is overblown. In a world where content continues to proliferate — and it has exploded — human creatives and curators as arbiters of taste become more valuable, not less. After all, with the advent of digital cameras and smartphones, the volume of photographs is exponentially greater, but the market for fine art film photography hasn’t softened. What’s more, this argument ignores the potential benefits AI offers creatives as a tool. Alexander Reben and Sougwen Chung are just two artists who’ve successfully embraced the tools, mixing technology with tradition, while urging fellow artists to engage with AI rather than compete against it.

Fair use isn’t just a legal technicality — it’s a crucial principle that fuels our ability to learn, create, and innovate. It allows educators to use copyrighted materials in classrooms, empowers artists to experiment with existing works, sparking fresh perspectives and discourse, and allows us all to contribute to the societal commons.

Ultimately, courts will decide whether to break decades of legal precedent to find in favor of the artists and authors. But regardless of how the courts rule, Congress could (theoretically) amend the Copyright Act to carve out AI from fair use. That would be misguided. Not only would it concede AI supremacy to global rivals like China (because it would place an added burden on training AI systems in the United States), but it would also be an affront to the advancement of science, which, according to the Constitution, is a guiding principle of copyright.

Amir R. Ghavi is a partner at Fried Frank LLP. He represents AI foundation model developers, including defending against multiple copyright litigations filed by artists and content licensors.
