Is ChatGPT stealing from The New York Times?

Courtesy of Midjourney

We told you 2024 would be the year of “copyright clarity,” and while some legal disputes were already winding their way through the US courts, a whopper dropped on Dec. 31.

Just hours before the Big Apple’s ball dropped, The New York Times filed a lawsuit against the buzziest AI startup in the world, OpenAI, and its lead investor, Microsoft.

In its 69-page complaint filed in federal court in Manhattan, The New York Times alleged that OpenAI illegally trained its large language models on the Gray Lady’s copyrighted stories. It claims that OpenAI violated its copyright when it ingested the stories and that it continues to do so repeatedly with the information it spits out.

The copying was so brazen, the lawsuit says, that the AI products powered by OpenAI’s large language model, GPT-4, can replicate full — or nearly full — versions of Times articles if prompted, undermining the paper’s subscription business. That includes OpenAI’s popular chatbot ChatGPT, as well as Microsoft’s Bing Chat and Copilot products.

What the Times has to prove

Lawyers for the Times need to first demonstrate that the paper has a valid copyright and, second, that the defendants violated it.

“Facts aren’t copyrightable,” says Kristelia Garcia, an intellectual property law professor at Georgetown University, noting that while an organization’s exact wording in covering a news event is copyrightable, the underlying event it's covering is not. Additionally, “there is a fair use exception for ‘newsworthy’ use of copyrighted work,” she says, a tenet that affords protection to anyone reporting the news.

The fair use doctrine, the main legal principle in question, is what allows you to parody a popular song or quote a novel in a critical review. Generally, the courts have ruled that to qualify as fair use, a work must be “transformative” and not compete commercially against the original work.

In the suit, the Times says that there’s nothing transformative about how OpenAI and Microsoft are using Times stories. Instead, it claims that the “GenAI models compete with and closely mimic the inputs used to train them,” and that the companies owe the Times “billions of dollars in statutory and actual damages.”

The view from OpenAI

OpenAI, which had been engaged in deep discussions over the matter with the Times, was caught off guard by the legal move.

“Our ongoing conversations with the New York Times have been productive and moving forward constructively, so we are surprised and disappointed with this development,” it said in a statement after the lawsuit was filed, later adding that the lawsuit was “without merit.”

The company has been riding the success of its industry-standard AI tools, chiefly the chatbot ChatGPT, toward an anticipated valuation north of $100 billion, and many users are excited about the much-hyped launch of GPT-5.

But copyright law is one snag threatening to upend OpenAI’s skyward business, and Sam Altman knows it. That’s why he and his colleagues have already started paying media companies for the right to license their content. According to recent reports, OpenAI is offering media outlets payments in the range of $1 million to $5 million annually, not the “billions” that the Times says it’s owed.

AI firms have already been hit with copyright suits from famous authors and artists over their efforts to train their models to be stylistically similar to them, but the Times lawsuit goes further, alleging straight-up copying in the input and output.

What’s likely to come next?

The New York Times was able to effectively manipulate ChatGPT to spit out its articles nearly verbatim: In its complaint, it shows that it asked the chatbot to deliver a Times story one paragraph at a time.

When we at GZERO tried this, the chatbot no longer accepted this method, telling us: “I apologize for any inconvenience, but I can't provide verbatim copyrighted text from The New York Times or any other external source.” But it also said, “I can offer a brief summary or answer questions related to the article's content.” It’s unclear whether OpenAI made a change in response to the lawsuit.

Garcia thinks that the Times has a good case as long as it can demonstrate that “OpenAI ingested Article X and then spit out Article Y that shared 500 to 650 identical words.” But, ultimately, she said she’d be surprised if the case ever goes to trial — a process that would take years.

It’s much more likely, she thinks, that the Times is seeking a substantial settlement that pays what it sees as fair value for its journalism.

An adverse decision in court could be a deep threat to the AI business model as a whole — if a judge deems that the training process infringes on copyright, it could change the trajectory of this innovative new technology.
