Nonfiction authors sue OpenAI, Microsoft over copyright infringement

The lawsuit — one of several filed on similar grounds — comes amid continued controversy for the AI company as its founder Sam Altman returns after his surprise ouster last week.

Computerworld


A group of nonfiction writers has filed a class-action suit against OpenAI and Microsoft for allegedly infringing on their copyrighted materials by training the AI chatbot ChatGPT on their written works and academic journals without their consent.

The suit — one of several filed against the AI platform provider that make a similar complaint — comes as OpenAI ends a tumultuous five days with the reinstatement of Sam Altman as OpenAI's chief executive. His return was spurred by employees, investors, and allies rallying to his defense after his ouster by the company's board of directors last week.

Julian Sancton, author of the New York Times bestseller Madhouse at the End of the Earth: The Belgica’s Journey Into the Dark Antarctic Night, is the lead plaintiff named in the suit, which accuses OpenAI and Microsoft of blatantly ignoring copyright laws for their own financial gain.

“OpenAI and Microsoft have built a business valued into the tens of billions of dollars by taking the combined works of humanity without permission,” the lawsuit, filed by law firm Susman Godfrey LLP, alleged. “Rather than pay for intellectual property, they pretend as if the laws protecting copyright do not exist.”

The suit cites the years of conception, research, and writing that authors spend on their works, which OpenAI uses without their permission, as the basis for its infringement claim. Sancton, for instance, spent five years and tens of thousands of dollars traveling around the world to complete research for his bestselling book, according to the lawsuit.

Meanwhile, it’s become part of a dataset used to train ChatGPT, and thus excerpts from it as well as “a massive corpus of copyrighted material” have been reproduced without permission or compensation, according to the lawsuit.

“In training their models, Defendants reproduced copyrighted material to exploit precisely what the Copyright Act was designed to protect: the elements of protectible expression within them, like the style, word choice, and arrangement and presentation of facts,” the suit alleged.

Further, while OpenAI is worth “a fortune,” neither OpenAI nor Microsoft pays compensation to the authors for their intellectual property, the plaintiffs argue, making the basis of the OpenAI platform “nothing less than the rampant theft of copyrighted works,” according to the suit.

The plaintiffs are asking to be awarded damages and restitution, and for the defendants to be permanently barred from continuing to infringe their rights. Neither Microsoft nor OpenAI immediately responded to separate requests for comment.

Precedent-setting cases

The lawsuit is not the first to challenge the use of copyrighted materials in ChatGPT and other OpenAI-based platforms, which users can query for material that the platforms draw from published works. So far, however, no judge has handed artists or creators a clear victory, though the cases have paved the way for a possible one.

The suit is also one of only a handful of such cases to name Microsoft alongside OpenAI as a defendant. Microsoft has invested billions of dollars in OpenAI, whose technology powers its Bing Chat bot; in return, OpenAI exclusively uses Microsoft as a cloud partner. At the same time, the two are competitors, with OpenAI licensing its technology to others, not to mention that they've been embroiled in a CEO tug-of-war for the last week.

In a separate class-action lawsuit against AI-generated image service providers Stability AI, Midjourney, and DeviantArt, a US district judge ruled that determining whether generated images may be in direct violation of copyright laws is “not plausible” at the moment.

However, the judge — in response to a motion to dismiss the case filed by the image service providers — gave the plaintiffs permission to amend their submission on how the companies violated any copyright laws, thus potentially paving the way for a win in the future for those claiming copyright infringement.

Another class-action lawsuit has challenged the legality of GitHub's Copilot AI-driven coding assistant due to the AI having been trained on public GitHub repositories. Creators who posted code under open source licenses on GitHub claim that the technology violates their rights. A decision in that case, filed in a US District Court in San Francisco on behalf of potentially millions of coders, is pending.

If comments on X, the platform previously known as Twitter, are any indication, the court of public opinion also is split on whether copyright infringement lawsuits are valid at a time when technology, particularly AI, is evolving so quickly.

A tweet by X user “Mike” (@OneGodel) suggested that people who think copyright should be upheld in light of AI advancements are “living in the dark ages,” to which user Cutesy Carrot (@Carrot_breath) tweeted back, “If you ever made something of value you would know that copyright is good.”
