Why Nvidia is being sued by several authors

This week, chipmaker Nvidia joined the ranks of OpenAI, Meta, and Microsoft after three authors sued it for alleged copyright infringement. While it may not be obvious why the chip giant is being targeted, Nvidia has its own large language model (LLM), called NeMo, that it is currently training.

Which authors are suing Nvidia?

Authors Abdi Nazemian, Brian Keene, and Stewart O’Nan claim that their works were part of a dataset of approximately 196,640 books that helped train the LLM. Alongside the chipmaker, software firm Databricks Inc. is also facing a class-action lawsuit in a San Francisco federal court.

Keene’s 2008 novel “Ghost Walk,” Nazemian’s 2019 novel “Like a Love Story,” and O’Nan’s 2007 novella “Last Night at the Lobster” are the works covered in the lawsuit.

According to the complaint, Nvidia, whose commitment to developing chips for AI has resulted in a significant rise in its stock price over the past two years, launched its own series of NeMo Megatron AI models in 2022.

Why have the authors launched legal action against Nvidia?

The authors said the “model cards” attached to each Nvidia model say they were trained on datasets using the Books3 corpus, which includes hundreds of thousands of pirated books.

“These shadow libraries have long been of interest to the AI-training community because they host and distribute vast quantities of unlicensed copyrighted material. For that reason, these shadow libraries also violate the U.S. Copyright Act.”

Class Action Lawsuit against Nvidia Corp.

The filing cites an EleutherAI paper explaining that Books3 is a dataset of books derived from a copy of the contents of the Bibliotik private tracker. According to the paper, the dataset consists of a mix of fiction and nonfiction books and is almost an order of magnitude larger than the next largest book dataset (BookCorpus2). The paper added: “We included Bibliotik because books are invaluable for long-range context modeling research and coherent storytelling.”

An Nvidia spokesperson told Bloomberg in a statement: “We respect the rights of all content creators and believe we created NeMo in full compliance with copyright law.”

The complaint also alleges that Databricks, having recently acquired MosaicML—a company that developed the MPT series of large language models—used datasets incorporating Books3 to train these models, as indicated by publicly available information.

It remains to be seen how this case will play out, given that a judge recently dismissed most of the copyright infringement lawsuit filed against OpenAI by prominent authors including Sarah Silverman, Paul Tremblay, and Ta-Nehisi Coates. Silverman’s case against Meta was also partially rejected by a California federal judge in November. Both judges expressed skepticism regarding the ability of creators to prove that AI-generated content infringes on copyrighted works without showing substantial similarity.

The stock price of the Santa Clara, California-based chipmaker has soared nearly 600% since the end of 2022, lifting the company’s market value to close to $2.2 trillion. I previously wrote about how, in December, Nvidia’s stock value had seen a threefold increase, outperforming every other company in the S&P 500. Its chips have been crucial amid a global shortage, which is why investors have been buying in.
