July 13, 2023  •  4 min read •  By Marty Swant

On Wednesday, U.S. lawmakers dug deeper into the implications of generative artificial intelligence for copyright protections, highlighting both short-term and long-term questions about how tech companies collect data to train AI and generate content based on it.

The hearing, the third in a series held by the Senate Judiciary Committee, follows previous discussions in May and June, which focused on other aspects of AI and IP such as copyright law, innovation and competition.

One of the key areas of discussion focused on whether companies should be required to let users opt out of having their data used for training AI models. Ben Brooks, head of public policy for Stability AI, said the startup has already received more than 160 million opt-out requests from people who don’t want their images used in its AI models.

However, when Sen. Chris Coons asked if Stability AI pays people who allow the company to use their data, Brooks avoided a direct answer, choosing instead to point out that it’s important to have a large data set.

“To make that workable, it’s important to have that diversity,” Brooks said.

The discussion also comes as tech companies face new legal challenges. This week, a class action lawsuit filed against Google claimed the tech giant violated state and federal copyright and privacy laws when developing its AI products. The complaint comes less than a month after OpenAI was hit with a separate but similar lawsuit from the same law firm. Earlier this year, Stability AI was sued by Getty Images, which claimed the startup wrongly used millions of its images to train its AI models.

Executives from AI companies who testified said their AI models are trained on content whose use they consider “fair use.” However, Sen. Marsha Blackburn said “fair use” protections have become a “fairly useful way to steal” intellectual property. Another committee member, Sen. Amy Klobuchar, suggested lawmakers should consider banning AI-generated content for some uses and cited concerns about how AI might be used to create misinformation around political ads. (It’s worth noting the two senators are opposed on virtually all other issues.)

AI experts say it’s difficult to remove content from large language models after it’s already been used as training data. It’s also hard for people to opt out of sharing data unless they already know what’s been opted in, noted Jeffrey Harleston, general counsel and evp of business and legal affairs at Universal Music Group. He thinks companies should get consent before using content for training AI, adding that some artists don’t want their music distributed for reasons that go beyond commerciality.

“Creativity is the soundtrack to our lives,” Harleston said. “And without the fundamentals of copyright, we might not have ever known them.”

Matthew Sag, an AI expert at the Emory School of Law, said he was “quite alarmed” that the discussion focused on commercial protections while overlooking regular people who still want protection from deepfakes and other AI content.

One suggestion, proposed by Dana Rao, evp and general counsel at Adobe, was a new “anti-impersonation law” to protect artists from being impersonated by AI. During his opening remarks, Rao, who oversees the company’s legal and policy teams, said the law should include statutory damages so artists aren’t burdened with proving actual damages. (Adobe and other companies are looking to create new standards for generative AI through organizations like the Content Authenticity Initiative, which now has more than 1,500 members including Stability AI, Universal Music Group and Getty Images.)

Representing artists impacted by AI was Karla Ortiz, a concept artist and visual developer whose work includes Black Panther, Guardians of the Galaxy and Doctor Strange. During her testimony, Ortiz recalled being “horrified” after learning how her designs helped train AI models and said she had “never worried about my future as an artist until now.” (Ortiz has sued Stability AI in the past.)

“I found that almost the entirety of my work, the work of almost every artist I know and the work of hundreds of thousands of artists have been taken without our consent, credit or compensation,” she said. “These works were stolen and used to train for-profit technologies with datasets that contain billions of image and text data pairs.”

https://digiday.com/?p=510895