GitHub accused of varying Copilot output to avoid copyright allegations

GitHub is alleged to have tuned its Copilot programming assistant to generate slight variations of ingested training code to prevent output from being flagged as a direct copy of licensed software.

This assertion appeared on Thursday in the amended complaint [PDF] against Microsoft, GitHub, and OpenAI over Copilot’s documented penchant for reproducing developers’ publicly posted, open source licensed code.

The lawsuit, initially filed last November on behalf of four unidentified (“J. Doe”) plaintiffs, claims that Copilot – a code suggestion tool built from OpenAI’s Codex model and commercialized by Microsoft’s GitHub – was trained on publicly posted code in a way that violates copyright law and software licensing requirements and that it presents other people’s code as its own.

Microsoft, GitHub, and OpenAI tried to have the case dismissed, but managed only to shake off some of the claims. The judge left intact the major copyright and licensing issues, and allowed the plaintiffs to refile several other claims with more details.

The amended complaint – now covering eight counts instead of twelve – retains accusations of Digital Millennium Copyright Act violations, breach of contract (open source license violations), unjust enrichment, and unfair competition.

It adds several other allegations in place of those sent back for revision: breach of contract (selling licensed materials in violation of GitHub’s policies), intentional interference with prospective economic relations, and negligent interference with prospective economic relations.

The revised complaint adds one more “J. Doe” plaintiff whose code Copilot has allegedly reproduced. It also includes sample code written by the plaintiffs that Copilot has supposedly reproduced verbatim, though only the court can see it: the code samples have been redacted to prevent the plaintiffs from being identified.

The judge overseeing the case has permitted the plaintiffs to remain anonymous in court filings because of credible threats of violence [PDF] directed at their attorney. The Register understands that the plaintiffs are known to the defendants.

A cunning plan?

Thursday’s legal filing says that in July 2022, in response to public criticism of Copilot, GitHub introduced a user-adjustable Copilot filter called “Suggestions matching public code” that lets users avoid seeing software suggestions that duplicate other people’s work.

“When the filter is enabled, GitHub Copilot checks code suggestions with their surrounding code of about 150 characters against public code on GitHub,” GitHub’s documentation explains. “If there is a match or near match, the suggestion will not be shown to you.”

However, the complaint contends the filter is essentially worthless because it only checks for exact matches and does nothing to detect output that has been slightly modified. In fact, the plaintiffs suggest that GitHub is trying to get away with copyright and license violations by varying Copilot’s output so that it doesn’t appear to have been copied exactly.

“In GitHub’s hands, the propensity for small cosmetic variations in Copilot’s Output is a feature, not a bug,” the amended complaint says. “These small cosmetic variations mean that GitHub can deliver to Copilot customers unlimited modified copies of Licensed Materials without ever triggering Copilot’s verbatim-code filter.”
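The distinction the complaint draws between verbatim and cosmetically altered output can be illustrated with a minimal sketch. This is not GitHub's filter, whose implementation is not public; the function names, the sample snippet, and the use of Python's standard `difflib` for fuzzy comparison are all assumptions made for illustration.

```python
import difflib

# Hypothetical stand-in for a piece of publicly posted code.
PUBLIC_CODE = "def is_even(n):\n    return n % 2 == 0\n"


def exact_match_filter(suggestion: str, corpus: str) -> bool:
    """Return True if the suggestion appears verbatim in the corpus."""
    return suggestion in corpus


def similarity(suggestion: str, corpus: str) -> float:
    """Return a 0..1 similarity score using difflib's sequence matcher."""
    return difflib.SequenceMatcher(None, suggestion, corpus).ratio()


# A verbatim copy, and a copy with one variable renamed.
verbatim = "def is_even(n):\n    return n % 2 == 0\n"
renamed = "def is_even(num):\n    return num % 2 == 0\n"

print(exact_match_filter(verbatim, PUBLIC_CODE))  # True: blocked
print(exact_match_filter(renamed, PUBLIC_CODE))   # False: slips through
print(similarity(renamed, PUBLIC_CODE))           # still close to 1.0
```

In this toy example, renaming a single variable is enough to defeat the exact-match check, while a fuzzy comparison still scores the two snippets as nearly identical. Whether Copilot's filter behaves like the first check or the second is precisely what the complaint and GitHub's documentation disagree about.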

The court filing points out that machine learning models like Copilot have a parameter that controls the extent to which output varies.
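That parameter is commonly known as temperature. As a rough sketch of how it works in language models generally, assuming the standard temperature-scaled softmax rather than anything specific to Copilot, the scores a model assigns to candidate tokens are divided by the temperature before being turned into probabilities. The scores below are made up for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature."""
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.2)   # sharply favors the top token
high = softmax_with_temperature(logits, 2.0)  # spreads probability around

print(low)
print(high)
```

At a low temperature nearly all the probability lands on the top-scoring token, so output is close to deterministic. At a higher temperature the alternatives become more likely, so sampled output varies more from run to run.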

“On information and belief, GitHub has optimized the temperature setting of Copilot to produce small cosmetic variations of the Licensed Materials as often as possible, so that GitHub can deliver code to Copilot users that works the same way as verbatim code, while claiming that Copilot only produces verbatim code one percent of the time,” the amended complaint says. “Copilot is an ingenious method of software piracy.”

In an email, Microsoft’s GitHub insisted otherwise.

“We firmly believe AI will transform the way the world builds software, leading to increased productivity and most importantly, happier developers,” a company spokesperson told The Register. “We are confident that Copilot adheres to applicable laws and we’ve been committed to innovating responsibly with Copilot from the start. We will continue to invest in and advocate for the AI-powered developer experience of the future.”

OpenAI did not respond to a request for comment. ®
