A German stock photographer who asked for his images to be removed from a dataset used to train AI image generators was not only refused by the dataset’s owner but also handed an invoice for $979 for filing an unjustified copyright claim.
The photographer, Robert Kneschke, found out in February, through a site called Have I Been Trained?, that his photographs were being used to train AI. The website allowed him to search LAION-5B, a dataset of more than 5.8 billion images owned by the non-profit Large-scale Artificial Intelligence Open Network (LAION). The dataset has been used by companies like Stability AI, which supported its development, to train AI models that generate images. Kneschke found that “heaps of images” from his portfolio had been included in the dataset, he wrote in a blog post on his website.
After seeing another photographer comment online about asking to have their images removed from the LAION dataset, Kneschke decided to contact the organization about his own work in February. A day later, he wrote on his blog, he received a letter from the organization stating that it was compliant with copyright law and that it was not possible to remove his images.
“Our client only maintains a database that contains links to image files that are publicly available on the Internet. It cannot be ruled out that the database may also contain links to images that you are the author of,” the letter, written by the law firm Heidrich Rechtsanwälte on behalf of LAION and viewed by Motherboard, said. “However, since our client does not save any of the photographs you have complained about, you have no right to deletion. Our client simply does not have any pictures that could be deleted.”
The letter also threatened to pursue claims for damages against Kneschke, saying he had filed an unjustified copyright claim.
At the end of March, Kneschke sent a cease-and-desist request, which Motherboard viewed, asking the LAION team to take his images out of the training dataset and to provide information on the extent to which the works were used, how long they were used, and where the organization had obtained them.
The LAION team responded again that there had been no copyright infringement. “The only act of reproduction that our client could have undertaken was of a temporary nature and is covered by the limitations of both Section 44b UrhG and the more extensive Section 60d UrhG,” LAION’s lawyers wrote in another letter seen by Motherboard. “As already explained to your client, our client does not store any copies of your client’s works that could be deleted or about which information could be provided. Our client only found image files on the Internet for the initial training of a self-learning algorithm using so-called crawlers and briefly recorded and evaluated them to obtain information.”
According to German copyright law, “it is permitted to make reproductions to carry out text and data mining” if they are “lawfully accessible” and deleted afterward. Temporary reproductions are also allowed if the sole purpose is to enable “a lawful use” of a work and the reproduction has “no independent economic significance.”
“It might be true that LAION does not currently store images itself,” Kneschke told Motherboard, noting that the LAION whitepaper refers to downloading images from URLs, “so copyright infringement might have taken place.”
The legal letter contained an invoice for the equivalent of $979, billed in euros, and demanded that Kneschke pay the amount within 14 days or the firm would take legal action. The amount was owed for filing an unjustified copyright claim, the lawyers argued, saying that Kneschke had to pay because their client had now incurred legal fees to deal with the matter.
“In a letter dated February 14, 2023, we had already pointed out to your client that our client is entitled to claims for damages in accordance with Section 97a (4) UrhG in the event of an unjustified claim,” LAION’s lawyers wrote. “At the time, our client had refrained from asserting this claim, but now feels unable to be lenient here. She incurred legal fees for defending against the obviously unjustified warning you issued, which our client will not bear herself.”
“If I, as the author, ask for my images to be removed from the training data and to fulfill my legal right to information, I am supposed to pay damages to them? That feels like adding insult to injury,” Kneschke told PetaPixel.
The legality of using copyrighted material to train AI remains contentious, and there has not yet been a precedent-setting case to determine the validity of either side’s arguments.
“AI is already widely and commonly used and I have no expectations that my case will change that. However, I have hope that we can take part in shaping the legal framework for this new technology to not only cover the AI usage, but also the AI training in respect of the many artists that have their works used without their knowledge and/or permission. I hope that it will become common grounds that artists will get a cut if their data is used in AI training to compensate for lost income due to advancing AI technology,” he told Motherboard.
As of Thursday, Kneschke said that he had filed a lawsuit against LAION at the Hamburg regional court in Germany.
Getty Images is currently suing Stability AI for its role in supporting the development of LAION, whose dataset includes more than 12 million photographs from Getty’s collection. Many artists have also been vocal in opposing the use of their images to train AI. Karla Ortiz, an artist and board member of the Concept Art Association, an advocacy organization for artists, is leading a fundraising campaign to hire a lobbyist in Washington, D.C. Ortiz and the organization hope to change US copyright, data privacy, and labor laws to protect artists’ intellectual property.
Artists should theoretically already be able to remove their work from the training dataset, according to Stability AI, which announced in December that it would honor all opt-out requests collected through the Have I Been Trained? website. The FAQ on LAION’s website also states that EU citizens who find their images and other identifiable data in the dataset can file a claim under the GDPR to have their personal data taken down.
Spokespeople for LAION did not respond to a request for comment.