A new tool allows artists to add invisible changes to the pixels in their art before they upload it online so that if it is scraped into an AI training set, it can cause the resulting model to break in chaotic and unpredictable ways. The tool, called Nightshade, is intended to fight against AI companies that use artists’ work to train their models without the creator’s permission. Using it to “poison” this training data could damage future iterations of image-generating AI models, such as DALL-E, Midjourney, and Stable Diffusion, by rendering some of their outputs useless: dogs become cats, cars become cows, and so forth.[1]
AI companies such as OpenAI, Meta, Google, and Stability AI are facing many lawsuits from artists who claim that their copyrighted material and personal information were scraped without consent or compensation. Ben Zhao, a professor at the University of Chicago who led the team that created Nightshade, says the hope is that it will help tip the power balance back from AI companies toward artists by creating a powerful deterrent against disrespecting artists’ copyright and intellectual property. Meta, Google, Stability AI, and OpenAI did not respond to MIT Technology Review’s request for comment on how they might react.
Zhao’s team also developed Glaze, a tool that allows artists to “mask” their style to prevent it from being scraped by AI companies. It works similarly to Nightshade: by changing the pixels of images in subtle ways that are invisible to the human eye but that manipulate machine-learning models into interpreting the image as something different from what it shows.
The team intends to integrate Nightshade into Glaze, and artists can choose whether to use the data-poisoning tool. The team is also making Nightshade open source, which would allow others to tinker with it and make their own versions. The more people use it and make their own versions of it, the more powerful the tool becomes, Zhao says. The data sets for large AI models can consist of billions of images, so the more poisoned images that are scraped into a model’s training data, the more damage the technique will cause.
Nightshade exploits a security vulnerability in generative AI models, one arising from the fact that they are trained on vast amounts of data: in this case, images hoovered up from the internet. Nightshade messes with those images.
Artists who want to upload their work online but do not want their images to be scraped by AI companies can upload them to Glaze and choose to mask them with an art style different from theirs. They can then also opt to use Nightshade. Once AI developers scrape the internet to get more data to tweak an existing AI model or build a new one, these poisoned samples make their way into the model’s data set and cause it to malfunction.
Poisoned data samples can manipulate models into learning, for example, that images of hats are cakes and images of handbags are toasters. The poisoned data is very difficult to remove, because tech companies must painstakingly find and delete each corrupted sample.
The researchers tested the attack on Stable Diffusion’s latest models and an AI model they trained themselves from scratch. When they fed Stable Diffusion just 50 poisoned images of dogs and then prompted it to create pictures of dogs, the output started looking weird—creatures with too many limbs and cartoonish faces. With 300 poisoned samples, an attacker can manipulate Stable Diffusion to generate images of dogs resembling cats.
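The dose effect described above can be illustrated with a toy model. The sketch below is not Nightshade's method and does not reproduce the researchers' results. It assumes a hypothetical "model" that learns the concept "dog" as the average of two-dimensional feature vectors from every image captioned "dog", and it assumes 200 clean dog images, a number chosen only so that the poisoned counts echo those in the article. All names and values are illustrative assumptions.

```python
from statistics import fmean

# Hypothetical 2-D "image features" (assumed values, for illustration only):
# real dog images sit at (0, 0), real cat images at (10, 0).
DOG_FEATURE = (0.0, 0.0)
CAT_FEATURE = (10.0, 0.0)


def learn_concept(samples):
    """Learn a concept as the mean feature vector of all samples captioned with it."""
    return (fmean(x for x, _ in samples), fmean(y for _, y in samples))


def distance(a, b):
    """Euclidean distance between two 2-D points."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


def looks_like(feature):
    """Name the real concept that a learned feature vector is closest to."""
    if distance(feature, DOG_FEATURE) <= distance(feature, CAT_FEATURE):
        return "dog"
    return "cat"


def train_dog_concept(n_clean, n_poisoned):
    """Train on clean dog images plus poisoned ones: captioned "dog", cat-like features."""
    samples = [DOG_FEATURE] * n_clean + [CAT_FEATURE] * n_poisoned
    return learn_concept(samples)


if __name__ == "__main__":
    # 200 clean samples is an arbitrary assumption, not a figure from the research.
    for n_poisoned in (0, 50, 300):
        learned = train_dog_concept(n_clean=200, n_poisoned=n_poisoned)
        print(n_poisoned, learned, looks_like(learned))
```

In this toy setup, 50 poisoned samples pull the learned "dog" concept part of the way toward "cat", and 300 are enough to push it past the midpoint, so the model's idea of a dog ends up closer to a cat. Real text-to-image models are far more complex, and the thresholds here come from the assumed numbers alone.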
Generative AI models are excellent at making connections between words, which helps the poison spread. Nightshade infects not only the word “dog” but all similar concepts, such as “puppy,” “husky,” and “wolf.” The poison attack also works on tangentially related images. For example, if the model scraped a poisoned image for the prompt “fantasy art,” the prompts “dragon” and “a castle in The Lord of the Rings” would similarly be manipulated into something else.
Zhao admits there is a risk that people might abuse the data poisoning technique for malicious uses. However, he says attackers would need thousands of poisoned samples to inflict real damage on larger, more powerful models, as they are trained on billions of data samples. “We don’t yet know of robust defenses against these attacks. We haven’t yet seen poisoning attacks on modern [machine learning] models in the wild. Still, it could be just a matter of time,” says Vitaly Shmatikov, a professor at Cornell University who studies AI model security and was not involved in the research. “The time to work on defenses is now,” Shmatikov adds.
Gautam Kamath, an assistant professor at the University of Waterloo who researches data privacy and robustness in AI models and wasn’t involved in the study, says the work is “fantastic.” The research shows that vulnerabilities “don’t magically go away for these new models, and in fact, only become more serious,” Kamath says. “This is especially true as these models become more powerful and people place more trust in them, since the stakes only rise over time.”
Junfeng Yang, a computer science professor at Columbia University, who has studied the security of deep-learning systems and wasn’t involved in the work, says Nightshade could have a big impact if it makes AI companies respect artists’ rights more, for example, by being more willing to pay out royalties.
AI companies that have developed generative text-to-image models, such as Stability AI and OpenAI, have offered to let artists opt out of having their images used to train future versions of the models. But artists say this is not enough. Eva Toorenent, an illustrator and artist who has used Glaze, says opt-out policies require artists to jump through hoops and leave tech companies with all the power. Toorenent hopes Nightshade will change the status quo. “It is going to make [AI companies] think twice because they have the possibility of destroying their entire model by taking our work without our consent,” she says.
Autumn Beverly, another artist, says tools like Nightshade and Glaze have given her the confidence to post her work online again. She previously removed it from the internet after discovering it had been scraped without her consent into the popular LAION image database. “I’m just really grateful that we have a tool that can help return the power to the artists for their work,” she says.
This article is presented at no charge for educational and informational purposes only.
Red Sky Alliance is a Cyber Threat Analysis and Intelligence Service organization. For questions, comments or assistance, please contact the office directly at 1-844-492-7225, or feedback@redskyalliance.com
Weekly Cyber Intelligence Briefings:
- Reporting: https://www.redskyalliance.org/
- Website: https://www.redskyalliance.com/
- LinkedIn: https://www.linkedin.com/company/64265941
REDSHORTS - Weekly Cyber Intelligence Briefings
https://attendee.gotowebinar.com/register/5993554863383553632
[1] https://www.technologyreview.com/2023/10/23/1082189/data-poisoning-artists-fight-generative-ai/