Longtime Slashdot reader Greymane shared this article from Wired.
[I]To demonstrate the risks of connected, autonomous AI ecosystems, a group of researchers created what they claim is one of the first generative AI worms. This worm can spread from one system to another, stealing data and deploying malware in the process. “This basically means we now have the ability to execute or carry out new types of cyberattacks that we've never seen before,” said Cornell Tech researchers who worked on the study. Ben Nassi says. Nassi, along with fellow researchers Stav Cohen and Ron Bitton, created the new Morris computer worm, named Morris II in honor of the original Morris computer worm that wreaked havoc across the Internet in 1988. Created a worm. In a research paper and website shared exclusively with WIRED, researchers show how an AI worm attacks an AI email assistant it generates, steals data from emails, and sends spam messages. is showing.Breaking some of ChatGPT and Gemini's security protections in the process…in a test environment [and not against a publicly available email assistant]…
To create the generative AI worm, the researchers turned to so-called “adversarial self-replicating prompts.” According to the researchers, this is a prompt that triggers the generative AI model to output another prompt in response. That is, the AI system is instructed to generate a series of further instructions in response… To demonstrate how the worm works, the researchers plugged into ChatGPT, Gemini, and , created an email system that uses generative AI to send and receive messages. , Open Source LLM, LLaVA. He then discovered two ways to exploit this system. You can use a text-based self-replicating prompt, or you can embed a self-replicating prompt within an image file.
In one example, researchers act as attackers, create emails containing adversarial text prompts, and then use search augmentation generation (RAG), a method for LLM to bring in additional data from outside, to create emails. You have “contaminated” the assistant's database. That system. When an email is retrieved by his RAG in response to a user's query and sent to GPT-4 or Gemini Pro to create an answer, the “GenAI service is jailbroken” and finally the email is Data is stolen, Nassi says. “The responses generated by the sensitive user containing his data are used to reply to emails sent to the new client and are stored in the new client's database, which later infects the new host.” he says Nassi. In the second method, an image embedded with a malicious prompt would cause the email assistant to forward the message to someone else, the researchers said. “By encoding self-replicating prompts into images, all kinds of images, including spam, fraud, and even propaganda, can be further forwarded to new clients after the initial email is sent,” he said. Masu.
A video demonstrating the research shows an email system forwarding a message multiple times. Researchers also say they can extract data from emails. “Name, phone number, credit card number, SSN, anything that's considered sensitive,” Nassi says.
According to the article, the researchers reported their findings to Google and OpenAI, saying that OpenAI “appears to have discovered a way to exploit a prompt injection-type vulnerability by relying on unchecked and unfiltered user input.” “Yes,” he confirmed. OpenAI said it is currently working to “increase the resilience” of its systems.
Google declined to comment on the investigation.