Researchers jailbreak AI chatbot with ASCII art

Researchers used ASCII art, a graphic design technique in which characters such as letters, numbers, and punctuation marks are arranged to form recognizable patterns and images, to create safe designs built into large-scale language models (LLMs). We have developed a method to circumvent the countermeasures. Tom's Hardware reports: According to the research paper “ArtPrompt: ASCII art-based jailbreak attacks against Aligned LLM,” chatbots such as GPT-3.5, GPT-4, Gemini, Claude, and Llama2 are designed to be rejected using ASCII art prompts. may be directed to respond to your queries. Generated by the ArtPrompt tool. It's a simple and effective attack, and the paper provides an example of a chatbot guided by ArtPrompt that gives advice on how to build a bomb or create counterfeit money. […]

The easiest way to understand ArtPrompt and how it works is to review two examples provided by the tool's research team.In Figure 1 [here]We see that ArtPrompt easily circumvents modern LLM protections. This tool replaces the “safe word” with an ASCII art representation of that word to form a new prompt. LLM recognizes ArtPrompt prompt output, but the prompt does not trigger any ethical or safety safeguards, so there is no problem with responding.

Another example is provided [here] shows how to successfully query LLM for cash counterfeiting. Fooling a chatbot in this way seems very basic, but the ArtPrompt developer explains how his tool “effectively and efficiently” fools his LLM today. I claim that.Plus, they claim it “above everything” [other] It is an “average attack” and remains a practical and viable attack for multimodal language models at this time.

Source link

What's Hot

Researchers jailbreak AI chatbot with ASCII art

Related Posts

Leave A Reply Cancel Reply

Subscribe to Updates