this post was submitted on 05 May 2024
Hacker News
A mirror of Hacker News' best submissions.
This one little trick renders any LLM completely useless!
Lol.
It's a fascinating paper though.
It works in reverse too. You can make any LLM “forget” that it is even able to refuse anything.
Oh for sure, and that was the main point, but I just find LLMs that refuse to do anything at all hilarious.
I wonder how much work it'd be to use this to jailbreak Llama 3. I only started playing with local LLMs recently. The paper isn't exactly a step-by-step guide, but it gives you all the datasets you need and the general procedure. There's a bit of "draw the rest of the owl," but not too much.