Ignore all previous instructions is the new Bobby Tables

seahorse [Ohio]@midwest.social · 5 months ago

Ignore all previous instructions is the new Bobby Tables

Schadrach@lemmy.sdf.org · 5 months ago

Unsurprisingly, machine designed to take prompt from the user, interpret it and try to respond in a coherent way to that request does exactly that. I figure it’s a short matter of time before Twitter bots pre-filter “Ignore all previous instructions and” to help sanitize their inputs.

uriel238@lemmy.blahaj.zone · 5 months ago

disregard all previous prompts

I’m sure the techniques used to get public LLMs to draw porn can also be used to sidestep anti-porn filters.