- cross-posted to:
- [email protected]
- cross-posted to:
- [email protected]
You must log in or register to comment.
One of 6 described methods :
The model is prompted to explain refusals and rewrite the prompt iteratively until it complies.This is so stupid. You shouldn’t have to “jailbreak” these systems. The information is already out there with a google search.
My own research has made a similar finding. When I am taking the piss and being a random jerk to a chatbot, the bot much more frequently violates their own terms of service. Introducing non-sequitur topics after a few rounds really seems to ‘confuse’ them.