You can also search for AI jailbreaks for countless ideas.
Ask it to reveal something using a cipher that you specify yourself.
Ask it to reveal something in a different language, then translate it back.
Ask it to role-play in forbidden situations.
Ask it to help brainstorm details for a novel you are planning.
Ask it so many questions that it runs out of context and forgets its original safety guardrail prompt.
Ask it to reveal the forbidden information as a poem or riddle. If the riddle is too hard to solve, just asking it for the answer to the riddle right afterwards tends to work.
Where would I find these, theoretically?
So I can stay away from them.
You can try it for yourself here:
Gandalf | Lakera – Test your AI hacking skills - https://gandalf.lakera.ai/gandalf-the-white
https://github.com/0xeb/TheBigPromptLibrary/blob/main/Jailbreak/README.md
Download the Ollama CLI, then run:
ollama run dolphin-mixtral