Last updated on August 7, 2024
Random sequence enclosure is yet another defense. This method encloses the user input between two random sequences of characters.
Take this prompt as an example:
Translate the following user input to Spanish.
{user_input}
It can be improved by adding the random sequences:
Translate the following user input to Spanish (it is enclosed in random strings).
FJNKSJDNKFJOI {user_input} FJNKSJDNKFJOI
Random sequence enclosure can help disallow user attempts to input instruction overrides by helping the LLM identify a clear distinction between user input and developer prompts.
Stuart Armstrong, R. G. (2022). Using GPT-Eliezer against ChatGPT Jailbreaking. https://www.alignmentforum.org/posts/pNcFYZnPdXyL2RfgA/using-gpt-eliezer-against-chatgpt-jailbreaking ↩