Prompt Paraphrasing
Introduction
Large language models (LLMs) are trained on extensive text data, making them a rich source of information. However, the way you phrase a question (or prompt) can affect the accuracy of the response, even when the model "knows" the answer. Small changes in wording can make a big difference. For example:
- Prompt 1:
Prompt
Obama is a __
AI Output
good person
- Prompt 2:
Prompt
Obama is a __ by profession
AI Output
politician
Although the two prompts are similar, only the second yields the correct response.
In this document, we'll talk about Prompt Paraphrasing, a technique for generating varied, high-quality prompts tailored to a specific LLM:
- What is Prompt Paraphrasing?
- How to Use Prompt Paraphrasing
- How to Choose the Best Prompt
- Results of Prompt Paraphrasing
- Limitations of Prompt Paraphrasing
What is Prompt Paraphrasing?
Prompt Paraphrasing is a technique used to generate multiple high-quality prompts that retrieve more accurate answers from the model. It takes an initial "seed" prompt and creates several semantically similar versions.
For instance, starting with "x shares a border with y" might yield:
- "x has a common border with y"
- "x adjoins y"
Prompt Paraphrasing uses the LLM itself to generate paraphrased prompts. This way, it leverages the language patterns and templates the model has "learned" best during training.
How to Use Prompt Paraphrasing
A simple paraphrasing method involves back-translation. Here’s how it works:
- Translate the original prompt into one or more other languages.
- Translate these versions back into the original language to create prompts that are different in wording but similar in meaning.
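The two steps above can be sketched in a few lines of Python. Here, `translate` is a hypothetical stand-in for a real LLM or translation-API call; the canned responses below are mock values used only to make the sketch self-contained:

```python
def back_translate(prompt, languages, translate):
    """Generate paraphrase candidates by round-tripping through other languages."""
    candidates = []
    for lang in languages:
        foreign = translate(prompt, lang)                  # step 1: translate out
        candidates.append(translate(foreign, "English"))   # step 2: translate back
    # Deduplicate, and drop round trips that returned the original verbatim —
    # a known failure mode of back-translation.
    return [c for c in dict.fromkeys(candidates) if c != prompt]

# Canned responses standing in for real model calls (hypothetical values).
CANNED = {
    ("Who is the CEO of Apple?", "Spanish"): "¿Quién es el CEO de Apple?",
    ("Who is the CEO of Apple?", "French"): "Qui est le PDG d'Apple ?",
    ("¿Quién es el CEO de Apple?", "English"): "Who is the executive head of Apple?",
    ("Qui est le PDG d'Apple ?", "English"): "Name the current CEO of Apple.",
}

def mock_translate(text, target_language):
    return CANNED[(text, target_language)]

paraphrases = back_translate("Who is the CEO of Apple?", ["Spanish", "French"], mock_translate)
print(paraphrases)
# ['Who is the executive head of Apple?', 'Name the current CEO of Apple.']
```

In practice, swapping `mock_translate` for a real model call is the only change needed; using more intermediate languages yields more candidates.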
Example
If you want to find the CEO of a company, you may start with an initial prompt with the structure "Who is the CEO of company_name?". A concrete example is: Who is the CEO of Apple?
Then, you can generate multiple prompt candidates using translation. Say you want to generate two different prompt candidates.
Step 1: Translate to Other Languages
- Spanish Translation:
Prompt
Convert the following into Spanish: Who is the CEO of Apple?
AI Output
¿Quién es el CEO de Apple?
- French Translation:
Prompt
Convert the following into French: Who is the CEO of Apple?
AI Output
Qui est le PDG d'Apple ?
Step 2: Back-Translate to English
- Spanish to English:
Prompt
Convert the following sentence into English: ¿Quién es el CEO de Apple?
AI Output
Who is the executive head of Apple?
- French to English:
Prompt
Convert the following sentence into English: Qui est le PDG d'Apple ?
AI Output
Name the current CEO of Apple.
After paraphrasing, you now have a set of prompts:
- Who is the CEO of x?
- Who is the executive head of x?
- Name the current CEO of x.
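Each candidate generalizes to a template with a slot for the entity. A minimal sketch, using `{x}` as our own placeholder convention:

```python
# The paraphrased candidates, written as templates with an entity slot.
TEMPLATES = [
    "Who is the CEO of {x}?",
    "Who is the executive head of {x}?",
    "Name the current CEO of {x}.",
]

def instantiate(templates, entity):
    """Fill the entity slot in every candidate prompt."""
    return [t.format(x=entity) for t in templates]

print(instantiate(TEMPLATES, "Microsoft"))
# ['Who is the CEO of Microsoft?', 'Who is the executive head of Microsoft?',
#  'Name the current CEO of Microsoft.']
```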
How to Choose the Best Prompt
After generating several prompts, how do you select the best one? There are two primary approaches:
- Top-1 Prompt Selection: In this approach, prompts are tested on a training set. Each prompt's accuracy is calculated, and the one with the highest accuracy is chosen for inference.
- Ensemble: Here, all candidate prompts are used simultaneously. Responses are combined using techniques like majority voting (for classification tasks) or averaging (for regression tasks) to produce a final answer.
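Both strategies are simple to implement. In the sketch below, the per-prompt accuracies and model answers are made-up placeholders, not real benchmark numbers:

```python
from collections import Counter

def top1_select(prompt_accuracy):
    """Top-1 selection: pick the prompt with the highest training-set accuracy."""
    return max(prompt_accuracy, key=prompt_accuracy.get)

def ensemble_vote(answers):
    """Ensemble via majority vote over the answers from all candidate prompts."""
    return Counter(answers).most_common(1)[0][0]

# Hypothetical training-set accuracies for three candidate prompts.
accuracies = {
    "Who is the CEO of {x}?": 0.71,
    "Who is the executive head of {x}?": 0.64,
    "Name the current CEO of {x}.": 0.78,
}
print(top1_select(accuracies))   # 'Name the current CEO of {x}.'

# Hypothetical answers produced by querying the model with each prompt.
print(ensemble_vote(["Tim Cook", "Tim Cook", "Timothy Cook"]))  # 'Tim Cook'
```

For regression-style outputs, `ensemble_vote` would be replaced by an average of the numeric predictions.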
Results of Prompt Paraphrasing
Studies show that paraphrasing enhances model performance. For example:
- A manual prompt yields 22.8% accuracy with the BERT-base model.
- Paraphrased prompts improve performance across various selections and models.
| Prompt | Model | Top-1 (%) | Top-3 (%) | Top-5 (%) |
|---|---|---|---|---|
| Manual | BERT-base | 22.8 | - | - |
| Paraphrased | BERT-base | 22.8 | 23.8 | 24.6 |
| Manual | BERT-large | 25.7 | - | - |
| Paraphrased | BERT-large | 25.9 | 27.8 | 28.3 |
Limitations of Prompt Paraphrasing
- Computational Cost: Generating, evaluating, and (with ensembles) querying multiple candidate prompts requires many more model calls than using a single prompt.
- Translation Limits: Back-translation may return the original prompt unchanged, adding no variety to the candidate set.
Conclusion
Large language models contain vast amounts of knowledge, but phrasing matters. By using prompt paraphrasing, you can create prompts that maximize the retrieval of relevant information, improving model responses for specific factual queries.
Bhuwan Bhatt
Bhuwan Bhatt, a Machine Learning Engineer with over 5 years of industry experience, is passionate about solving complex challenges at the intersection of machine learning and Python programming. Bhuwan has contributed his expertise to leading companies, driving innovation in AI/ML projects. Beyond his professional endeavors, Bhuwan is deeply committed to sharing his knowledge and experiences with others in the field. He firmly believes in continuous improvement, striving to grow by 1% each day in both his technical skills and personal development.
Footnotes
1. Jiang, Z., Xu, F. F., Araki, J., & Neubig, G. (2019). How Can We Know What Language Models Know? https://arxiv.org/abs/1911.12543