Self-Refine Prompting
- Refining through feedback: Self-Refine enhances Large Language Model outputs by iteratively improving initial results based on model feedback.
- Practical application: It is a simple three-step process: generate output, get feedback, and refine the answer, repeating until the output is satisfactory.
- Performance boost: Self-Refine significantly improves performance on tasks such as code optimization and sentiment analysis, especially for larger models.
What is Self-Refine Prompting?
Large Language Models (LLMs) can solve a wide variety of tasks. Still, they often fall short on intricate requirements: tasks with multiple objectives or tasks with hard-to-define goals. In such cases, the initial output from the LLM may contain inaccuracies or flawed reasoning.
When given a problem, humans come up with an initial draft and then refine iteratively to improve it based on self-provided feedback. For instance, when writing an email for a work colleague, you may first write a direct request such as "Send me the data ASAP". This may look like an okay email to send to friends, but with your work colleague, you may feel the need to be formal. Based on this self-provided feedback, you may re-phrase the email to: "Hi Ashley, could you please send me the data at your earliest convenience?"
Inspired by the human ability to refine a solution, Self-Refine prompting aims to improve the initial outputs from LLMs through iterative feedback and refinement. It is a three-step approach:
- Initial output: Prompt the model to get the initial output.
- Feedback: Pass the prompt and initial output back to the model to get feedback.
- Refinement: Pass the feedback, together with the prompt and the initial output, back to the model to get the refined output.
The feedback and refinement steps repeat until the model output meets a stopping criterion.
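The loop described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the reference implementation: `llm` is a hypothetical callable that takes a prompt string and returns the model's reply, `STOP_PHRASE` is an assumed marker the model is asked to emit when it has no more feedback, and `max_iters` is an assumed safety cap on the number of rounds.

```python
from typing import Callable

# Assumed marker the model is asked to emit when nothing can be improved.
STOP_PHRASE = "NO FURTHER IMPROVEMENTS"


def self_refine(task: str, llm: Callable[[str], str], max_iters: int = 4) -> str:
    """Generate an answer, then alternate feedback and refinement."""
    # Step 1: initial output.
    output = llm(task)
    for _ in range(max_iters):
        # Step 2: the same model critiques its own output.
        feedback = llm(
            f"Task: {task}\n\nAnswer:\n{output}\n\n"
            "Give concrete feedback on this answer. "
            f"If it cannot be improved, reply exactly: {STOP_PHRASE}"
        )
        # Stopping criterion: the model reports nothing left to improve.
        if STOP_PHRASE in feedback.upper():
            break
        # Step 3: the model rewrites the answer using its own feedback.
        output = llm(
            f"Task: {task}\n\nAnswer:\n{output}\n\nFeedback:\n{feedback}\n\n"
            "Rewrite the answer so that it addresses the feedback."
        )
    return output
```

Passing the model in as a plain callable keeps the sketch independent of any particular API client. The iteration cap guards against a model that never emits the stop marker.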
How to Use Self-Refine Prompting?
Step 1: Prompt the model to get the output.
Let's prompt the model to generate Python code to find the greatest number among three given numbers.
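For illustration, a first draft for a prompt such as "Write Python code to find the greatest of three given numbers" might look like the following. Both the prompt wording and the code are assumptions made for this sketch, not actual model output; the function name `greatest_of_three` is hypothetical.

```python
def greatest_of_three(a, b, c):
    """Return the largest of three numbers using explicit comparisons."""
    if a >= b:
        # a beats b, so the answer is either a or c.
        if a >= c:
            return a
        return c
    # b beats a, so the answer is either b or c.
    if b >= c:
        return b
    return c
```

A draft like this is correct but verbose, which leaves room for the feedback step to suggest improvements.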
Step 2: Get feedback. Send the output back to the same model and ask for feedback. Also instruct the model to say so explicitly if the code cannot be improved any further; this response serves as the stopping criterion.
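One way to phrase the feedback request and detect the stopping criterion is sketched below. The prompt wording, the `STOP_PHRASE` marker, and the helper names `build_feedback_prompt` and `should_stop` are all assumptions chosen for illustration.

```python
# Assumed marker the model is told to emit when it has nothing to add.
STOP_PHRASE = "NO FURTHER IMPROVEMENTS"


def build_feedback_prompt(task: str, code: str) -> str:
    """Build a prompt asking the model to review its own code."""
    return (
        f"Task: {task}\n\n"
        f"Code:\n{code}\n\n"
        "Review this code for correctness, readability, and efficiency, "
        "and list specific improvements. "
        f"If the code cannot be improved any further, reply with exactly: {STOP_PHRASE}"
    )


def should_stop(feedback: str) -> bool:
    """Return True when the feedback contains the stop marker (case-insensitive)."""
    return STOP_PHRASE in feedback.upper()
```

Asking for a fixed marker makes the stopping check a simple string test instead of an interpretation of free-form feedback.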
Step 3: Implement feedback. Prompt the LLM to use its feedback and improve the existing code.
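As an illustration, suppose the feedback recommended replacing hand-written comparisons with Python's built-in `max` and adding type hints and a docstring. A refined version might then look like this. The feedback and the code are assumed for the sake of the example and are not actual model output; `greatest_of_three` is a hypothetical name.

```python
def greatest_of_three(a: float, b: float, c: float) -> float:
    """Return the largest of three numbers."""
    # The built-in max() replaces the manual pairwise comparisons.
    return max(a, b, c)
```

For example, `greatest_of_three(4, 9, 2)` returns `9`.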
Step 4: Ask for more feedback.
Once the model reports that no more improvements are necessary, we stop the iteration.
What Are Self-Refine Prompting Results?
According to the original paper, employing Self-Refine boosts model performance and outperforms the previous state of the art across all evaluated tasks. Some notable results include:
- GPT-4's performance increases by 8.7 units for code optimization when augmented using Self-Refine.
- Self-Refine improves the performance in code readability by at least 13.9 units.
- Self-Refine improves the performance in sentiment reversal tasks by at least 21.6 units.
Self-Refine results on various tasks using the latest models from OpenAI
Limitations of Self-Refine Prompting
While Self-Refine shows remarkable performance gains across a variety of tasks, the approach has a few limitations:
- The base model needs to be capable of following user instructions, so less capable LMs may not benefit from this approach.
- The reported results are based on tests performed on English-language datasets.
- Bad actors can use the technique to steer the model into generating toxic or harmful text.
Conclusion
Self-Refine enables LLMs to iteratively refine their own output without the need for labeled data, training, or a separate language model. The technique is simple and can be used across a wide variety of tasks, including code optimization, code readability, math reasoning, and acronym generation.
Bhuwan Bhatt
Bhuwan Bhatt, a Machine Learning Engineer with over 5 years of industry experience, is passionate about solving complex challenges at the intersection of machine learning and Python programming. Bhuwan has contributed his expertise to leading companies, driving innovation in AI/ML projects. Beyond his professional endeavors, Bhuwan is deeply committed to sharing his knowledge and experiences with others in the field. He firmly believes in continuous improvement, striving to grow by 1% each day in both his technical skills and personal development.
Footnotes
- Aman Madaan et al. (2023). Self-Refine: Iterative Refinement with Self-Feedback. https://arxiv.org/abs/2303.17651