🟢 Introduction to Self-Criticism Prompting Techniques
Welcome to the Self-Criticism Prompting section of the advanced Prompt Engineering Guide.
When working with Large Language Models (LLMs), a common problem is making sure their responses are both accurate and reliable. One approach is to prompt LLMs to critique their own outputs, a technique that helps models catch errors and refine their responses.
In this section, we give an overview of Self-Criticism prompting techniques designed to improve the model's performance through self-assessment, iterative reasoning, and error detection.
Here’s what you’ll explore:
- Self-Calibration prompts LLMs to evaluate their own responses, helping them spot mistakes and reduce false positives and negatives.
- Self-Refine lets LLMs iteratively improve their initial answers, step by step, to enhance both accuracy and quality.
- Reversing Chain-of-Thought (RCoT) helps models detect hallucinations by comparing the original problem with a newly reconstructed version.
- Self-Verification improves accuracy by generating multiple solutions and testing them against hidden portions of the original question.
- Chain-of-Verification (CoVe) refines responses by having the model ask and answer verification questions to critique and improve its output.
- Cumulative Reasoning (CR) breaks down complex tasks into smaller steps, refining each one until a solid solution is reached.
Let’s dive into these techniques and see how they can transform your LLM outputs!
Self-Calibration
One challenge with LLMs is that they can deliver both correct and incorrect answers with the same level of confidence, making it hard to know which ones to trust. Self-Calibration addresses this by prompting the model to assess its own output after generating a response. This helps the LLM spot mistakes and reduces the chance of false positives or negatives.
On the Self-Calibration page, you'll learn how to implement Self-Calibration.
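As a rough illustration of the answer-then-assess flow described above, here is a minimal Python sketch. The function name `self_calibrate`, the prompt wording, and the `fake_llm` stand-in are all assumptions made for this example; in practice `llm` would wrap a call to whichever model you use.

```python
from typing import Callable


def self_calibrate(question: str, llm: Callable[[str], str]) -> dict:
    """Answer a question, then ask the same model to judge its own answer."""
    # Step 1: get an initial answer.
    answer = llm(f"Question: {question}\nAnswer:").strip()

    # Step 2: show the model its own answer and ask for a True/False judgment.
    check_prompt = (
        f"Question: {question}\n"
        f"Proposed answer: {answer}\n"
        "Is the proposed answer correct? Reply with True or False."
    )
    verdict = llm(check_prompt).strip().lower()
    return {"answer": answer, "model_says_correct": verdict.startswith("true")}


# Hypothetical stand-in for a real model call, so the sketch runs offline.
def fake_llm(prompt: str) -> str:
    if "Is the proposed answer correct?" in prompt:
        return "True"
    return "Paris"


result = self_calibrate("What is the capital of France?", fake_llm)
print(result)
```

The second call gives you a signal you can act on, for example by flagging answers the model itself judges to be wrong for human review.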
Self-Refine
Self-Refine works a bit like how humans approach tasks: we create a rough draft, then improve it by reviewing and refining. In this technique, LLMs follow a similar process. They generate an initial output, then iteratively refine it step by step, boosting both accuracy and quality as they go.
On the Self-Refine page, you'll learn how Self-Refine works and its 3 key steps.
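The draft-then-improve loop described above could be sketched as follows. This is an illustrative outline only: the separate feedback prompt, the `DONE` stop word, the `max_rounds` limit, and the `scripted_llm` stand-in are assumptions chosen for the example, not details taken from this guide.

```python
from typing import Callable


def self_refine(task: str, llm: Callable[[str], str], max_rounds: int = 3) -> str:
    """Generate a draft, then alternate feedback and rewriting until done."""
    draft = llm(f"Task: {task}\nWrite a first draft.")
    for _ in range(max_rounds):
        # Ask the model to critique its own draft.
        feedback = llm(
            f"Task: {task}\nDraft: {draft}\n"
            "Give feedback, or reply DONE if no changes are needed."
        )
        if feedback.strip() == "DONE":
            break
        # Ask the model to rewrite the draft using that critique.
        draft = llm(
            f"Task: {task}\nDraft: {draft}\nFeedback: {feedback}\n"
            "Rewrite the draft using the feedback."
        )
    return draft


# Hypothetical scripted model so the loop can be run without an API.
def scripted_llm(prompt: str) -> str:
    if "Rewrite the draft" in prompt:
        return "Hello, world!"
    if "Give feedback" in prompt:
        return "DONE" if "Draft: Hello, world!" in prompt else "Add punctuation."
    return "hello world"


final_draft = self_refine("Write a greeting.", scripted_llm)
print(final_draft)
```

Capping the number of rounds keeps the loop from running indefinitely when the model never declares the draft finished.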
Reversing Chain-of-Thought (RCoT)
Reversing Chain-of-Thought (RCoT) takes the idea behind Chain-of-Thought (CoT) prompting and flips it around to help detect and fix hallucinations or incorrect assumptions. RCoT first prompts the model to solve a problem, then asks it to create a new problem based on its initial solution. The model compares the original and the new problem, helping it spot any inconsistencies.
On the RCoT page, you'll learn how RCoT works and how it helps detect hallucinations.
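To make the solve, reconstruct, and compare sequence more concrete, here is a small Python sketch. The prompts, the `NONE` convention for "no differences", and the `toy_model` stand-in are hypothetical choices for illustration; a real setup would send each prompt to an actual LLM.

```python
from typing import Callable, Dict, List, Union


def rcot_check(problem: str, llm: Callable[[str], str]) -> Dict[str, Union[str, List[str]]]:
    """Solve a problem, rebuild the problem from the solution, and compare both."""
    # Step 1: solve the original problem.
    solution = llm(f"Solve step by step:\n{problem}")
    # Step 2: reconstruct a problem using only the solution.
    reconstructed = llm(
        f"Here is a solution:\n{solution}\nWrite the problem that this solution answers."
    )
    # Step 3: compare the original and reconstructed problems.
    comparison = llm(
        f"Original problem:\n{problem}\nReconstructed problem:\n{reconstructed}\n"
        "List any conditions that differ, one per line, or reply NONE."
    ).strip()
    issues = [] if comparison == "NONE" else comparison.splitlines()
    return {"solution": solution, "reconstructed": reconstructed, "inconsistencies": issues}


# Hypothetical model that misreads "2 more" as "4 more" in its solution.
def toy_model(prompt: str) -> str:
    if prompt.startswith("Solve step by step"):
        return "Tom starts with 3 apples and buys 4 more, so 3 + 4 = 7."
    if prompt.startswith("Here is a solution"):
        return "Tom has 3 apples and buys 4 more. How many apples does he have?"
    return "The original says 2 more; the reconstruction says 4 more."


report = rcot_check("Tom has 3 apples and buys 2 more. How many apples does he have?", toy_model)
print(report["inconsistencies"])
```

Any inconsistencies found this way can be fed back to the model as targeted feedback when asking it to revise the solution.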
Self-Verification
While Chain-of-Thought (CoT) prompting is great for reasoning, it lacks an error correction mechanism. Self-Verification solves this by generating multiple candidate solutions using CoT. Then, it evaluates each one by masking parts of the original question. The LLM has to predict the missing information based on the rest of the question and its generated solution.
On the Self-Verification page, you'll learn how to use Self-Verification to improve LLM accuracy by verifying conclusions against the original context.
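The masking step described above can be sketched in Python as shown below. This is a simplified illustration under several assumptions: candidate answers are passed in directly rather than generated with CoT, only numbers are masked, and `toy_llm` is a hypothetical stand-in that handles just this one addition problem.

```python
import re
from typing import Callable, List


def verification_score(question: str, candidate: str, llm: Callable[[str], str]) -> int:
    """Count how many masked numbers the model recovers given a candidate answer."""
    score = 0
    for match in re.finditer(r"\d+", question):
        # Hide one number from the question and ask the model to predict it.
        masked = question[: match.start()] + "X" + question[match.end():]
        guess = llm(f"{masked}\nThe answer is {candidate}. What is X? Reply with a number only.")
        if guess.strip() == match.group():
            score += 1
    return score


def self_verify(question: str, candidates: List[str], llm: Callable[[str], str]) -> str:
    """Return the candidate answer that best explains the original question."""
    return max(candidates, key=lambda c: verification_score(question, c, llm))


# Hypothetical model for this one problem: X = answer - the number still visible.
def toy_llm(prompt: str) -> str:
    masked_question, rest = prompt.split("\n", 1)
    known = int(re.search(r"\d+", masked_question).group())
    answer = int(re.search(r"The answer is (\d+)", rest).group(1))
    return str(answer - known)


question = "Ann has 3 pens and buys 4 more. How many pens does she have?"
best = self_verify(question, ["8", "7"], toy_llm)
print(best)
```

A candidate that is consistent with the question lets the model recover the hidden values, so it earns a higher score than a candidate that contradicts them.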
Chain-of-Verification (CoVe)
Chain-of-Verification (CoVe) is similar to Chain-of-Thought (CoT) prompting, but instead of generating intermediate steps, CoVe has the model generate verification questions to evaluate its initial response. The model then answers these questions to refine its final output.
On the CoVe page, you'll explore how CoVe works in more detail.
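A minimal sketch of the draft, verify, and revise flow described above might look like this in Python. The prompt wording, the one-question-per-line format, and the `canned_llm` stand-in are assumptions for illustration only.

```python
from typing import Callable


def chain_of_verification(query: str, llm: Callable[[str], str]) -> str:
    """Draft an answer, check it with verification questions, then revise it."""
    # Step 1: initial response.
    draft = llm(f"Answer the question:\n{query}")
    # Step 2: have the model plan verification questions about its draft.
    plan = llm(
        f"Question: {query}\nDraft answer: {draft}\n"
        "List verification questions, one per line."
    )
    questions = [line.strip() for line in plan.splitlines() if line.strip()]
    # Step 3: answer each verification question on its own, without the draft.
    checks = [(q, llm(f"Answer briefly:\n{q}")) for q in questions]
    evidence = "\n".join(f"Q: {q}\nA: {a}" for q, a in checks)
    # Step 4: produce the final answer in light of the verification results.
    return llm(
        f"Question: {query}\nDraft answer: {draft}\n"
        f"Verification results:\n{evidence}\nWrite the final, corrected answer."
    )


# Hypothetical canned model whose first draft is wrong.
def canned_llm(prompt: str) -> str:
    if "Write the final, corrected answer." in prompt:
        return "Mercury"
    if "List verification questions" in prompt:
        return "Which planet has the smallest orbit around the Sun?"
    if prompt.startswith("Answer briefly"):
        return "Mercury"
    return "Venus"


final_answer = chain_of_verification("Which planet is closest to the Sun?", canned_llm)
print(final_answer)
```

Answering the verification questions without showing the draft is a deliberate choice in this sketch, so that the model is less likely to simply repeat its original mistake.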
Cumulative Reasoning
Cumulative Reasoning (CR) breaks down problem-solving into multiple steps, each evaluated by an LLM to decide whether to accept or reject them. If the process leads to the final answer, the model stops. If not, it keeps refining until it reaches a solution.
On the CR page, you'll learn about Cumulative Reasoning, the three roles it relies on, and its applications.
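The accept-or-reject loop described above can be outlined in Python as follows. The role names (proposer, verifier, reporter) follow the Cumulative Reasoning paper; the function signatures, the `max_steps` limit, and the toy arithmetic roles are assumptions for this sketch. In a real setup each role would be an LLM call with its own prompt.

```python
from typing import Callable, List, Optional, Tuple


def cumulative_reasoning(
    problem: str,
    propose: Callable[[str, List[str]], str],
    verify: Callable[[str, List[str], str], bool],
    report: Callable[[str, List[str]], Optional[str]],
    max_steps: int = 5,
) -> Tuple[Optional[str], List[str]]:
    """Accumulate verified steps until the reporter can state a final answer."""
    accepted: List[str] = []
    for _ in range(max_steps):
        step = propose(problem, accepted)      # proposer suggests the next step
        if verify(problem, accepted, step):    # verifier accepts or rejects it
            accepted.append(step)
        answer = report(problem, accepted)     # reporter decides whether to stop
        if answer is not None:
            return answer, accepted
    return None, accepted


# Hypothetical toy roles for the problem "(2 + 3) * 4".
proposals = iter(["2 + 3 = 6", "2 + 3 = 5", "5 * 4 = 20"])


def toy_propose(problem: str, accepted: List[str]) -> str:
    return next(proposals)


def toy_verify(problem: str, accepted: List[str], step: str) -> bool:
    left, right = step.split(" = ")
    a, op, b = left.split()
    value = int(a) + int(b) if op == "+" else int(a) * int(b)
    return value == int(right)


def toy_report(problem: str, accepted: List[str]) -> Optional[str]:
    # The toy problem is solved once the multiplication step has been accepted.
    for step in accepted:
        if "*" in step:
            return step.split(" = ")[1]
    return None


answer, steps = cumulative_reasoning("(2 + 3) * 4", toy_propose, toy_verify, toy_report)
print(answer, steps)
```

Because rejected steps never enter the accepted list, later proposals build only on steps that have already passed verification.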
Conclusion and Next Steps
Self-Criticism prompting offers a powerful way to enhance the performance of LLMs by encouraging them to assess and refine their own outputs. By applying these methods, you can reduce errors, boost the quality of responses, and increase the confidence you have in the model's conclusions.
Take these tools and start integrating them into your workflows to unlock the full potential of LLMs. Happy prompting!
Valeriia Kuka
Valeriia Kuka, Head of Content at Learn Prompting, is passionate about making AI and ML accessible. Valeriia previously grew a 60K+ follower AI-focused social media account, earning reposts from Stanford NLP, Amazon Research, Hugging Face, and AI researchers. She has also worked with AI/ML newsletters and global communities with 100K+ members and authored clear and concise explainers and historical articles.
Footnotes
- Huang, J., Gu, S. S., Hou, L., Wu, Y., Wang, X., Yu, H., & Han, J. (2022). Large Language Models Can Self-Improve. https://arxiv.org/abs/2210.11610
- Kadavath, S., et al. (2022). Language Models (Mostly) Know What They Know. https://arxiv.org/abs/2207.05221
- Madaan, A., et al. (2023). Self-Refine: Iterative Refinement with Self-Feedback. https://arxiv.org/abs/2303.17651
- Xue, T., et al. (2023). RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought.
- Wei, J., et al. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.
- Weng, Y., et al. (2022). Large Language Models are Better Reasoners with Self-Verification. https://arxiv.org/abs/2212.09561
- Dhuliawala, S., et al. (2023). Chain-of-Verification Reduces Hallucination in Large Language Models. https://arxiv.org/abs/2309.11495
- Zhang, Y., et al. (2023). Cumulative Reasoning with Large Language Models. https://arxiv.org/abs/2308.04371