Consistency-based Self-Adaptive Prompting (COSP)
What is COSP?
Consistency-based Self-Adaptive Prompting (COSP) is a technique for improving the reasoning of Large Language Models (LLMs) in Zero-Shot settings. LLM performance on reasoning tasks varies significantly with how the model is prompted.
Two main methods have shown success, but each has limitations: Few-Shot Prompting (providing the model with handpicked examples) and Zero-Shot Chain-of-Thought (CoT) prompting (triggering step-by-step reasoning with a phrase such as "Let's think step by step"). COSP addresses these limitations by automatically selecting useful in-context examples from the LLM's own generated responses, without needing labeled data or handcrafted prompts.
Why is COSP Needed?
- Few-Shot prompting requires careful selection of examples, which is time-consuming and task-specific.
- Zero-Shot CoT prompting often underperforms due to lack of guidance, leading to spurious reasoning paths.
- COSP solves these problems by leveraging the model's own generated outputs, selecting the most useful examples based on consistency, diversity, and minimal repetition.
How COSP Differs from Existing Techniques
Zero-Shot CoT vs. COSP
While Zero-Shot CoT relies on trigger phrases alone to prompt the model, COSP takes this further by:
- Generating multiple outputs for each question.
- Selecting the best in-context examples based on consistency and diversity.
- Using majority voting to improve the reliability of the final answer.
Few-Shot CoT vs. COSP
Few-Shot CoT requires manually selecting a small number of example questions and answers to guide the model. This is effective but labor-intensive and does not scale across tasks. COSP achieves comparable, and on some tasks better, performance without any labeled data by automatically selecting relevant in-context examples from the model's own outputs.
Benefits and Applications
- Higher accuracy: The paper reports improvements of up to 15% over Zero-Shot baselines, with results that match or exceed Few-Shot baselines on a range of reasoning tasks.
- No labeled data needed: COSP works without any labeled data or handcrafted examples, making it scalable and efficient.
- Consistency-driven: By focusing on consistency and diversity, COSP improves the reliability of predictions in Zero-Shot scenarios.
How COSP Works
- Stage 1: Generating Responses: The model generates multiple reasoning paths for each test question using Zero-Shot CoT. These paths are then assessed for their reliability based on consistency across answers.
- Stage 2: Selecting Demonstrations: The best responses are selected as in-context demonstrations based on criteria like consistency (whether the same answer is repeated), minimal repetition, and diversity of reasoning paths. These selected examples are then used to guide the model’s final prediction.
- Majority Vote: The final prediction is chosen by majority vote across multiple generated reasoning paths.
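The consistency check in Stage 1 can be made concrete with a small sketch. It scores how much the sampled answers for one question agree by computing the entropy of the answer distribution, scaled to the range 0 to 1. The function names and the exact normalization are assumptions chosen for illustration, not the paper's reference implementation.

```python
import math
from collections import Counter

def normalized_entropy(answers):
    """Entropy of the answer distribution, scaled to [0, 1].

    0.0 means every sampled path gave the same answer (fully consistent);
    1.0 means every sampled path gave a different answer.
    """
    n = len(answers)
    if n <= 1:
        return 0.0
    counts = Counter(answers)
    # Sum of p * log(1/p) over the distinct answers.
    entropy = sum((c / n) * math.log(n / c) for c in counts.values())
    return entropy / math.log(n)

def majority_answer(answers):
    """Most frequent answer among the sampled paths."""
    return Counter(answers).most_common(1)[0][0]

# Illustrative answers from four sampled paths for a single question.
sampled = ["19", "27", "19", "10"]
score = normalized_entropy(sampled)   # lower means more consistent
winner = majority_answer(sampled)
```

A question whose paths mostly agree gets a low score, which makes its reasoning paths better candidates for demonstrations than those of a question whose answers are scattered.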
How to Use COSP
COSP can be applied to any reasoning task where labeled examples are unavailable and the model must work Zero-Shot. It requires:
- Access to a Large Language Model
- Unlabeled test questions
The system will then follow these steps:
1. Generate responses
Run the model multiple times on each test question using Zero-Shot CoT.
Prompt
[Test question]
Let's think step by step.
For example:
Prompt
Henry had 11 dollars. For his birthday, he got 18 more dollars but spent 10 on a new game. How much money does he have now?
Let's think step by step.
AI Outputs for this example:
1. "11 + 18 = 29, 29 - 10 = 19"
2. "Henry has 27 dollars."
3. "He has 11 + 18 - 10 = 19."
4. "He bought 11 games and added 18. Then subtracted 10 for the game."
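A minimal sketch of this sampling step follows. The `generate` callable stands in for whatever LLM client is used and is hypothetical, as are the helper names; here it is replaced by a stub that replays the four outputs listed above so the example runs without a model. Taking the last number in each output as its answer is a naive heuristic assumed for illustration.

```python
import re
from typing import Callable, List, Optional

TRIGGER = "Let's think step by step."

def build_zero_shot_cot_prompt(question: str) -> str:
    """Append the Zero-Shot CoT trigger phrase to the question."""
    return f"{question}\n{TRIGGER}"

def sample_paths(question: str, generate: Callable[[str], str], num_samples: int) -> List[str]:
    """Call the (hypothetical) model several times on the same prompt."""
    prompt = build_zero_shot_cot_prompt(question)
    return [generate(prompt) for _ in range(num_samples)]

def extract_answer(path: str) -> Optional[str]:
    """Naive heuristic (an assumption): the last number in the text is the answer."""
    numbers = re.findall(r"\d+(?:\.\d+)?", path)
    return numbers[-1] if numbers else None

# Stub model that replays the example outputs instead of calling an LLM.
canned = iter([
    "11 + 18 = 29, 29 - 10 = 19",
    "Henry has 27 dollars.",
    "He has 11 + 18 - 10 = 19.",
    "He bought 11 games and added 18. Then subtracted 10 for the game.",
])

def fake_generate(prompt: str) -> str:
    return next(canned)

question = ("Henry had 11 dollars. For his birthday, he got 18 more dollars "
            "but spent 10 on a new game. How much money does he have now?")
paths = sample_paths(question, fake_generate, num_samples=4)
answers = [extract_answer(p) for p in paths]
```

With a real model, `generate` would sample with a non-zero temperature so that repeated calls produce different reasoning paths.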
2. Select demonstrations
COSP automatically selects the most consistent and diverse reasoning paths to use as in-context examples.
Selected Example:
"11 + 18 = 29, 29 - 10 = 19."
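One simple way to approximate this selection is sketched below. It keeps only the paths whose answer matches the majority answer, then prefers the path with the least internal repetition. The repetition measure (share of exactly repeated phrases), the last-number answer heuristic, and breaking ties by sample order are all assumptions made for illustration; diversity across demonstrations is not modeled here.

```python
import re
from collections import Counter

def extract_answer(path):
    """Naive heuristic (an assumption): the last number in the text is the answer."""
    numbers = re.findall(r"\d+(?:\.\d+)?", path)
    return numbers[-1] if numbers else None

def repetition_ratio(path):
    """Share of phrases that exactly repeat an earlier phrase in the same path."""
    phrases = [p.strip().lower() for p in re.split(r"[.,;\n]+", path) if p.strip()]
    if not phrases:
        return 0.0
    return 1.0 - len(set(phrases)) / len(phrases)

def select_demonstration(paths):
    """Pick the least repetitive path among those agreeing with the majority answer."""
    answers = [extract_answer(p) for p in paths]
    votes = Counter(a for a in answers if a is not None)
    if not votes:
        return None
    majority = votes.most_common(1)[0][0]
    consistent = [p for p, a in zip(paths, answers) if a == majority]
    # min() returns the first path on ties, so sample order breaks ties.
    return min(consistent, key=repetition_ratio)

outputs = [
    "11 + 18 = 29, 29 - 10 = 19",
    "Henry has 27 dollars.",
    "He has 11 + 18 - 10 = 19.",
    "He bought 11 games and added 18. Then subtracted 10 for the game.",
]
selected = select_demonstration(outputs)
```

On the four example outputs, the two paths that answer 19 survive the filter, and the first of them is returned because neither repeats a phrase.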
3. Final prediction
COSP prepends the selected examples to the prompt, queries the model again, and chooses the final answer by majority vote.
Final Answer: "19 dollars."
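The final step can be sketched as follows. The `Q:`/`A:` prompt layout and the function names are assumptions for illustration, and the vote is run on the answers read off the four example outputs above; in practice it would run over the reasoning paths sampled with the new prompt.

```python
from collections import Counter

TRIGGER = "Let's think step by step."

def build_stage2_prompt(demos, question):
    """Prepend selected (question, reasoning) pairs to the test question.

    The Q:/A: layout is an assumed format, not one prescribed by the article.
    """
    blocks = [f"Q: {q}\nA: {TRIGGER} {r}" for q, r in demos]
    blocks.append(f"Q: {question}\nA: {TRIGGER}")
    return "\n\n".join(blocks)

def majority_vote(answers):
    """Return the most frequent non-empty answer, or None if there is none."""
    votes = Counter(a for a in answers if a is not None)
    return votes.most_common(1)[0][0] if votes else None

demo_question = ("Henry had 11 dollars. For his birthday, he got 18 more dollars "
                 "but spent 10 on a new game. How much money does he have now?")
demo_reasoning = "11 + 18 = 29, 29 - 10 = 19."

prompt = build_stage2_prompt([(demo_question, demo_reasoning)], "[Test question]")

# Answers taken from the four example outputs, used here to illustrate the vote.
final = majority_vote(["19", "27", "19", "10"])
```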
Results of COSP
COSP improves on Zero-Shot CoT on every benchmark below and is competitive with Few-Shot CoT: it scores higher on MultiArith and AddSub, roughly ties on GSM-8K, and trails on StrategyQA. Here are the results from COSP applied to multiple reasoning benchmarks:
| Task | Zero-Shot CoT | Few-Shot CoT | COSP |
|---|---|---|---|
| MultiArith | 67.2% | 81.0% | 85.0% |
| AddSub | 69.1% | 72.4% | 78.9% |
| GSM-8K | 20.9% | 30.3% | 30.2% |
| StrategyQA | 57.2% | 67.9% | 64.7% |
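To read the table as gains rather than raw scores, the short sketch below subtracts each baseline from the COSP score. The numbers are copied from the table; the variable names are only for illustration.

```python
# (Zero-Shot CoT, Few-Shot CoT, COSP) accuracy in percent, copied from the table.
results = {
    "MultiArith": (67.2, 81.0, 85.0),
    "AddSub": (69.1, 72.4, 78.9),
    "GSM-8K": (20.9, 30.3, 30.2),
    "StrategyQA": (57.2, 67.9, 64.7),
}

# Gain of COSP over each baseline, in percentage points.
gains = {
    task: (round(cosp - zero_shot, 1), round(cosp - few_shot, 1))
    for task, (zero_shot, few_shot, cosp) in results.items()
}

for task, (vs_zero_shot, vs_few_shot) in gains.items():
    print(f"{task}: {vs_zero_shot:+.1f} vs Zero-Shot CoT, {vs_few_shot:+.1f} vs Few-Shot CoT")
```

The gain over Zero-Shot CoT is positive on all four tasks, while the gain over Few-Shot CoT is positive on MultiArith and AddSub and slightly negative on GSM-8K and StrategyQA.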
Conclusion
COSP offers a practical way to improve Zero-Shot reasoning in LLMs. It removes the need for manual example crafting, instead relying on the model's own outputs to guide reasoning. By combining consistency, diversity, and repetition analysis, COSP improves substantially over Zero-Shot CoT across multiple reasoning tasks. This makes it a scalable and efficient approach for improving LLM reasoning in real-world applications.
Valeriia Kuka
Footnotes
- Wan, X., Sun, R., Dai, H., Arik, S. O., & Pfister, T. (2023). Better Zero-Shot Reasoning with Self-Adaptive Prompting. https://arxiv.org/abs/2305.14106