Consistency-based Self-Adaptive Prompting (COSP) is a novel technique designed to improve the reasoning capabilities of Large Language Models (LLMs) in Zero-Shot settings. LLMs have demonstrated impressive abilities, but their performance on reasoning tasks often varies significantly depending on how they are prompted.
Two main methods—Few-Shot Prompting (providing the model with handpicked examples) and Zero-Shot Chain-of-Thought (CoT) prompting (triggering step-by-step reasoning)—have shown success, but each has limitations. COSP addresses these by automatically selecting useful in-context examples from the LLM’s own generated responses, without needing labeled data or handcrafted prompts.
While Zero-Shot CoT relies on a trigger phrase alone to prompt the model, COSP takes this further by:
- Running the model multiple times on each test question with Zero-Shot CoT to collect candidate reasoning paths.
- Scoring those candidates for consistency and diversity, and filtering out repetitive ones.
- Reusing the best candidates as in-context examples in a second pass, then choosing the final answer by majority vote.
Few-Shot CoT requires manually selecting a small number of example questions and answers to guide the model. This is effective but labor-intensive and not scalable across tasks. COSP achieves similar or better performance without any labeled data, automatically selecting relevant in-context examples from the model's own outputs.
COSP can be applied to any reasoning-based task where Zero-Shot performance is needed. It requires:
- A set of test questions, with no labeled answers or handcrafted examples.
- An LLM that can be sampled multiple times with a Zero-Shot CoT prompt.
The system will then follow these steps:
Run the model multiple times on each test question using Zero-Shot CoT.
[Test question]
Let's think step by step.
For example:
Henry had 11 dollars. For his birthday, he got 18 more dollars but spent 10 on a new game. How much money does he have now?
Let's think step by step.
AI Outputs for this example:
1. "11 + 18 = 29, 29 - 10 = 19"
2. "Henry has 27 dollars."
3. "He has 11 + 18 - 10 = 19."
4. "He bought 11 games and added 18. Then subtracted 10 for the game."
COSP automatically selects the most consistent and diverse reasoning paths to use as in-context examples.
Selected Example:
"11 + 18 = 29, 29 - 10 = 19."
COSP then prepends the selected examples to the test question, prompts the model again, and chooses the final answer by majority vote.
Final Answer: "19 dollars."
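Below is a minimal sketch of this second stage, again using a placeholder `call_model` and the same heuristic answer extraction as above; both are assumptions for illustration. For brevity it votes over the second-stage samples only, though implementations may also pool the first-stage answers into the vote.

```python
import re
from collections import Counter

def call_model(prompt: str) -> str:
    """Placeholder LLM call (an assumption); in practice each call returns a fresh CoT rationale."""
    return "11 + 18 = 29, 29 - 10 = 19"

def extract_answer(rationale: str) -> str | None:
    """Heuristic (an assumption): treat the last number in the rationale as its answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", rationale)
    return numbers[-1] if numbers else None

def cosp_second_stage(question: str, demos: list[str], num_samples: int = 4) -> str | None:
    """Stage 2: prepend the selected examples, sample again, and majority-vote the answers."""
    prompt = "\n\n".join(demos) + f"\n\n{question}\nLet's think step by step."
    answers = [extract_answer(call_model(prompt)) for _ in range(num_samples)]
    votes = Counter(a for a in answers if a is not None)
    return votes.most_common(1)[0][0] if votes else None

demos = ["11 + 18 = 29, 29 - 10 = 19"]
question = (
    "Henry had 11 dollars. For his birthday, he got 18 more dollars "
    "but spent 10 on a new game. How much money does he have now?"
)
print(cosp_second_stage(question, demos))  # -> 19
```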
COSP consistently improves over Zero-Shot CoT and matches or exceeds Few-Shot CoT on several benchmarks, despite using no labeled examples. Here are the results from COSP applied to multiple reasoning benchmarks:
| Task | Zero-Shot CoT | Few-Shot CoT | COSP |
|---|---|---|---|
| MultiArith | 67.2% | 81.0% | 85.0% |
| AddSub | 69.1% | 72.4% | 78.9% |
| GSM-8K | 20.9% | 30.3% | 30.2% |
| StrategyQA | 57.2% | 67.9% | 64.7% |
COSP offers a powerful solution for improving zero-shot reasoning in LLMs. It removes the need for manual example crafting, instead relying on the model’s own outputs to guide reasoning. By combining consistency, diversity, and repetition analysis, COSP leads to significant performance improvements across multiple reasoning tasks. This makes it a scalable and efficient approach for improving LLM reasoning in real-world applications.
Wan, X., Sun, R., Dai, H., Arik, S. O., & Pfister, T. (2023). Better Zero-Shot Reasoning with Self-Adaptive Prompting. https://arxiv.org/abs/2305.14106