Self-Generated In-Context Learning (SG-ICL)
- Self-Generated In-Context Learning (SG-ICL) is a technique for getting few-shot exemplars directly from the model you're prompting.
- It's intuitive, easy to use, and fast, and it comes in handy when you don't have a dataset of exemplars available.
- However, its speed and ease of use come at the expense of quality: it doesn't perform as well as techniques that draw exemplars from real datasets.
Information and Links
| Technique | Institution | Date of Publication | Paper |
|---|---|---|---|
| Self-Generated In-Context Learning (SG-ICL) | Seoul National University, Hanyang University, NAVER AI Lab, NAVER CLOVA | Jun 2022 | Self-Generated In-Context Learning: Leveraging Auto-regressive Language Models as a Demonstration Generator |
What is Self-Generated In-Context Learning?
Self-Generated In-Context Learning (SG-ICL) is a technique to generate exemplars for a few-shot standard prompt by asking the model itself to generate them.
Typically, in-context learning (ICL) relies on a few input-label pairs (called exemplars) to help models perform tasks without fine-tuning. However, these demonstrations are often chosen from external datasets, which introduces dependency on external data. SG-ICL generates these demonstrations using the language model itself, reducing reliance on external datasets and improving performance consistency.
How does SG-ICL work?
It works in two steps:
- Self-Generation Step: The model generates exemplars closely related to the specific task at hand, improving the correlation between the test input and its demonstrations.
- Inference Step: The generated samples are used as exemplars. The model then predicts the class for the test input based on these generated samples, which are tailored to the task, leading to better performance than relying on external examples.
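The two steps above can be sketched in a few lines of Python. This is a minimal illustration for binary sentiment analysis, not the paper's implementation; the `generate` callable is a hypothetical stand-in for whatever LLM call you use (an API request, a local model, etc.).

```python
from typing import Callable

def self_generate_exemplars(
    generate: Callable[[str], str],
    test_input: str,
    labels: list[str],
) -> list[tuple[str, str]]:
    """Step 1 (self-generation): ask the model to write one exemplar per
    label, conditioned on the test input so demonstrations stay on-topic."""
    exemplars = []
    for label in labels:
        prompt = f'Generate a review: {test_input}\nGenerate a "{label}" review:'
        exemplars.append((generate(prompt).strip(), label))
    return exemplars

def build_inference_prompt(test_input: str, exemplars: list[tuple[str, str]]) -> str:
    """Step 2 (inference): prepend the generated exemplars to the
    inference template, leaving the final label blank for the model."""
    lines = [f"Review: {text} Sentiment: {label}" for text, label in exemplars]
    lines.append(f"Review: {test_input} Sentiment:")
    return "\n".join(lines)
```

The model's completion of the final `Sentiment:` slot is then read off as the predicted class.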
Benefits and Applications
SG-ICL offers several benefits for text classification tasks such as sentiment analysis and natural language inference:
- No external data required: The main advantage of SG-ICL is that you don't need a dataset of exemplars, and it's really easy to use.
- Low variance in performance: By generating task-specific examples, SG-ICL provides more consistent results compared to relying on randomly selected demonstrations from datasets.
How SG-ICL differs from existing techniques
SG-ICL stands out because it self-generates demonstrations instead of retrieving them from external datasets. Here's how it differs from other methods:
- Few-Shot Learning: Few-shot learning uses a small number of manually selected training samples. SG-ICL, on the other hand, eliminates the need for any external training data, performing well even without direct training samples.
- Zero-Shot Learning: In zero-shot learning, models perform tasks without any examples or training data. SG-ICL generates its own in-context examples, resulting in better performance than zero-shot models, which have no context for the task.
How to use SG-ICL
The following are the templates for the self-generation step:

| Task | Self-Generation Template |
|---|---|
| Sentiment analysis (2 categories) | Generate a review: a fast, funny, highly enjoyable movie. Generate a "negative" review: |
| Sentiment analysis (5 categories) | Generate a review: it 's worth taking the kids to. Generate a "negative" review: |
| Recognizing Textual Entailment | Premise: Dana Reeve, the widow of the actor Christopher Reeve, has died of lung cancer at age 44, according to the Christopher Reeve Foundation. Generate a Hypothesis: Christopher Reeve had an accident. Generate a "true" Hypothesis: |
| CommitmentBank | Premise: It was a complex language. Not written down but handed down. One might say it was peeled down. Generate a Hypothesis: the language was peeled down. Generate a "neither" Hypothesis: |
Those templates are used to generate exemplars for the inference step. The templates for the inference step are as follows:

| Task | Inference Template | Verbalizer |
|---|---|---|
| Minimal | a fast , funny , highly enjoyable movie . positive | - |
| Sentiment analysis (2 categories) | Review: a fast, funny, highly enjoyable movie. Sentiment: positive | positive / negative |
| Sentiment analysis (5 categories) | Review: it 's worth taking the kids to. | terrible / bad / okay / good / great |
| Recognizing Textual Entailment | Premise: Dana Reeve, the widow of the actor Christopher Reeve, has died of lung cancer at age 44, according to the Christopher Reeve Foundation. Hypothesis: Christopher Reeve had an accident. True or False? false | true / false |
| CommitmentBank | Premise: It was a complex language. Not written down but handed down. One might say it was peeled down. Hypothesis: the language was peeled down Yes, No, or Neither? yes | yes / no / neither |
Note that these templates don't include the exemplars generated in the self-generation step; a real prompt would place the generated exemplars before the inference template.
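To make that concrete, here is a minimal sketch that assembles a full Recognizing Textual Entailment prompt by prepending exemplars to the inference template. The exemplar premise, hypothesis, and label below are illustrative placeholders; in practice they would come from the self-generation step.

```python
def build_rte_prompt(
    exemplars: list[tuple[str, str, str]],  # (premise, hypothesis, label)
    test_premise: str,
    test_hypothesis: str,
) -> str:
    """Prepend (premise, hypothesis, label) exemplars to the RTE
    inference template, leaving the final answer slot blank."""
    blocks = [
        f"Premise: {premise} Hypothesis: {hypothesis} True or False? {label}"
        for premise, hypothesis, label in exemplars
    ]
    blocks.append(
        f"Premise: {test_premise} Hypothesis: {test_hypothesis} True or False?"
    )
    return "\n".join(blocks)
```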
SG-ICL Example: Sentiment Analysis
Let's say you have a review from a restaurant customer that goes like this: "the food was amazingly bad, and the service wasn't anything to write home about."
You want to determine the sentiment of the review, but you don't have a dataset of example reviews to go off of, so you decide to use SG-ICL.
Step 1: Self Generation
First, you include your original review in a prompt that asks the model to generate more example reviews. We'll generate two for this example: one positive and one negative.
Note that this initial step is a few-shot prompt of its own, with one exemplar. That's why each review is prefaced with "Generate a review:".
Prompt
Generate a review: the food was amazingly bad, and the service wasn't anything to write home about.
Generate a positive review:
AI Output
the food was insanely good!
Prompt
Generate a review: the food was amazingly bad, and the service wasn't anything to write home about.
Generate a negative review:
AI Output
I could've gone back home and made homemade lasagna in the time they took to complete my order.
Step 2: Inference
This part is simple. You just put it all together into one prompt, input it into the model, and get back your answer.
Prompt
Review: the food was insanely good! Sentiment: Positive
Review: I could've gone back home and made homemade lasagna in the time they took to complete my order. Sentiment: Negative
Review: the food was amazingly bad, and the service wasn't anything to write home about. Sentiment:
AI Output
Negative
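The worked example above can be stitched together in a few lines. The two generated reviews here are copied from the sample AI outputs above rather than produced by a live model call.

```python
# Assemble the final inference prompt for the sentiment example.
review = ("the food was amazingly bad, and the service "
          "wasn't anything to write home about.")

# Exemplars from the self-generation step (copied from the sample outputs).
generated = [
    ("the food was insanely good!", "Positive"),
    ("I could've gone back home and made homemade lasagna in the time "
     "they took to complete my order.", "Negative"),
]

prompt = "\n".join(
    [f"Review: {text} Sentiment: {label}" for text, label in generated]
    + [f"Review: {review} Sentiment:"]
)
print(prompt)
```

Sending this prompt to the model completes the final `Sentiment:` slot, yielding the prediction.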
Limitations of SG-ICL
SG-ICL is useful when you don't have a dataset, and it's also much less computationally expensive than techniques that operate on a dataset (e.g., KNN, Vote-k). It performs worse than those techniques, though, so it should only really be used when no dataset is available, or when you have one but lack the computational resources to use it.
Conclusion
Self-Generated In-Context Learning is an intuitive method for generating exemplars for a few-shot prompt directly from the model you're going to be prompting. It works best when you don't have access to a dataset of exemplars, or when you lack the computational resources to run operations on one. SG-ICL performs better than zero-shot prompting, but not as well as techniques that involve operations on datasets, like KNN or Vote-k.
Valeriia Kuka
Valeriia Kuka, Head of Content at Learn Prompting, is passionate about making AI and ML accessible. Valeriia previously grew a 60K+ follower AI-focused social media account, earning reposts from Stanford NLP, Amazon Research, Hugging Face, and AI researchers. She has also worked with AI/ML newsletters and global communities with 100K+ members and authored clear and concise explainers and historical articles.
Footnotes
- Kim, H. J., Cho, H., Kim, J., Kim, T., Yoo, K. M., & Lee, S. (2022). Self-Generated In-Context Learning: Leveraging Auto-regressive Language Models as a Demonstration Generator. https://arxiv.org/abs/2206.08082 ↩