Last updated on September 27, 2024
K-Nearest Neighbors (KNN) is a technique for choosing exemplars for a Few-Shot prompt from a dataset of exemplars. The goal is to choose the exemplars most relevant to the task at hand, thereby improving the performance of the Large Language Model (LLM).
It works by selecting the k examples from an external dataset that are most similar to the prompt you're giving the model. Here, k is a value chosen by the user, and it corresponds to the number of exemplars you want to include in the prompt.
First, you need a dataset of example prompts and completions, like the following:
ID | Prompt | Completion |
---|---|---|
1 | What is the capital of France? | The capital of France is Paris. |
2 | How tall is Mount Everest? | Mount Everest is 8,848 meters tall. |
3 | Who wrote 'Romeo and Juliet'? | 'Romeo and Juliet' was written by Shakespeare. |
4 | What is the largest ocean? | The Pacific Ocean is the largest ocean. |
5 | When was the Declaration of Independence signed? | The Declaration of Independence was signed in 1776. |
6 | Who painted the Mona Lisa? | The Mona Lisa was painted by Leonardo da Vinci. |
7 | What is the speed of light? | The speed of light is approximately 299,792 kilometers per second. |
8 | How long does it take to travel to the moon? | It takes about 3 days to travel to the moon. |
9 | What is the capital of Germany? | The capital of Germany is Berlin. |
10 | What is the boiling point of water? | The boiling point of water is 100°C or 212°F at sea level. |
Then, you input your prompt into the KNN function to find the k prompts from the dataset that are most similar to it. You then use those prompts, along with their completions, as exemplars in your Few-Shot prompt and give it to the model with your original question.
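As an illustrative sketch, here is a minimal KNN selector using a bag-of-words representation and cosine similarity. This is only a stand-in: real systems typically embed prompts with a learned sentence encoder, since bag-of-words captures word overlap rather than meaning (it would not, for example, reliably match "overfitting" to "neural network").

```python
from collections import Counter
from math import sqrt

def tokenize(text):
    # Naive tokenizer: lowercase, strip basic punctuation, split on whitespace.
    return text.lower().replace("?", "").replace("'", "").split()

def cosine_similarity(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_select(prompt: str, dataset: list[dict], k: int) -> list[dict]:
    # Return the k exemplars whose prompts are most similar to the input prompt.
    query = Counter(tokenize(prompt))
    scored = [(cosine_similarity(query, Counter(tokenize(ex["prompt"]))), ex)
              for ex in dataset]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [ex for _, ex in scored[:k]]

dataset = [
    {"prompt": "What is the capital of Germany?",
     "completion": "The capital of Germany is Berlin."},
    {"prompt": "How tall is Mount Everest?",
     "completion": "Mount Everest is 8,848 meters tall."},
    {"prompt": "Who painted the Mona Lisa?",
     "completion": "The Mona Lisa was painted by Leonardo da Vinci."},
]
# Selects the "capital of Germany" exemplar, the nearest neighbor by word overlap.
exemplars = knn_select("What is the capital of France?", dataset, k=1)
```

The function names and dictionary layout here are illustrative choices, not a fixed API.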
Your prompt is "What is overfitting in machine learning?", your k value is 2, and your dataset is as follows:
ID | Prompt | Completion |
---|---|---|
1 | Who discovered the law of gravity? | The law of gravity was discovered by Sir Isaac Newton. |
2 | What is supervised learning? | Supervised learning is a type of machine learning where the model is trained on labeled data. |
3 | What is the boiling point of water in Celsius? | The boiling point of water in Celsius is 100°C. |
4 | Who wrote 'Pride and Prejudice'? | 'Pride and Prejudice' was written by Jane Austen. |
5 | What is a neural network? | A neural network is a series of algorithms that attempt to recognize patterns in data, mimicking the human brain. |
6 | What is the capital of Italy? | The capital of Italy is Rome. |
7 | What is the speed of sound in air? | The speed of sound in air is approximately 343 meters per second. |
8 | What is bail in court? | Bail is a temporary release of an accused person awaiting trial. |
9 | What is the tallest mountain in Africa? | The tallest mountain in Africa is Mount Kilimanjaro. |
10 | What is the chemical symbol for gold? | The chemical symbol for gold is Au. |
Once you plug your prompt into the KNN function, it'll return prompts 2 and 5 as the nearest neighbors. Now, you use those in your Few-Shot prompt:
Q: What is supervised learning?
A: Supervised learning is a type of machine learning where the model is trained on labeled data.
Q: What is a neural network?
A: A neural network is a series of algorithms that attempt to recognize patterns in data, mimicking the human brain.
Q: What is overfitting in machine learning?
A:
Overfitting in machine learning occurs when a model learns not only the underlying patterns in the training data but also the noise and outliers.
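Once the nearest neighbors are retrieved, building the final Few-Shot prompt is plain string assembly. A sketch (the `build_prompt` helper name and the Q/A layout are illustrative, not a fixed standard):

```python
def build_prompt(exemplars: list[dict], question: str) -> str:
    # Format each retrieved exemplar as a Q/A pair, then append the new question
    # with an empty "A:" for the model to complete.
    lines = []
    for ex in exemplars:
        lines.append(f"Q: {ex['prompt']}")
        lines.append(f"A: {ex['completion']}")
    lines.append(f"Q: {question}")
    lines.append("A:")
    return "\n".join(lines)

exemplars = [
    {"prompt": "What is supervised learning?",
     "completion": "Supervised learning is a type of machine learning where the model is trained on labeled data."},
    {"prompt": "What is a neural network?",
     "completion": "A neural network is a series of algorithms that attempt to recognize patterns in data, mimicking the human brain."},
]
prompt = build_prompt(exemplars, "What is overfitting in machine learning?")
```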
Your prompt is "What are London Dispersion Forces?", your k value is 1, and your dataset is as follows:
ID | Prompt | Completion |
---|---|---|
1 | Who was the first president of the United States? | The first president of the United States was George Washington. |
2 | What country is bordered on three sides by Senegal? | The Gambia is bordered on three sides by Senegal. |
3 | How many continents are there? | There are seven continents on Earth. |
4 | Who developed the periodic table? | The periodic table was developed by Dmitri Mendeleev. |
5 | What was the Great Leap Forward? | The Great Leap Forward was a social and economic campaign initiated by the Chinese Communist Party under the leadership of Mao Zedong from 1958 to 1962 |
6 | What is the capital of Australia? | The capital of Australia is Canberra. |
7 | What is the process of rusting? | Rusting is the oxidation of iron in the presence of water and oxygen, forming iron oxide. |
8 | Who wrote 'The Odyssey'? | 'The Odyssey' was written by Homer. |
9 | What is the largest planet in our solar system? | The largest planet in our solar system is Jupiter. |
10 | What is the freezing point of mercury? | The freezing point of mercury is -38.83°C or -37.89°F. |
Once you plug your prompt into the KNN function, it'll return prompt 10 as the nearest neighbor. Now, you use prompt 10 in your Few-Shot prompt:
Q: What is the freezing point of mercury?
A: The freezing point of mercury is -38.83°C or -37.89°F.
Q: What are London Dispersion Forces?
A:
London Dispersion Forces are a type of weak intermolecular force that arises from the temporary fluctuations in electron distribution within atoms and molecules.
Your prompt is "What is prior restraint?", your k value is 3, and your dataset is as follows:
ID | Prompt | Completion |
---|---|---|
1 | What is the capital of Brazil? | The capital of Brazil is Brasília. |
2 | What is the purpose of the Constitution? | The Constitution serves as the supreme law of the land, outlining the structure of government and the rights of citizens. |
3 | Who painted the Mona Lisa? | The Mona Lisa was painted by Leonardo da Vinci. |
4 | What is the Miranda warning? | The Miranda warning is a statement police must give to inform suspects of their rights to remain silent and have an attorney. |
5 | What is the boiling point of ethanol? | The boiling point of ethanol is 78.37°C or 173.07°F. |
6 | Who is the author of '1984'? | '1984' was written by George Orwell. |
7 | What is the principle of double jeopardy? | Double jeopardy is a legal principle that prohibits someone from being tried twice for the same crime. |
8 | What is the atomic number of hydrogen? | The atomic number of hydrogen is 1. |
9 | What is the speed of light in a vacuum? | The speed of light in a vacuum is approximately 299,792 kilometers per second. |
10 | Who discovered penicillin? | Penicillin was discovered by Alexander Fleming in 1928. |
Once you plug your prompt into the KNN function, it'll return prompts 2, 4, and 7 as the nearest neighbors. Now, you use those in your Few-Shot prompt:
Q: What is the purpose of the Constitution?
A: The Constitution serves as the supreme law of the land, outlining the structure of government and the rights of citizens.
Q: What is the Miranda warning?
A: The Miranda warning is a statement police must give to inform suspects of their rights to remain silent and have an attorney.
Q: What is the principle of double jeopardy?
A: Double jeopardy is a legal principle that prohibits someone from being tried twice for the same crime.
Q: What is prior restraint?
A:
Prior restraint is a legal concept where the government restricts or prevents publication or speech before it happens, rather than punishing it after it occurs.
Since, for a given prompt, KNN calculates the similarity against every prompt in the dataset, it can be computationally expensive for large datasets. Also, choosing a correct k value is arbitrary and can be very difficult if you don't know your dataset well.
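One common mitigation for the per-query cost, sketched below with NumPy: precompute and L2-normalize the exemplar embeddings once, so each query reduces to a single matrix-vector product. This assumes prompts have already been converted to fixed-length vectors by some embedding step not shown here; the vectors below are toy values for illustration.

```python
import numpy as np

def knn_select_fast(query_vec, exemplar_matrix, k):
    # exemplar_matrix: (n, d) array of L2-normalized exemplar embeddings,
    # computed once up front and reused for every query.
    q = query_vec / np.linalg.norm(query_vec)
    sims = exemplar_matrix @ q           # one matrix-vector product per query
    return np.argsort(sims)[::-1][:k]    # indices of the k most similar exemplars

# Toy pre-normalized "embeddings" standing in for real encoder output.
exemplar_matrix = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.6, 0.8, 0.0],
])
query = np.array([0.9, 0.1, 0.0])
top = knn_select_fast(query, exemplar_matrix, k=2)
```

For datasets too large even for this, approximate nearest-neighbor indexes trade a little accuracy for sublinear query time.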
KNN is part of a family of algorithms used to choose the exemplars most similar to the prompt at hand in order to boost performance in Few-Shot prompting. Though it is effective at improving performance, it can be computationally expensive and is most useful for tasks that require high specificity and are too complex for simpler exemplar-selection methods.
Shi, W., Michael, J., Gururangan, S., & Zettlemoyer, L. (2022). kNN-Prompt: Nearest Neighbor Zero-Shot Inference. https://arxiv.org/abs/2205.13792
Liu, J., Shen, D., Zhang, Y., Dolan, B., Carin, L., & Chen, W. (2021). What Makes Good In-Context Examples for GPT-3? https://arxiv.org/abs/2101.06804