langchain/docs/docs/how_to/example_selectors.ipynb
Erick Friis 21d14549a9 docs: v0.2 docs in master (#21438)
current python.langchain.com is building from branch `v0.1`. Iterate on
v0.2 docs here.

---------

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
Co-authored-by: Leonid Ganeline <leo.gan.57@gmail.com>
Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru>
Co-authored-by: Averi Kitsch <akitsch@google.com>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Martín Gotelli Ferenaz <martingotelliferenaz@gmail.com>
Co-authored-by: Fayfox <admin@fayfox.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Dawson Bauer <105886620+djbauer2@users.noreply.github.com>
Co-authored-by: Ravindu Somawansa <ravindu.somawansa@gmail.com>
Co-authored-by: Dhruv Chawla <43818888+Dominastorm@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: WeichenXu <weichen.xu@databricks.com>
Co-authored-by: Benito Geordie <89472452+benitoThree@users.noreply.github.com>
Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com>
Co-authored-by: Kartik Sarangmath <kartik@thirdai.com>
Co-authored-by: Sevin F. Varoglu <sfvaroglu@octoml.ai>
Co-authored-by: MacanPN <martin.triska@gmail.com>
Co-authored-by: Prashanth Rao <35005448+prrao87@users.noreply.github.com>
Co-authored-by: Hyeongchan Kim <kozistr@gmail.com>
Co-authored-by: sdan <git@sdan.io>
Co-authored-by: Guangdong Liu <liugddx@gmail.com>
Co-authored-by: Rahul Triptahi <rahul.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: pjb157 <84070455+pjb157@users.noreply.github.com>
Co-authored-by: Eun Hye Kim <ehkim1440@gmail.com>
Co-authored-by: kaijietti <43436010+kaijietti@users.noreply.github.com>
Co-authored-by: Pengcheng Liu <pcliu.fd@gmail.com>
Co-authored-by: Tomer Cagan <tomer@tomercagan.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
2024-05-08 12:29:59 -07:00


{
"cells": [
{
"cell_type": "raw",
"id": "af408f61",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 1\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "1a65e4c9",
"metadata": {},
"source": [
"# How to use example selectors\n",
"\n",
"If you have a large number of examples, you may need to select which ones to include in the prompt. The Example Selector is the class responsible for doing so.\n",
"\n",
"The base interface is defined as below:\n",
"\n",
"```python\n",
"class BaseExampleSelector(ABC):\n",
"    \"\"\"Interface for selecting examples to include in prompts.\"\"\"\n",
"\n",
"    @abstractmethod\n",
"    def select_examples(self, input_variables: Dict[str, str]) -> List[dict]:\n",
"        \"\"\"Select which examples to use based on the inputs.\"\"\"\n",
"\n",
"    @abstractmethod\n",
"    def add_example(self, example: Dict[str, str]) -> Any:\n",
"        \"\"\"Add new example to store.\"\"\"\n",
"```\n",
"\n",
"A concrete selector must implement both abstract methods. The central one is `select_examples`, which takes the input variables and returns a list of examples; how those examples are chosen is up to each implementation. `add_example` adds a new example to the selector's store.\n",
"\n",
"LangChain has a few different types of example selectors. For an overview of all these types, see the table at the end of this guide.\n",
"\n",
"In this guide, we will walk through creating a custom example selector."
]
},
{
"cell_type": "markdown",
"id": "638e9039",
"metadata": {},
"source": [
"## Examples\n",
"\n",
"To use an example selector, we first need a list of examples. These should generally be example inputs and outputs. For this demo, let's imagine we are selecting examples of how to translate English to Italian."
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "48658d53",
"metadata": {},
"outputs": [],
"source": [
"examples = [\n",
"    {\"input\": \"hi\", \"output\": \"ciao\"},\n",
"    {\"input\": \"bye\", \"output\": \"arrivederci\"},\n",
"    {\"input\": \"soccer\", \"output\": \"calcio\"},\n",
"]"
]
},
{
"cell_type": "markdown",
"id": "c2830b49",
"metadata": {},
"source": [
"## Custom Example Selector\n",
"\n",
"Let's write an example selector that picks the example whose input is closest in length to the input word."
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "56b740a1",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.example_selectors.base import BaseExampleSelector\n",
"\n",
"\n",
"class CustomExampleSelector(BaseExampleSelector):\n",
"    def __init__(self, examples):\n",
"        self.examples = examples\n",
"\n",
"    def add_example(self, example):\n",
"        self.examples.append(example)\n",
"\n",
"    def select_examples(self, input_variables):\n",
"        # This assumes that the input variables contain an 'input' key\n",
"        new_word = input_variables[\"input\"]\n",
"        new_word_length = len(new_word)\n",
"\n",
"        # Initialize variables to store the best match and its length difference\n",
"        best_match = None\n",
"        smallest_diff = float(\"inf\")\n",
"\n",
"        # Iterate through each example\n",
"        for example in self.examples:\n",
"            # Calculate the length difference with the example's input word\n",
"            current_diff = abs(len(example[\"input\"]) - new_word_length)\n",
"\n",
"            # Update the best match if the current one is closer in length\n",
"            if current_diff < smallest_diff:\n",
"                smallest_diff = current_diff\n",
"                best_match = example\n",
"\n",
"        return [best_match]"
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "ce928187",
"metadata": {},
"outputs": [],
"source": [
"example_selector = CustomExampleSelector(examples)"
]
},
{
"cell_type": "code",
"execution_count": 39,
"id": "37ef3149",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'input': 'bye', 'output': 'arrivederci'}]"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"example_selector.select_examples({\"input\": \"okay\"})"
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "c5ad9f35",
"metadata": {},
"outputs": [],
"source": [
"example_selector.add_example({\"input\": \"hand\", \"output\": \"mano\"})"
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "e4127fe0",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'input': 'hand', 'output': 'mano'}]"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"example_selector.select_examples({\"input\": \"okay\"})"
]
},
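{
"cell_type": "markdown",
"id": "3f9a1c7e",
"metadata": {},
"source": [
"The selector above always returns a single example. As a minimal sketch of one possible variation, the class below returns the `k` examples whose inputs are closest in length to the new word. The name `TopKLengthSelector` and the parameter `k` are hypothetical and not part of LangChain; the class is written without a base class so that it runs on its own, and in practice it would subclass `BaseExampleSelector` as above.\n",
"\n",
"```python\n",
"class TopKLengthSelector:\n",
"    \"\"\"Hypothetical selector: the k examples closest in input length.\"\"\"\n",
"\n",
"    def __init__(self, examples, k=2):\n",
"        self.examples = list(examples)\n",
"        self.k = k\n",
"\n",
"    def add_example(self, example):\n",
"        self.examples.append(example)\n",
"\n",
"    def select_examples(self, input_variables):\n",
"        target = len(input_variables[\"input\"])\n",
"        # sorted() is stable, so ties keep their insertion order\n",
"        ranked = sorted(\n",
"            self.examples, key=lambda ex: abs(len(ex[\"input\"]) - target)\n",
"        )\n",
"        return ranked[: self.k]\n",
"\n",
"\n",
"selector = TopKLengthSelector(\n",
"    [\n",
"        {\"input\": \"hi\", \"output\": \"ciao\"},\n",
"        {\"input\": \"bye\", \"output\": \"arrivederci\"},\n",
"        {\"input\": \"soccer\", \"output\": \"calcio\"},\n",
"    ],\n",
"    k=2,\n",
")\n",
"closest = selector.select_examples({\"input\": \"okay\"})\n",
"```\n",
"\n",
"With `k=2` and the input `okay`, this picks `bye` (length difference 1) first, then `hi`, which ties with `soccer` at a difference of 2 but was added earlier."
]
},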
{
"cell_type": "markdown",
"id": "786c920c",
"metadata": {},
"source": [
"## Use in a Prompt\n",
"\n",
"We can now use this example selector in a prompt."
]
},
{
"cell_type": "code",
"execution_count": 42,
"id": "619090e2",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts.few_shot import FewShotPromptTemplate\n",
"from langchain_core.prompts.prompt import PromptTemplate\n",
"\n",
"example_prompt = PromptTemplate.from_template(\"Input: {input} -> Output: {output}\")"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "5934c415",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Translate the following words from English to Italian:\n",
"\n",
"Input: hand -> Output: mano\n",
"\n",
"Input: word -> Output:\n"
]
}
],
"source": [
"prompt = FewShotPromptTemplate(\n",
"    example_selector=example_selector,\n",
"    example_prompt=example_prompt,\n",
"    suffix=\"Input: {input} -> Output:\",\n",
"    prefix=\"Translate the following words from English to Italian:\",\n",
"    input_variables=[\"input\"],\n",
")\n",
"\n",
"print(prompt.format(input=\"word\"))"
]
},
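{
"cell_type": "markdown",
"id": "9c41e6b2",
"metadata": {},
"source": [
"To make the printed prompt easier to read, here is a rough plain-Python sketch of how the pieces fit together: the prefix, each selected example rendered with the example template, and the suffix, joined by blank lines. The helper `render_few_shot` and its `separator` parameter are hypothetical and only mimic the layout shown above; this is not how `FewShotPromptTemplate` is implemented.\n",
"\n",
"```python\n",
"def render_few_shot(\n",
"    prefix, examples, example_template, suffix, separator=\"\\n\\n\", **inputs\n",
"):\n",
"    \"\"\"Hypothetical helper that mimics the layout of the few-shot prompt.\"\"\"\n",
"    # Render every selected example with the per-example template\n",
"    rendered = [example_template.format(**ex) for ex in examples]\n",
"    # The suffix is the only part that receives the user's input\n",
"    pieces = [prefix, *rendered, suffix.format(**inputs)]\n",
"    return separator.join(pieces)\n",
"\n",
"\n",
"text = render_few_shot(\n",
"    prefix=\"Translate the following words from English to Italian:\",\n",
"    examples=[{\"input\": \"hand\", \"output\": \"mano\"}],\n",
"    example_template=\"Input: {input} -> Output: {output}\",\n",
"    suffix=\"Input: {input} -> Output:\",\n",
"    input=\"word\",\n",
")\n",
"print(text)\n",
"```\n",
"\n",
"Passing the result of `example_selector.select_examples(...)` as `examples` gives the same flow as the template: select first, then format."
]
},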
{
"cell_type": "markdown",
"id": "e767f69d",
"metadata": {},
"source": [
"## Example Selector Types\n",
"\n",
"| Name | Description |\n",
"|------------|---------------------------------------------------------------------------------------------|\n",
"| Similarity | Uses semantic similarity between inputs and examples to decide which examples to choose. |\n",
"| MMR | Uses Max Marginal Relevance between inputs and examples to decide which examples to choose. |\n",
"| Length | Selects examples based on how many can fit within a certain length. |\n",
"| Ngram | Uses ngram overlap between inputs and examples to decide which examples to choose. |"
]
},
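{
"cell_type": "markdown",
"id": "d05b7a3f",
"metadata": {},
"source": [
"As a rough illustration of the `Length` strategy in the table, the sketch below keeps examples in order until a word budget is used up. The function `select_by_length` and its `max_words` parameter are hypothetical; LangChain's own length-based selector may measure length differently.\n",
"\n",
"```python\n",
"def select_by_length(examples, input_text, max_words=12):\n",
"    \"\"\"Hypothetical sketch: keep examples, in order, while they fit the budget.\"\"\"\n",
"    # The input itself uses part of the budget\n",
"    remaining = max_words - len(input_text.split())\n",
"    selected = []\n",
"    for ex in examples:\n",
"        cost = len(f\"{ex['input']} {ex['output']}\".split())\n",
"        if cost > remaining:\n",
"            break\n",
"        selected.append(ex)\n",
"        remaining -= cost\n",
"    return selected\n",
"\n",
"\n",
"demo_examples = [\n",
"    {\"input\": \"hi\", \"output\": \"ciao\"},\n",
"    {\"input\": \"good morning\", \"output\": \"buongiorno\"},\n",
"    {\"input\": \"see you later\", \"output\": \"a dopo\"},\n",
"]\n",
"short = select_by_length(demo_examples, \"thanks\")\n",
"long = select_by_length(demo_examples, \"how do you say thank you very much\")\n",
"```\n",
"\n",
"A short input leaves room for all three examples, while the longer input leaves room for only the first."
]
},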
{
"cell_type": "code",
"execution_count": null,
"id": "8a6e0abe",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}