mirror of
https://github.com/hwchase17/langchain.git
synced 2026-02-21 14:43:07 +00:00
{
 "cells": [
  {
   "cell_type": "raw",
   "id": "af408f61",
   "metadata": {},
   "source": [
    "---\n",
    "sidebar_position: 1\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1a65e4c9",
   "metadata": {},
   "source": [
    "# How to use example selectors\n",
    "\n",
    "If you have a large number of examples, you may need to select which ones to include in the prompt. An example selector is the class responsible for making that selection.\n",
    "\n",
    "The base interface is defined as follows:\n",
    "\n",
    "```python\n",
    "from abc import ABC, abstractmethod\n",
    "from typing import Any, Dict, List\n",
    "\n",
    "\n",
    "class BaseExampleSelector(ABC):\n",
    "    \"\"\"Interface for selecting examples to include in prompts.\"\"\"\n",
    "\n",
    "    @abstractmethod\n",
    "    def select_examples(self, input_variables: Dict[str, str]) -> List[dict]:\n",
    "        \"\"\"Select which examples to use based on the inputs.\"\"\"\n",
    "\n",
    "    @abstractmethod\n",
    "    def add_example(self, example: Dict[str, str]) -> Any:\n",
    "        \"\"\"Add new example to store.\"\"\"\n",
    "```\n",
    "\n",
    "An implementation must define ``select_examples``, which takes the input variables and returns a list of examples. How those examples are selected is up to each implementation. The interface also declares ``add_example``, which adds a new example to the selector's store.\n",
    "\n",
    "LangChain has a few different types of example selectors. For an overview of all these types, see the table at the end of this guide.\n",
    "\n",
    "In this guide, we will walk through creating a custom example selector."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "638e9039",
   "metadata": {},
   "source": [
    "## Examples\n",
    "\n",
    "To use an example selector, we first need a list of examples. These should generally be pairs of example inputs and outputs. For this demo, let's imagine we are selecting examples of how to translate English to Italian."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "48658d53",
   "metadata": {},
   "outputs": [],
   "source": [
    "examples = [\n",
    "    {\"input\": \"hi\", \"output\": \"ciao\"},\n",
    "    {\"input\": \"bye\", \"output\": \"arrivederci\"},\n",
    "    {\"input\": \"soccer\", \"output\": \"calcio\"},\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c2830b49",
   "metadata": {},
   "source": [
    "## Custom Example Selector\n",
    "\n",
    "Let's write an example selector that picks the example whose input is closest in length to the input word."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "56b740a1",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_core.example_selectors.base import BaseExampleSelector\n",
    "\n",
    "\n",
    "class CustomExampleSelector(BaseExampleSelector):\n",
    "    def __init__(self, examples):\n",
    "        self.examples = examples\n",
    "\n",
    "    def add_example(self, example):\n",
    "        self.examples.append(example)\n",
    "\n",
    "    def select_examples(self, input_variables):\n",
    "        # This assumes the input variables contain an 'input' key\n",
    "        new_word = input_variables[\"input\"]\n",
    "        new_word_length = len(new_word)\n",
    "\n",
    "        # Initialize variables to store the best match and its length difference\n",
    "        best_match = None\n",
    "        smallest_diff = float(\"inf\")\n",
    "\n",
    "        # Iterate through each example\n",
    "        for example in self.examples:\n",
    "            # Calculate the length difference between the example's input and the new word\n",
    "            current_diff = abs(len(example[\"input\"]) - new_word_length)\n",
    "\n",
    "            # Update the best match if the current one is closer in length\n",
    "            if current_diff < smallest_diff:\n",
    "                smallest_diff = current_diff\n",
    "                best_match = example\n",
    "\n",
    "        return [best_match]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "ce928187",
   "metadata": {},
   "outputs": [],
   "source": [
    "example_selector = CustomExampleSelector(examples)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "37ef3149",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'input': 'bye', 'output': 'arrivederci'}]"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "example_selector.select_examples({\"input\": \"okay\"})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "c5ad9f35",
   "metadata": {},
   "outputs": [],
   "source": [
    "example_selector.add_example({\"input\": \"hand\", \"output\": \"mano\"})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "id": "e4127fe0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'input': 'hand', 'output': 'mano'}]"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "example_selector.select_examples({\"input\": \"okay\"})"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "786c920c",
   "metadata": {},
   "source": [
    "## Use in a Prompt\n",
    "\n",
    "We can now use this example selector in a prompt."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "id": "619090e2",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_core.prompts.few_shot import FewShotPromptTemplate\n",
    "from langchain_core.prompts.prompt import PromptTemplate\n",
    "\n",
    "example_prompt = PromptTemplate.from_template(\"Input: {input} -> Output: {output}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "id": "5934c415",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Translate the following words from English to Italian:\n",
      "\n",
      "Input: hand -> Output: mano\n",
      "\n",
      "Input: word -> Output:\n"
     ]
    }
   ],
   "source": [
    "prompt = FewShotPromptTemplate(\n",
    "    example_selector=example_selector,\n",
    "    example_prompt=example_prompt,\n",
    "    suffix=\"Input: {input} -> Output:\",\n",
    "    prefix=\"Translate the following words from English to Italian:\",\n",
    "    input_variables=[\"input\"],\n",
    ")\n",
    "\n",
    "print(prompt.format(input=\"word\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e767f69d",
   "metadata": {},
   "source": [
    "## Example Selector Types\n",
    "\n",
    "| Name | Description |\n",
    "|------------|---------------------------------------------------------------------------------------------|\n",
    "| Similarity | Uses semantic similarity between inputs and examples to decide which examples to choose. |\n",
    "| MMR | Uses Max Marginal Relevance between inputs and examples to decide which examples to choose. |\n",
    "| Length | Selects examples based on how many can fit within a certain length. |\n",
    "| Ngram | Uses ngram overlap between inputs and examples to decide which examples to choose. |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8a6e0abe",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}