mirror of
https://github.com/hwchase17/langchain.git
synced 2025-09-06 13:33:37 +00:00
big docs refactor (#1978)
Co-authored-by: Ankush Gola <ankush.gola@gmail.com>
This commit is contained in:
128
docs/modules/prompts/chat_prompt_template.ipynb
Normal file
128
docs/modules/prompts/chat_prompt_template.ipynb
Normal file
@@ -0,0 +1,128 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "6488fdaf",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Chat Prompt Template\n",
|
||||
"\n",
|
||||
"Chat Models takes a list of chat messages as input - this list commonly referred to as a prompt.\n",
|
||||
"Typically this is not simply a hardcoded list of messages but rather a combination of a template, some examples, and user input.\n",
|
||||
"LangChain provides several classes and functions to make constructing and working with prompts easy.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "7647a621",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts import (\n",
|
||||
" ChatPromptTemplate,\n",
|
||||
" PromptTemplate,\n",
|
||||
" SystemMessagePromptTemplate,\n",
|
||||
" AIMessagePromptTemplate,\n",
|
||||
" HumanMessagePromptTemplate,\n",
|
||||
")\n",
|
||||
"from langchain.schema import (\n",
|
||||
" AIMessage,\n",
|
||||
" HumanMessage,\n",
|
||||
" SystemMessage\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "acb4a2f6",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"You can make use of templating by using a `MessagePromptTemplate`. You can build a `ChatPromptTemplate` from one or more `MessagePromptTemplates`. You can use `ChatPromptTemplate`'s `format_prompt` -- this returns a `PromptValue`, which you can convert to a string or Message object, depending on whether you want to use the formatted value as input to an llm or chat model.\n",
|
||||
"\n",
|
||||
"For convience, there is a `from_template` method exposed on the template. If you were to use this template, this is what it would look like:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "3124f5e9",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"template=\"You are a helpful assistant that translates {input_language} to {output_language}.\"\n",
|
||||
"system_message_prompt = SystemMessagePromptTemplate.from_template(template)\n",
|
||||
"human_template=\"{text}\"\n",
|
||||
"human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "9c7e2e6f",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"[SystemMessage(content='You are a helpful assistant that translates English to French.', additional_kwargs={}),\n",
|
||||
" HumanMessage(content='I love programming.', additional_kwargs={})]"
|
||||
]
|
||||
},
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])\n",
|
||||
"\n",
|
||||
"# get a chat completion from the formatted messages\n",
|
||||
"chat_prompt.format_prompt(input_language=\"English\", output_language=\"French\", text=\"I love programming.\").to_messages()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "0dbdf94f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"If you wanted to construct the MessagePromptTemplate more directly, you could create a PromptTemplate outside and then pass it in, eg:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"id": "5a8d249e",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"prompt=PromptTemplate(\n",
|
||||
" template=\"You are a helpful assistant that translates {input_language} to {output_language}.\",\n",
|
||||
" input_variables=[\"input_language\", \"output_language\"],\n",
|
||||
")\n",
|
||||
"system_message_prompt = SystemMessagePromptTemplate(prompt=prompt)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
29
docs/modules/prompts/example_selectors.rst
Normal file
29
docs/modules/prompts/example_selectors.rst
Normal file
@@ -0,0 +1,29 @@
|
||||
Example Selectors
|
||||
==========================
|
||||
|
||||
.. note::
|
||||
`Conceptual Guide <https://docs.langchain.com/docs/components/prompts/example-selectors>`_
|
||||
|
||||
|
||||
If you have a large number of examples, you may need to select which ones to include in the prompt. The ExampleSelector is the class responsible for doing so.
|
||||
|
||||
The base interface is defined as below::
|
||||
|
||||
class BaseExampleSelector(ABC):
|
||||
"""Interface for selecting examples to include in prompts."""
|
||||
|
||||
@abstractmethod
|
||||
def select_examples(self, input_variables: Dict[str, str]) -> List[dict]:
|
||||
"""Select which examples to use based on the inputs."""
|
||||
|
||||
|
||||
The only method it needs to expose is a ``select_examples`` method. This takes in the input variables and then returns a list of examples. It is up to each specific implementation as to how those examples are selected. Let's take a look at some below.
|
||||
|
||||
See below for a list of example selectors.
|
||||
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
:glob:
|
||||
|
||||
./example_selectors/examples/*
|
@@ -1,4 +1,4 @@
|
||||
# Create a custom example selector
|
||||
# How to create a custom example selector
|
||||
|
||||
In this tutorial, we'll create a custom example selector that selects every alternate example from a given list of examples.
|
||||
|
||||
@@ -10,7 +10,7 @@ An `ExampleSelector` must implement two methods:
|
||||
Let's implement a custom `ExampleSelector` that just selects two examples at random.
|
||||
|
||||
:::{note}
|
||||
Take a look at the current set of example selector implementations supported in LangChain [here](../getting_started.md).
|
||||
Take a look at the current set of example selector implementations supported in LangChain [here](../../prompt_templates/getting_started.md).
|
||||
:::
|
||||
|
||||
<!-- TODO(shreya): Add the correct link. -->
|
@@ -0,0 +1,211 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "861a4d1f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# LengthBased ExampleSelector\n",
|
||||
"\n",
|
||||
"This ExampleSelector selects which examples to use based on length. This is useful when you are worried about constructing a prompt that will go over the length of the context window. For longer inputs, it will select fewer examples to include, while for shorter inputs it will select more.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "7c469c95",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts import PromptTemplate\n",
|
||||
"from langchain.prompts import FewShotPromptTemplate\n",
|
||||
"from langchain.prompts.example_selector import LengthBasedExampleSelector"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "0ec6d950",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# These are a lot of examples of a pretend task of creating antonyms.\n",
|
||||
"examples = [\n",
|
||||
" {\"input\": \"happy\", \"output\": \"sad\"},\n",
|
||||
" {\"input\": \"tall\", \"output\": \"short\"},\n",
|
||||
" {\"input\": \"energetic\", \"output\": \"lethargic\"},\n",
|
||||
" {\"input\": \"sunny\", \"output\": \"gloomy\"},\n",
|
||||
" {\"input\": \"windy\", \"output\": \"calm\"},\n",
|
||||
"]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "207e55f7",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"example_prompt = PromptTemplate(\n",
|
||||
" input_variables=[\"input\", \"output\"],\n",
|
||||
" template=\"Input: {input}\\nOutput: {output}\",\n",
|
||||
")\n",
|
||||
"example_selector = LengthBasedExampleSelector(\n",
|
||||
" # These are the examples it has available to choose from.\n",
|
||||
" examples=examples, \n",
|
||||
" # This is the PromptTemplate being used to format the examples.\n",
|
||||
" example_prompt=example_prompt, \n",
|
||||
" # This is the maximum length that the formatted examples should be.\n",
|
||||
" # Length is measured by the get_text_length function below.\n",
|
||||
" max_length=25,\n",
|
||||
" # This is the function used to get the length of a string, which is used\n",
|
||||
" # to determine which examples to include. It is commented out because\n",
|
||||
" # it is provided as a default value if none is specified.\n",
|
||||
" # get_text_length: Callable[[str], int] = lambda x: len(re.split(\"\\n| \", x))\n",
|
||||
")\n",
|
||||
"dynamic_prompt = FewShotPromptTemplate(\n",
|
||||
" # We provide an ExampleSelector instead of examples.\n",
|
||||
" example_selector=example_selector,\n",
|
||||
" example_prompt=example_prompt,\n",
|
||||
" prefix=\"Give the antonym of every input\",\n",
|
||||
" suffix=\"Input: {adjective}\\nOutput:\", \n",
|
||||
" input_variables=[\"adjective\"],\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "d00b4385",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: tall\n",
|
||||
"Output: short\n",
|
||||
"\n",
|
||||
"Input: energetic\n",
|
||||
"Output: lethargic\n",
|
||||
"\n",
|
||||
"Input: sunny\n",
|
||||
"Output: gloomy\n",
|
||||
"\n",
|
||||
"Input: windy\n",
|
||||
"Output: calm\n",
|
||||
"\n",
|
||||
"Input: big\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# An example with small input, so it selects all examples.\n",
|
||||
"print(dynamic_prompt.format(adjective=\"big\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "878bcde9",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: big and huge and massive and large and gigantic and tall and much much much much much bigger than everything else\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# An example with long input, so it selects only one example.\n",
|
||||
"long_string = \"big and huge and massive and large and gigantic and tall and much much much much much bigger than everything else\"\n",
|
||||
"print(dynamic_prompt.format(adjective=long_string))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "e4bebcd9",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: tall\n",
|
||||
"Output: short\n",
|
||||
"\n",
|
||||
"Input: energetic\n",
|
||||
"Output: lethargic\n",
|
||||
"\n",
|
||||
"Input: sunny\n",
|
||||
"Output: gloomy\n",
|
||||
"\n",
|
||||
"Input: windy\n",
|
||||
"Output: calm\n",
|
||||
"\n",
|
||||
"Input: big\n",
|
||||
"Output: small\n",
|
||||
"\n",
|
||||
"Input: enthusiastic\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# You can add an example to an example selector as well.\n",
|
||||
"new_example = {\"input\": \"big\", \"output\": \"small\"}\n",
|
||||
"dynamic_prompt.example_selector.add_example(new_example)\n",
|
||||
"print(dynamic_prompt.format(adjective=\"enthusiastic\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "39f30097",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
162
docs/modules/prompts/example_selectors/examples/mmr.ipynb
Normal file
162
docs/modules/prompts/example_selectors/examples/mmr.ipynb
Normal file
@@ -0,0 +1,162 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "bc35afd0",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Maximal Marginal Relevance ExampleSelector\n",
|
||||
"\n",
|
||||
"The MaxMarginalRelevanceExampleSelector selects examples based on a combination of which examples are most similar to the inputs, while also optimizing for diversity. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs, and then iteratively adding them while penalizing them for closeness to already selected examples.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "ac95c968",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts.example_selector import MaxMarginalRelevanceExampleSelector\n",
|
||||
"from langchain.vectorstores import FAISS\n",
|
||||
"from langchain.embeddings import OpenAIEmbeddings\n",
|
||||
"from langchain.prompts import FewShotPromptTemplate, PromptTemplate\n",
|
||||
"\n",
|
||||
"example_prompt = PromptTemplate(\n",
|
||||
" input_variables=[\"input\", \"output\"],\n",
|
||||
" template=\"Input: {input}\\nOutput: {output}\",\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"# These are a lot of examples of a pretend task of creating antonyms.\n",
|
||||
"examples = [\n",
|
||||
" {\"input\": \"happy\", \"output\": \"sad\"},\n",
|
||||
" {\"input\": \"tall\", \"output\": \"short\"},\n",
|
||||
" {\"input\": \"energetic\", \"output\": \"lethargic\"},\n",
|
||||
" {\"input\": \"sunny\", \"output\": \"gloomy\"},\n",
|
||||
" {\"input\": \"windy\", \"output\": \"calm\"},\n",
|
||||
"]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "db579bea",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"example_selector = MaxMarginalRelevanceExampleSelector.from_examples(\n",
|
||||
" # This is the list of examples available to select from.\n",
|
||||
" examples, \n",
|
||||
" # This is the embedding class used to produce embeddings which are used to measure semantic similarity.\n",
|
||||
" OpenAIEmbeddings(), \n",
|
||||
" # This is the VectorStore class that is used to store the embeddings and do a similarity search over.\n",
|
||||
" FAISS, \n",
|
||||
" # This is the number of examples to produce.\n",
|
||||
" k=2\n",
|
||||
")\n",
|
||||
"mmr_prompt = FewShotPromptTemplate(\n",
|
||||
" # We provide an ExampleSelector instead of examples.\n",
|
||||
" example_selector=example_selector,\n",
|
||||
" example_prompt=example_prompt,\n",
|
||||
" prefix=\"Give the antonym of every input\",\n",
|
||||
" suffix=\"Input: {adjective}\\nOutput:\", \n",
|
||||
" input_variables=[\"adjective\"],\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "cd76e344",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: windy\n",
|
||||
"Output: calm\n",
|
||||
"\n",
|
||||
"Input: worried\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Input is a feeling, so should select the happy/sad example as the first one\n",
|
||||
"print(mmr_prompt.format(adjective=\"worried\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "cf82956b",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: windy\n",
|
||||
"Output: calm\n",
|
||||
"\n",
|
||||
"Input: worried\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Let's compare this to what we would just get if we went solely off of similarity\n",
|
||||
"similar_prompt = FewShotPromptTemplate(\n",
|
||||
" # We provide an ExampleSelector instead of examples.\n",
|
||||
" example_selector=example_selector,\n",
|
||||
" example_prompt=example_prompt,\n",
|
||||
" prefix=\"Give the antonym of every input\",\n",
|
||||
" suffix=\"Input: {adjective}\\nOutput:\", \n",
|
||||
" input_variables=[\"adjective\"],\n",
|
||||
")\n",
|
||||
"similar_prompt.example_selector.k = 2\n",
|
||||
"print(similar_prompt.format(adjective=\"worried\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "39f30097",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
@@ -0,0 +1,280 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "4aaeed2f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# NGram Overlap ExampleSelector\n",
|
||||
"\n",
|
||||
"The NGramOverlapExampleSelector selects and orders examples based on which examples are most similar to the input, according to an ngram overlap score. The ngram overlap score is a float between 0.0 and 1.0, inclusive. \n",
|
||||
"\n",
|
||||
"The selector allows for a threshold score to be set. Examples with an ngram overlap score less than or equal to the threshold are excluded. The threshold is set to -1.0, by default, so will not exclude any examples, only reorder them. Setting the threshold to 0.0 will exclude examples that have no ngram overlaps with the input.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "9cbc0acc",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts import PromptTemplate\n",
|
||||
"from langchain.prompts.example_selector.ngram_overlap import NGramOverlapExampleSelector\n",
|
||||
"from langchain.prompts import FewShotPromptTemplate, PromptTemplate\n",
|
||||
"\n",
|
||||
"example_prompt = PromptTemplate(\n",
|
||||
" input_variables=[\"input\", \"output\"],\n",
|
||||
" template=\"Input: {input}\\nOutput: {output}\",\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"# These are a lot of examples of a pretend task of creating antonyms.\n",
|
||||
"examples = [\n",
|
||||
" {\"input\": \"happy\", \"output\": \"sad\"},\n",
|
||||
" {\"input\": \"tall\", \"output\": \"short\"},\n",
|
||||
" {\"input\": \"energetic\", \"output\": \"lethargic\"},\n",
|
||||
" {\"input\": \"sunny\", \"output\": \"gloomy\"},\n",
|
||||
" {\"input\": \"windy\", \"output\": \"calm\"},\n",
|
||||
"]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "4f318f4b",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# These are examples of a fictional translation task.\n",
|
||||
"examples = [\n",
|
||||
" {\"input\": \"See Spot run.\", \"output\": \"Ver correr a Spot.\"},\n",
|
||||
" {\"input\": \"My dog barks.\", \"output\": \"Mi perro ladra.\"},\n",
|
||||
" {\"input\": \"Spot can run.\", \"output\": \"Spot puede correr.\"},\n",
|
||||
"]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "bf75e0fe",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"example_prompt = PromptTemplate(\n",
|
||||
" input_variables=[\"input\", \"output\"],\n",
|
||||
" template=\"Input: {input}\\nOutput: {output}\",\n",
|
||||
")\n",
|
||||
"example_selector = NGramOverlapExampleSelector(\n",
|
||||
" # These are the examples it has available to choose from.\n",
|
||||
" examples=examples, \n",
|
||||
" # This is the PromptTemplate being used to format the examples.\n",
|
||||
" example_prompt=example_prompt, \n",
|
||||
" # This is the threshold, at which selector stops.\n",
|
||||
" # It is set to -1.0 by default.\n",
|
||||
" threshold=-1.0,\n",
|
||||
" # For negative threshold:\n",
|
||||
" # Selector sorts examples by ngram overlap score, and excludes none.\n",
|
||||
" # For threshold greater than 1.0:\n",
|
||||
" # Selector excludes all examples, and returns an empty list.\n",
|
||||
" # For threshold equal to 0.0:\n",
|
||||
" # Selector sorts examples by ngram overlap score,\n",
|
||||
" # and excludes those with no ngram overlap with input.\n",
|
||||
")\n",
|
||||
"dynamic_prompt = FewShotPromptTemplate(\n",
|
||||
" # We provide an ExampleSelector instead of examples.\n",
|
||||
" example_selector=example_selector,\n",
|
||||
" example_prompt=example_prompt,\n",
|
||||
" prefix=\"Give the Spanish translation of every input\",\n",
|
||||
" suffix=\"Input: {sentence}\\nOutput:\", \n",
|
||||
" input_variables=[\"sentence\"],\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "83fb218a",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the Spanish translation of every input\n",
|
||||
"\n",
|
||||
"Input: Spot can run.\n",
|
||||
"Output: Spot puede correr.\n",
|
||||
"\n",
|
||||
"Input: See Spot run.\n",
|
||||
"Output: Ver correr a Spot.\n",
|
||||
"\n",
|
||||
"Input: My dog barks.\n",
|
||||
"Output: Mi perro ladra.\n",
|
||||
"\n",
|
||||
"Input: Spot can run fast.\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# An example input with large ngram overlap with \"Spot can run.\"\n",
|
||||
"# and no overlap with \"My dog barks.\"\n",
|
||||
"print(dynamic_prompt.format(sentence=\"Spot can run fast.\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "485f5307",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the Spanish translation of every input\n",
|
||||
"\n",
|
||||
"Input: Spot can run.\n",
|
||||
"Output: Spot puede correr.\n",
|
||||
"\n",
|
||||
"Input: See Spot run.\n",
|
||||
"Output: Ver correr a Spot.\n",
|
||||
"\n",
|
||||
"Input: Spot plays fetch.\n",
|
||||
"Output: Spot juega a buscar.\n",
|
||||
"\n",
|
||||
"Input: My dog barks.\n",
|
||||
"Output: Mi perro ladra.\n",
|
||||
"\n",
|
||||
"Input: Spot can run fast.\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# You can add examples to NGramOverlapExampleSelector as well.\n",
|
||||
"new_example = {\"input\": \"Spot plays fetch.\", \"output\": \"Spot juega a buscar.\"}\n",
|
||||
"\n",
|
||||
"example_selector.add_example(new_example)\n",
|
||||
"print(dynamic_prompt.format(sentence=\"Spot can run fast.\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "606ce697",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the Spanish translation of every input\n",
|
||||
"\n",
|
||||
"Input: Spot can run.\n",
|
||||
"Output: Spot puede correr.\n",
|
||||
"\n",
|
||||
"Input: See Spot run.\n",
|
||||
"Output: Ver correr a Spot.\n",
|
||||
"\n",
|
||||
"Input: Spot plays fetch.\n",
|
||||
"Output: Spot juega a buscar.\n",
|
||||
"\n",
|
||||
"Input: Spot can run fast.\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# You can set a threshold at which examples are excluded.\n",
|
||||
"# For example, setting threshold equal to 0.0\n",
|
||||
"# excludes examples with no ngram overlaps with input.\n",
|
||||
"# Since \"My dog barks.\" has no ngram overlaps with \"Spot can run fast.\"\n",
|
||||
"# it is excluded.\n",
|
||||
"example_selector.threshold=0.0\n",
|
||||
"print(dynamic_prompt.format(sentence=\"Spot can run fast.\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "7f8d72f7",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the Spanish translation of every input\n",
|
||||
"\n",
|
||||
"Input: Spot can run.\n",
|
||||
"Output: Spot puede correr.\n",
|
||||
"\n",
|
||||
"Input: Spot plays fetch.\n",
|
||||
"Output: Spot juega a buscar.\n",
|
||||
"\n",
|
||||
"Input: Spot can play fetch.\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Setting small nonzero threshold\n",
|
||||
"example_selector.threshold=0.09\n",
|
||||
"print(dynamic_prompt.format(sentence=\"Spot can play fetch.\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"id": "09633aa8",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the Spanish translation of every input\n",
|
||||
"\n",
|
||||
"Input: Spot can play fetch.\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Setting threshold greater than 1.0\n",
|
||||
"example_selector.threshold=1.0+1e-9\n",
|
||||
"print(dynamic_prompt.format(sentence=\"Spot can play fetch.\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "39f30097",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
184
docs/modules/prompts/example_selectors/examples/similarity.ipynb
Normal file
184
docs/modules/prompts/example_selectors/examples/similarity.ipynb
Normal file
@@ -0,0 +1,184 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "2d007b0a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Similarity ExampleSelector\n",
|
||||
"\n",
|
||||
"The SemanticSimilarityExampleSelector selects examples based on which examples are most similar to the inputs. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "241bfe80",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts.example_selector import SemanticSimilarityExampleSelector\n",
|
||||
"from langchain.vectorstores import Chroma\n",
|
||||
"from langchain.embeddings import OpenAIEmbeddings\n",
|
||||
"from langchain.prompts import FewShotPromptTemplate, PromptTemplate\n",
|
||||
"\n",
|
||||
"example_prompt = PromptTemplate(\n",
|
||||
" input_variables=[\"input\", \"output\"],\n",
|
||||
" template=\"Input: {input}\\nOutput: {output}\",\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"# These are a lot of examples of a pretend task of creating antonyms.\n",
|
||||
"examples = [\n",
|
||||
" {\"input\": \"happy\", \"output\": \"sad\"},\n",
|
||||
" {\"input\": \"tall\", \"output\": \"short\"},\n",
|
||||
" {\"input\": \"energetic\", \"output\": \"lethargic\"},\n",
|
||||
" {\"input\": \"sunny\", \"output\": \"gloomy\"},\n",
|
||||
" {\"input\": \"windy\", \"output\": \"calm\"},\n",
|
||||
"]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "50d0a701",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Running Chroma using direct local API.\n",
|
||||
"Using DuckDB in-memory for database. Data will be transient.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"example_selector = SemanticSimilarityExampleSelector.from_examples(\n",
|
||||
" # This is the list of examples available to select from.\n",
|
||||
" examples, \n",
|
||||
" # This is the embedding class used to produce embeddings which are used to measure semantic similarity.\n",
|
||||
" OpenAIEmbeddings(), \n",
|
||||
" # This is the VectorStore class that is used to store the embeddings and do a similarity search over.\n",
|
||||
" Chroma, \n",
|
||||
" # This is the number of examples to produce.\n",
|
||||
" k=1\n",
|
||||
")\n",
|
||||
"similar_prompt = FewShotPromptTemplate(\n",
|
||||
" # We provide an ExampleSelector instead of examples.\n",
|
||||
" example_selector=example_selector,\n",
|
||||
" example_prompt=example_prompt,\n",
|
||||
" prefix=\"Give the antonym of every input\",\n",
|
||||
" suffix=\"Input: {adjective}\\nOutput:\", \n",
|
||||
" input_variables=[\"adjective\"],\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "4c8fdf45",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: worried\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Input is a feeling, so should select the happy/sad example\n",
|
||||
"print(similar_prompt.format(adjective=\"worried\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"id": "829af21a",
|
||||
"metadata": {
|
||||
"scrolled": true
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: fat\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Input is a measurement, so should select the tall/short example\n",
|
||||
"print(similar_prompt.format(adjective=\"fat\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"id": "3c16fe23",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: joyful\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# You can add new examples to the SemanticSimilarityExampleSelector as well\n",
|
||||
"similar_prompt.example_selector.add_example({\"input\": \"enthusiastic\", \"output\": \"apathetic\"})\n",
|
||||
"print(similar_prompt.format(adjective=\"joyful\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "39f30097",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
@@ -1,711 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "bf038596",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Example Selectors\n",
|
||||
"If you have a large number of examples, you may need to select which ones to include in the prompt. The ExampleSelector is the class responsible for doing so. The base interface is defined as below.\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"class BaseExampleSelector(ABC):\n",
|
||||
" \"\"\"Interface for selecting examples to include in prompts.\"\"\"\n",
|
||||
"\n",
|
||||
" @abstractmethod\n",
|
||||
" def select_examples(self, input_variables: Dict[str, str]) -> List[dict]:\n",
|
||||
" \"\"\"Select which examples to use based on the inputs.\"\"\"\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"The only method it needs to expose is a `select_examples` method. This takes in the input variables and then returns a list of examples. It is up to each specific implementation as to how those examples are selected. Let's take a look at some below."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "8244ff60",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts import FewShotPromptTemplate"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "861a4d1f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## LengthBased ExampleSelector\n",
|
||||
"\n",
|
||||
"This ExampleSelector selects which examples to use based on length. This is useful when you are worried about constructing a prompt that will go over the length of the context window. For longer inputs, it will select fewer examples to include, while for shorter inputs it will select more.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "7c469c95",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts import PromptTemplate\n",
|
||||
"from langchain.prompts.example_selector import LengthBasedExampleSelector"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "0ec6d950",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# These are a lot of examples of a pretend task of creating antonyms.\n",
|
||||
"examples = [\n",
|
||||
" {\"input\": \"happy\", \"output\": \"sad\"},\n",
|
||||
" {\"input\": \"tall\", \"output\": \"short\"},\n",
|
||||
" {\"input\": \"energetic\", \"output\": \"lethargic\"},\n",
|
||||
" {\"input\": \"sunny\", \"output\": \"gloomy\"},\n",
|
||||
" {\"input\": \"windy\", \"output\": \"calm\"},\n",
|
||||
"]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "207e55f7",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"example_prompt = PromptTemplate(\n",
|
||||
" input_variables=[\"input\", \"output\"],\n",
|
||||
" template=\"Input: {input}\\nOutput: {output}\",\n",
|
||||
")\n",
|
||||
"example_selector = LengthBasedExampleSelector(\n",
|
||||
" # These are the examples it has available to choose from.\n",
|
||||
" examples=examples, \n",
|
||||
" # This is the PromptTemplate being used to format the examples.\n",
|
||||
" example_prompt=example_prompt, \n",
|
||||
" # This is the maximum length that the formatted examples should be.\n",
|
||||
" # Length is measured by the get_text_length function below.\n",
|
||||
" max_length=25,\n",
|
||||
" # This is the function used to get the length of a string, which is used\n",
|
||||
" # to determine which examples to include. It is commented out because\n",
|
||||
" # it is provided as a default value if none is specified.\n",
|
||||
" # get_text_length: Callable[[str], int] = lambda x: len(re.split(\"\\n| \", x))\n",
|
||||
")\n",
|
||||
"dynamic_prompt = FewShotPromptTemplate(\n",
|
||||
" # We provide an ExampleSelector instead of examples.\n",
|
||||
" example_selector=example_selector,\n",
|
||||
" example_prompt=example_prompt,\n",
|
||||
" prefix=\"Give the antonym of every input\",\n",
|
||||
" suffix=\"Input: {adjective}\\nOutput:\", \n",
|
||||
" input_variables=[\"adjective\"],\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "d00b4385",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: tall\n",
|
||||
"Output: short\n",
|
||||
"\n",
|
||||
"Input: energetic\n",
|
||||
"Output: lethargic\n",
|
||||
"\n",
|
||||
"Input: sunny\n",
|
||||
"Output: gloomy\n",
|
||||
"\n",
|
||||
"Input: windy\n",
|
||||
"Output: calm\n",
|
||||
"\n",
|
||||
"Input: big\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# An example with small input, so it selects all examples.\n",
|
||||
"print(dynamic_prompt.format(adjective=\"big\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"id": "878bcde9",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: big and huge and massive and large and gigantic and tall and much much much much much bigger than everything else\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# An example with long input, so it selects only one example.\n",
|
||||
"long_string = \"big and huge and massive and large and gigantic and tall and much much much much much bigger than everything else\"\n",
|
||||
"print(dynamic_prompt.format(adjective=long_string))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"id": "e4bebcd9",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: tall\n",
|
||||
"Output: short\n",
|
||||
"\n",
|
||||
"Input: energetic\n",
|
||||
"Output: lethargic\n",
|
||||
"\n",
|
||||
"Input: sunny\n",
|
||||
"Output: gloomy\n",
|
||||
"\n",
|
||||
"Input: windy\n",
|
||||
"Output: calm\n",
|
||||
"\n",
|
||||
"Input: big\n",
|
||||
"Output: small\n",
|
||||
"\n",
|
||||
"Input: enthusiastic\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# You can add an example to an example selector as well.\n",
|
||||
"new_example = {\"input\": \"big\", \"output\": \"small\"}\n",
|
||||
"dynamic_prompt.example_selector.add_example(new_example)\n",
|
||||
"print(dynamic_prompt.format(adjective=\"enthusiastic\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "2d007b0a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Similarity ExampleSelector\n",
|
||||
"\n",
|
||||
"The SemanticSimilarityExampleSelector selects examples based on which examples are most similar to the inputs. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"id": "241bfe80",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts.example_selector import SemanticSimilarityExampleSelector\n",
|
||||
"from langchain.vectorstores import Chroma\n",
|
||||
"from langchain.embeddings import OpenAIEmbeddings"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 11,
|
||||
"id": "50d0a701",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Running Chroma using direct local API.\n",
|
||||
"Using DuckDB in-memory for database. Data will be transient.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"example_selector = SemanticSimilarityExampleSelector.from_examples(\n",
|
||||
" # This is the list of examples available to select from.\n",
|
||||
" examples, \n",
|
||||
" # This is the embedding class used to produce embeddings which are used to measure semantic similarity.\n",
|
||||
" OpenAIEmbeddings(), \n",
|
||||
" # This is the VectorStore class that is used to store the embeddings and do a similarity search over.\n",
|
||||
" Chroma, \n",
|
||||
" # This is the number of examples to produce.\n",
|
||||
" k=1\n",
|
||||
")\n",
|
||||
"similar_prompt = FewShotPromptTemplate(\n",
|
||||
" # We provide an ExampleSelector instead of examples.\n",
|
||||
" example_selector=example_selector,\n",
|
||||
" example_prompt=example_prompt,\n",
|
||||
" prefix=\"Give the antonym of every input\",\n",
|
||||
" suffix=\"Input: {adjective}\\nOutput:\", \n",
|
||||
" input_variables=[\"adjective\"],\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 12,
|
||||
"id": "4c8fdf45",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: worried\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Input is a feeling, so should select the happy/sad example\n",
|
||||
"print(similar_prompt.format(adjective=\"worried\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 13,
|
||||
"id": "829af21a",
|
||||
"metadata": {
|
||||
"scrolled": true
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: fat\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Input is a measurement, so should select the tall/short example\n",
|
||||
"print(similar_prompt.format(adjective=\"fat\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 14,
|
||||
"id": "3c16fe23",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: joyful\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# You can add new examples to the SemanticSimilarityExampleSelector as well\n",
|
||||
"similar_prompt.example_selector.add_example({\"input\": \"enthusiastic\", \"output\": \"apathetic\"})\n",
|
||||
"print(similar_prompt.format(adjective=\"joyful\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "bc35afd0",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Maximal Marginal Relevance ExampleSelector\n",
|
||||
"\n",
|
||||
"The MaxMarginalRelevanceExampleSelector selects examples based on a combination of which examples are most similar to the inputs, while also optimizing for diversity. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs, and then iteratively adding them while penalizing them for closeness to already selected examples.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 18,
|
||||
"id": "ac95c968",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts.example_selector import MaxMarginalRelevanceExampleSelector\n",
|
||||
"from langchain.vectorstores import FAISS"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 19,
|
||||
"id": "db579bea",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"example_selector = MaxMarginalRelevanceExampleSelector.from_examples(\n",
|
||||
" # This is the list of examples available to select from.\n",
|
||||
" examples, \n",
|
||||
" # This is the embedding class used to produce embeddings which are used to measure semantic similarity.\n",
|
||||
" OpenAIEmbeddings(), \n",
|
||||
" # This is the VectorStore class that is used to store the embeddings and do a similarity search over.\n",
|
||||
" FAISS, \n",
|
||||
" # This is the number of examples to produce.\n",
|
||||
" k=2\n",
|
||||
")\n",
|
||||
"mmr_prompt = FewShotPromptTemplate(\n",
|
||||
" # We provide an ExampleSelector instead of examples.\n",
|
||||
" example_selector=example_selector,\n",
|
||||
" example_prompt=example_prompt,\n",
|
||||
" prefix=\"Give the antonym of every input\",\n",
|
||||
" suffix=\"Input: {adjective}\\nOutput:\", \n",
|
||||
" input_variables=[\"adjective\"],\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 20,
|
||||
"id": "cd76e344",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: windy\n",
|
||||
"Output: calm\n",
|
||||
"\n",
|
||||
"Input: worried\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Input is a feeling, so should select the happy/sad example as the first one\n",
|
||||
"print(mmr_prompt.format(adjective=\"worried\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 21,
|
||||
"id": "cf82956b",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: enthusiastic\n",
|
||||
"Output: apathetic\n",
|
||||
"\n",
|
||||
"Input: worried\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Let's compare this to what we would just get if we went solely off of similarity\n",
|
||||
"similar_prompt.example_selector.k = 2\n",
|
||||
"print(similar_prompt.format(adjective=\"worried\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "4aaeed2f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## NGram Overlap ExampleSelector\n",
|
||||
"\n",
|
||||
"The NGramOverlapExampleSelector selects and orders examples based on which examples are most similar to the input, according to an ngram overlap score. The ngram overlap score is a float between 0.0 and 1.0, inclusive. \n",
|
||||
"\n",
|
||||
"The selector allows for a threshold score to be set. Examples with an ngram overlap score less than or equal to the threshold are excluded. The threshold is set to -1.0, by default, so will not exclude any examples, only reorder them. Setting the threshold to 0.0 will exclude examples that have no ngram overlaps with the input.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "9cbc0acc",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts import PromptTemplate\n",
|
||||
"from langchain.prompts.example_selector.ngram_overlap import NGramOverlapExampleSelector"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "4f318f4b",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# These are examples of a fictional translation task.\n",
|
||||
"examples = [\n",
|
||||
" {\"input\": \"See Spot run.\", \"output\": \"Ver correr a Spot.\"},\n",
|
||||
" {\"input\": \"My dog barks.\", \"output\": \"Mi perro ladra.\"},\n",
|
||||
" {\"input\": \"Spot can run.\", \"output\": \"Spot puede correr.\"},\n",
|
||||
"]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "bf75e0fe",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"example_prompt = PromptTemplate(\n",
|
||||
" input_variables=[\"input\", \"output\"],\n",
|
||||
" template=\"Input: {input}\\nOutput: {output}\",\n",
|
||||
")\n",
|
||||
"example_selector = NGramOverlapExampleSelector(\n",
|
||||
" # These are the examples it has available to choose from.\n",
|
||||
" examples=examples, \n",
|
||||
" # This is the PromptTemplate being used to format the examples.\n",
|
||||
" example_prompt=example_prompt, \n",
|
||||
" # This is the threshold, at which selector stops.\n",
|
||||
" # It is set to -1.0 by default.\n",
|
||||
" threshold=-1.0,\n",
|
||||
" # For negative threshold:\n",
|
||||
" # Selector sorts examples by ngram overlap score, and excludes none.\n",
|
||||
" # For threshold greater than 1.0:\n",
|
||||
" # Selector excludes all examples, and returns an empty list.\n",
|
||||
" # For threshold equal to 0.0:\n",
|
||||
" # Selector sorts examples by ngram overlap score,\n",
|
||||
" # and excludes those with no ngram overlap with input.\n",
|
||||
")\n",
|
||||
"dynamic_prompt = FewShotPromptTemplate(\n",
|
||||
" # We provide an ExampleSelector instead of examples.\n",
|
||||
" example_selector=example_selector,\n",
|
||||
" example_prompt=example_prompt,\n",
|
||||
" prefix=\"Give the Spanish translation of every input\",\n",
|
||||
" suffix=\"Input: {sentence}\\nOutput:\", \n",
|
||||
" input_variables=[\"sentence\"],\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "83fb218a",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the Spanish translation of every input\n",
|
||||
"\n",
|
||||
"Input: Spot can run.\n",
|
||||
"Output: Spot puede correr.\n",
|
||||
"\n",
|
||||
"Input: See Spot run.\n",
|
||||
"Output: Ver correr a Spot.\n",
|
||||
"\n",
|
||||
"Input: My dog barks.\n",
|
||||
"Output: Mi perro ladra.\n",
|
||||
"\n",
|
||||
"Input: Spot can run fast.\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# An example input with large ngram overlap with \"Spot can run.\"\n",
|
||||
"# and no overlap with \"My dog barks.\"\n",
|
||||
"print(dynamic_prompt.format(sentence=\"Spot can run fast.\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "485f5307",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the Spanish translation of every input\n",
|
||||
"\n",
|
||||
"Input: Spot can run.\n",
|
||||
"Output: Spot puede correr.\n",
|
||||
"\n",
|
||||
"Input: See Spot run.\n",
|
||||
"Output: Ver correr a Spot.\n",
|
||||
"\n",
|
||||
"Input: Spot plays fetch.\n",
|
||||
"Output: Spot juega a buscar.\n",
|
||||
"\n",
|
||||
"Input: My dog barks.\n",
|
||||
"Output: Mi perro ladra.\n",
|
||||
"\n",
|
||||
"Input: Spot can run fast.\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# You can add examples to NGramOverlapExampleSelector as well.\n",
|
||||
"new_example = {\"input\": \"Spot plays fetch.\", \"output\": \"Spot juega a buscar.\"}\n",
|
||||
"\n",
|
||||
"example_selector.add_example(new_example)\n",
|
||||
"print(dynamic_prompt.format(sentence=\"Spot can run fast.\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "606ce697",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the Spanish translation of every input\n",
|
||||
"\n",
|
||||
"Input: Spot can run.\n",
|
||||
"Output: Spot puede correr.\n",
|
||||
"\n",
|
||||
"Input: See Spot run.\n",
|
||||
"Output: Ver correr a Spot.\n",
|
||||
"\n",
|
||||
"Input: Spot plays fetch.\n",
|
||||
"Output: Spot juega a buscar.\n",
|
||||
"\n",
|
||||
"Input: Spot can run fast.\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# You can set a threshold at which examples are excluded.\n",
|
||||
"# For example, setting threshold equal to 0.0\n",
|
||||
"# excludes examples with no ngram overlaps with input.\n",
|
||||
"# Since \"My dog barks.\" has no ngram overlaps with \"Spot can run fast.\"\n",
|
||||
"# it is excluded.\n",
|
||||
"example_selector.threshold=0.0\n",
|
||||
"print(dynamic_prompt.format(sentence=\"Spot can run fast.\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 87,
|
||||
"id": "7f8d72f7",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the Spanish translation of every input\n",
|
||||
"\n",
|
||||
"Input: Spot can run.\n",
|
||||
"Output: Spot puede correr.\n",
|
||||
"\n",
|
||||
"Input: Spot plays fetch.\n",
|
||||
"Output: Spot juega a buscar.\n",
|
||||
"\n",
|
||||
"Input: Spot can play fetch.\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Setting small nonzero threshold\n",
|
||||
"example_selector.threshold=0.09\n",
|
||||
"print(dynamic_prompt.format(sentence=\"Spot can play fetch.\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 88,
|
||||
"id": "09633aa8",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the Spanish translation of every input\n",
|
||||
"\n",
|
||||
"Input: Spot can play fetch.\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Setting threshold greater than 1.0\n",
|
||||
"example_selector.threshold=1.0+1e-9\n",
|
||||
"print(dynamic_prompt.format(sentence=\"Spot can play fetch.\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "39f30097",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
@@ -1,764 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "084ee2f0",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Output Parsers\n",
|
||||
"\n",
|
||||
"Language models output text. But many times you may want to get more structured information than just text back. This is where output parsers come in.\n",
|
||||
"\n",
|
||||
"Output parsers are classes that help structure language model responses. There are two main methods an output parser must implement:\n",
|
||||
"\n",
|
||||
"- `get_format_instructions() -> str`: A method which returns a string containing instructions for how the output of a language model should be formatted.\n",
|
||||
"- `parse(str) -> Any`: A method which takes in a string (assumed to be the response from a language model) and parses it into some structure.\n",
|
||||
"\n",
|
||||
"And then one optional one:\n",
|
||||
"\n",
|
||||
"- `parse_with_prompt(str) -> Any`: A method which takes in a string (assumed to be the response from a language model) and a prompt (assumed to the prompt that generated such a response) and parses it into some structure. The prompt is largely provided in the event the OutputParser wants to retry or fix the output in some way, and needs information from the prompt to do so.\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"Below we go over some examples of output parsers."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "5f0c8a33",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate\n",
|
||||
"from langchain.llms import OpenAI\n",
|
||||
"from langchain.chat_models import ChatOpenAI"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "a1ae632a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## PydanticOutputParser\n",
|
||||
"This output parser allows users to specify an arbitrary JSON schema and query LLMs for JSON outputs that conform to that schema.\n",
|
||||
"\n",
|
||||
"Keep in mind that large language models are leaky abstractions! You'll have to use an LLM with sufficient capacity to generate well-formed JSON. In the OpenAI family, DaVinci can do reliably but Curie's ability already drops off dramatically. \n",
|
||||
"\n",
|
||||
"Use Pydantic to declare your data model. Pydantic's BaseModel like a Python dataclass, but with actual type checking + coercion."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "cba6d8e3",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.output_parsers import PydanticOutputParser\n",
|
||||
"from pydantic import BaseModel, Field, validator\n",
|
||||
"from typing import List"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "0a203100",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"model_name = 'text-davinci-003'\n",
|
||||
"temperature = 0.0\n",
|
||||
"model = OpenAI(model_name=model_name, temperature=temperature)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "b3f16168",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Joke(setup='Why did the chicken cross the road?', punchline='To get to the other side!')"
|
||||
]
|
||||
},
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Define your desired data structure.\n",
|
||||
"class Joke(BaseModel):\n",
|
||||
" setup: str = Field(description=\"question to set up a joke\")\n",
|
||||
" punchline: str = Field(description=\"answer to resolve the joke\")\n",
|
||||
" \n",
|
||||
" # You can add custom validation logic easily with Pydantic.\n",
|
||||
" @validator('setup')\n",
|
||||
" def question_ends_with_question_mark(cls, field):\n",
|
||||
" if field[-1] != '?':\n",
|
||||
" raise ValueError(\"Badly formed question!\")\n",
|
||||
" return field\n",
|
||||
"\n",
|
||||
"# And a query intented to prompt a language model to populate the data structure.\n",
|
||||
"joke_query = \"Tell me a joke.\"\n",
|
||||
"\n",
|
||||
"# Set up a parser + inject instructions into the prompt template.\n",
|
||||
"parser = PydanticOutputParser(pydantic_object=Joke)\n",
|
||||
"\n",
|
||||
"prompt = PromptTemplate(\n",
|
||||
" template=\"Answer the user query.\\n{format_instructions}\\n{query}\\n\",\n",
|
||||
" input_variables=[\"query\"],\n",
|
||||
" partial_variables={\"format_instructions\": parser.get_format_instructions()}\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"_input = prompt.format_prompt(query=joke_query)\n",
|
||||
"\n",
|
||||
"output = model(_input.to_string())\n",
|
||||
"\n",
|
||||
"parser.parse(output)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "03049f88",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Actor(name='Tom Hanks', film_names=['Forrest Gump', 'Saving Private Ryan', 'The Green Mile', 'Cast Away', 'Toy Story'])"
|
||||
]
|
||||
},
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Here's another example, but with a compound typed field.\n",
|
||||
"class Actor(BaseModel):\n",
|
||||
" name: str = Field(description=\"name of an actor\")\n",
|
||||
" film_names: List[str] = Field(description=\"list of names of films they starred in\")\n",
|
||||
" \n",
|
||||
"actor_query = \"Generate the filmography for a random actor.\"\n",
|
||||
"\n",
|
||||
"parser = PydanticOutputParser(pydantic_object=Actor)\n",
|
||||
"\n",
|
||||
"prompt = PromptTemplate(\n",
|
||||
" template=\"Answer the user query.\\n{format_instructions}\\n{query}\\n\",\n",
|
||||
" input_variables=[\"query\"],\n",
|
||||
" partial_variables={\"format_instructions\": parser.get_format_instructions()}\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"_input = prompt.format_prompt(query=actor_query)\n",
|
||||
"\n",
|
||||
"output = model(_input.to_string())\n",
|
||||
"\n",
|
||||
"parser.parse(output)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "4d6c0c86",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Fixing Output Parsing Mistakes\n",
|
||||
"\n",
|
||||
"The above guardrail simply tries to parse the LLM response. If it does not parse correctly, then it errors.\n",
|
||||
"\n",
|
||||
"But we can do other things besides throw errors. Specifically, we can pass the misformatted output, along with the formatted instructions, to the model and ask it to fix it.\n",
|
||||
"\n",
|
||||
"For this example, we'll use the above OutputParser. Here's what happens if we pass it a result that does not comply with the schema:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "73beb20d",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"misformatted = \"{'name': 'Tom Hanks', 'film_names': ['Forrest Gump']}\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "f0e5ba80",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"ename": "OutputParserException",
|
||||
"evalue": "Failed to parse Actor from completion {'name': 'Tom Hanks', 'film_names': ['Forrest Gump']}. Got: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)",
|
||||
"output_type": "error",
|
||||
"traceback": [
|
||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||||
"\u001b[0;31mJSONDecodeError\u001b[0m Traceback (most recent call last)",
|
||||
"File \u001b[0;32m~/workplace/langchain/langchain/output_parsers/pydantic.py:23\u001b[0m, in \u001b[0;36mPydanticOutputParser.parse\u001b[0;34m(self, text)\u001b[0m\n\u001b[1;32m 22\u001b[0m json_str \u001b[38;5;241m=\u001b[39m match\u001b[38;5;241m.\u001b[39mgroup()\n\u001b[0;32m---> 23\u001b[0m json_object \u001b[38;5;241m=\u001b[39m \u001b[43mjson\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mloads\u001b[49m\u001b[43m(\u001b[49m\u001b[43mjson_str\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 24\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mpydantic_object\u001b[38;5;241m.\u001b[39mparse_obj(json_object)\n",
|
||||
"File \u001b[0;32m~/.pyenv/versions/3.9.1/lib/python3.9/json/__init__.py:346\u001b[0m, in \u001b[0;36mloads\u001b[0;34m(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)\u001b[0m\n\u001b[1;32m 343\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m (\u001b[38;5;28mcls\u001b[39m \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m object_hook \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m\n\u001b[1;32m 344\u001b[0m parse_int \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m parse_float \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m\n\u001b[1;32m 345\u001b[0m parse_constant \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m object_pairs_hook \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m kw):\n\u001b[0;32m--> 346\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43m_default_decoder\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdecode\u001b[49m\u001b[43m(\u001b[49m\u001b[43ms\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 347\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mcls\u001b[39m \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n",
|
||||
"File \u001b[0;32m~/.pyenv/versions/3.9.1/lib/python3.9/json/decoder.py:337\u001b[0m, in \u001b[0;36mJSONDecoder.decode\u001b[0;34m(self, s, _w)\u001b[0m\n\u001b[1;32m 333\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"Return the Python representation of ``s`` (a ``str`` instance\u001b[39;00m\n\u001b[1;32m 334\u001b[0m \u001b[38;5;124;03mcontaining a JSON document).\u001b[39;00m\n\u001b[1;32m 335\u001b[0m \n\u001b[1;32m 336\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[0;32m--> 337\u001b[0m obj, end \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mraw_decode\u001b[49m\u001b[43m(\u001b[49m\u001b[43ms\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43midx\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m_w\u001b[49m\u001b[43m(\u001b[49m\u001b[43ms\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mend\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 338\u001b[0m end \u001b[38;5;241m=\u001b[39m _w(s, end)\u001b[38;5;241m.\u001b[39mend()\n",
|
||||
"File \u001b[0;32m~/.pyenv/versions/3.9.1/lib/python3.9/json/decoder.py:353\u001b[0m, in \u001b[0;36mJSONDecoder.raw_decode\u001b[0;34m(self, s, idx)\u001b[0m\n\u001b[1;32m 352\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m--> 353\u001b[0m obj, end \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mscan_once\u001b[49m\u001b[43m(\u001b[49m\u001b[43ms\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43midx\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 354\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mStopIteration\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m err:\n",
|
||||
"\u001b[0;31mJSONDecodeError\u001b[0m: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)",
|
||||
"\nDuring handling of the above exception, another exception occurred:\n",
|
||||
"\u001b[0;31mOutputParserException\u001b[0m Traceback (most recent call last)",
|
||||
"Cell \u001b[0;32mIn[7], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mparser\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mparse\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmisformatted\u001b[49m\u001b[43m)\u001b[49m\n",
|
||||
"File \u001b[0;32m~/workplace/langchain/langchain/output_parsers/pydantic.py:29\u001b[0m, in \u001b[0;36mPydanticOutputParser.parse\u001b[0;34m(self, text)\u001b[0m\n\u001b[1;32m 27\u001b[0m name \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mpydantic_object\u001b[38;5;241m.\u001b[39m\u001b[38;5;18m__name__\u001b[39m\n\u001b[1;32m 28\u001b[0m msg \u001b[38;5;241m=\u001b[39m \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mFailed to parse \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mname\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m from completion \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mtext\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m. Got: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00me\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m\n\u001b[0;32m---> 29\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m OutputParserException(msg)\n",
|
||||
"\u001b[0;31mOutputParserException\u001b[0m: Failed to parse Actor from completion {'name': 'Tom Hanks', 'film_names': ['Forrest Gump']}. Got: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"parser.parse(misformatted)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "6c7c82b6",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Now we can construct and use a `OutputFixingParser`. This output parser takes as an argument another output parser but also an LLM with which to try to correct any formatting mistakes."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"id": "39b1a5ce",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.output_parsers import OutputFixingParser\n",
|
||||
"\n",
|
||||
"new_parser = OutputFixingParser.from_llm(parser=parser, llm=ChatOpenAI())"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"id": "0fd96d68",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Actor(name='Tom Hanks', film_names=['Forrest Gump'])"
|
||||
]
|
||||
},
|
||||
"execution_count": 10,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"new_parser.parse(misformatted)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "ea34eeaa",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Fixing Output Parsing Mistakes with the original prompt\n",
|
||||
"\n",
|
||||
"While in some cases it is possible to fix any parsing mistakes by only looking at the output, in other cases it can't. An example of this is when the output is not just in the incorrect format, but is partially complete. Consider the below example."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 11,
|
||||
"id": "67c5e1ac",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"template = \"\"\"Based on the user question, provide an Action and Action Input for what step should be taken.\n",
|
||||
"{format_instructions}\n",
|
||||
"Question: {query}\n",
|
||||
"Response:\"\"\"\n",
|
||||
"class Action(BaseModel):\n",
|
||||
" action: str = Field(description=\"action to take\")\n",
|
||||
" action_input: str = Field(description=\"input to the action\")\n",
|
||||
" \n",
|
||||
"parser = PydanticOutputParser(pydantic_object=Action)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 12,
|
||||
"id": "007aa87f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"prompt = PromptTemplate(\n",
|
||||
" template=\"Answer the user query.\\n{format_instructions}\\n{query}\\n\",\n",
|
||||
" input_variables=[\"query\"],\n",
|
||||
" partial_variables={\"format_instructions\": parser.get_format_instructions()}\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 13,
|
||||
"id": "10d207ff",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"prompt_value = prompt.format_prompt(query=\"who is leo di caprios gf?\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 14,
|
||||
"id": "68622837",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"bad_response = '{\"action\": \"search\"}'"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "25631465",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"If we try to parse this response as is, we will get an error"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 15,
|
||||
"id": "894967c1",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"ename": "OutputParserException",
|
||||
"evalue": "Failed to parse Action from completion {\"action\": \"search\"}. Got: 1 validation error for Action\naction_input\n field required (type=value_error.missing)",
|
||||
"output_type": "error",
|
||||
"traceback": [
|
||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||||
"\u001b[0;31mValidationError\u001b[0m Traceback (most recent call last)",
|
||||
"File \u001b[0;32m~/workplace/langchain/langchain/output_parsers/pydantic.py:24\u001b[0m, in \u001b[0;36mPydanticOutputParser.parse\u001b[0;34m(self, text)\u001b[0m\n\u001b[1;32m 23\u001b[0m json_object \u001b[38;5;241m=\u001b[39m json\u001b[38;5;241m.\u001b[39mloads(json_str)\n\u001b[0;32m---> 24\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mpydantic_object\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mparse_obj\u001b[49m\u001b[43m(\u001b[49m\u001b[43mjson_object\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 26\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m (json\u001b[38;5;241m.\u001b[39mJSONDecodeError, ValidationError) \u001b[38;5;28;01mas\u001b[39;00m e:\n",
|
||||
"File \u001b[0;32m~/.pyenv/versions/3.9.1/envs/langchain/lib/python3.9/site-packages/pydantic/main.py:527\u001b[0m, in \u001b[0;36mpydantic.main.BaseModel.parse_obj\u001b[0;34m()\u001b[0m\n",
|
||||
"File \u001b[0;32m~/.pyenv/versions/3.9.1/envs/langchain/lib/python3.9/site-packages/pydantic/main.py:342\u001b[0m, in \u001b[0;36mpydantic.main.BaseModel.__init__\u001b[0;34m()\u001b[0m\n",
|
||||
"\u001b[0;31mValidationError\u001b[0m: 1 validation error for Action\naction_input\n field required (type=value_error.missing)",
|
||||
"\nDuring handling of the above exception, another exception occurred:\n",
|
||||
"\u001b[0;31mOutputParserException\u001b[0m Traceback (most recent call last)",
|
||||
"Cell \u001b[0;32mIn[15], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mparser\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mparse\u001b[49m\u001b[43m(\u001b[49m\u001b[43mbad_response\u001b[49m\u001b[43m)\u001b[49m\n",
|
||||
"File \u001b[0;32m~/workplace/langchain/langchain/output_parsers/pydantic.py:29\u001b[0m, in \u001b[0;36mPydanticOutputParser.parse\u001b[0;34m(self, text)\u001b[0m\n\u001b[1;32m 27\u001b[0m name \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mpydantic_object\u001b[38;5;241m.\u001b[39m\u001b[38;5;18m__name__\u001b[39m\n\u001b[1;32m 28\u001b[0m msg \u001b[38;5;241m=\u001b[39m \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mFailed to parse \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mname\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m from completion \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mtext\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m. Got: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00me\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m\n\u001b[0;32m---> 29\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m OutputParserException(msg)\n",
|
||||
"\u001b[0;31mOutputParserException\u001b[0m: Failed to parse Action from completion {\"action\": \"search\"}. Got: 1 validation error for Action\naction_input\n field required (type=value_error.missing)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"parser.parse(bad_response)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "f6b64696",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"If we try to use the `OutputFixingParser` to fix this error, it will be confused - namely, it doesn't know what to actually put for action input."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 16,
|
||||
"id": "78b2b40d",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"fix_parser = OutputFixingParser.from_llm(parser=parser, llm=ChatOpenAI())"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 17,
|
||||
"id": "4fe1301d",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Action(action='search', action_input='keyword')"
|
||||
]
|
||||
},
|
||||
"execution_count": 17,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"fix_parser.parse(bad_response)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "9bd9ea7d",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Instead, we can use the RetryOutputParser, which passes in the prompt (as well as the original output) to try again to get a better response."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 19,
|
||||
"id": "7e8a8a28",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.output_parsers import RetryWithErrorOutputParser"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 20,
|
||||
"id": "5c86e141",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"retry_parser = RetryWithErrorOutputParser.from_llm(parser=parser, llm=ChatOpenAI())"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 21,
|
||||
"id": "9c04f731",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Action(action='search', action_input='leo di caprios girlfriend')"
|
||||
]
|
||||
},
|
||||
"execution_count": 21,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"retry_parser.parse_with_prompt(bad_response, prompt_value)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "61f67890",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<br>\n",
|
||||
"<br>\n",
|
||||
"<br>\n",
|
||||
"<br>\n",
|
||||
"<br>\n",
|
||||
"<br>\n",
|
||||
"<br>\n",
|
||||
"<br>\n",
|
||||
"\n",
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "64bf525a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Older, less powerful parsers"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "91871002",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Structured Output Parser\n",
|
||||
"\n",
|
||||
"While the Pydantic/JSON parser is more powerful, we initially experimented data structures having text fields only."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 16,
|
||||
"id": "b492997a",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.output_parsers import StructuredOutputParser, ResponseSchema"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "09473dce",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Here we define the response schema we want to receive."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 17,
|
||||
"id": "432ac44a",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"response_schemas = [\n",
|
||||
" ResponseSchema(name=\"answer\", description=\"answer to the user's question\"),\n",
|
||||
" ResponseSchema(name=\"source\", description=\"source used to answer the user's question, should be a website.\")\n",
|
||||
"]\n",
|
||||
"output_parser = StructuredOutputParser.from_response_schemas(response_schemas)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "7b92ce96",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"We now get a string that contains instructions for how the response should be formatted, and we then insert that into our prompt."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 18,
|
||||
"id": "593cfc25",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"format_instructions = output_parser.get_format_instructions()\n",
|
||||
"prompt = PromptTemplate(\n",
|
||||
" template=\"answer the users question as best as possible.\\n{format_instructions}\\n{question}\",\n",
|
||||
" input_variables=[\"question\"],\n",
|
||||
" partial_variables={\"format_instructions\": format_instructions}\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "0943e783",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"We can now use this to format a prompt to send to the language model, and then parse the returned result."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 19,
|
||||
"id": "106f1ba6",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"model = OpenAI(temperature=0)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 20,
|
||||
"id": "86d9d24f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"_input = prompt.format_prompt(question=\"what's the capital of france\")\n",
|
||||
"output = model(_input.to_string())"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 21,
|
||||
"id": "956bdc99",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"{'answer': 'Paris', 'source': 'https://en.wikipedia.org/wiki/Paris'}"
|
||||
]
|
||||
},
|
||||
"execution_count": 21,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"output_parser.parse(output)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "da639285",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"And here's an example of using this in a chat model"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 22,
|
||||
"id": "8f483d7d",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"chat_model = ChatOpenAI(temperature=0)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 23,
|
||||
"id": "f761cbf1",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"prompt = ChatPromptTemplate(\n",
|
||||
" messages=[\n",
|
||||
" HumanMessagePromptTemplate.from_template(\"answer the users question as best as possible.\\n{format_instructions}\\n{question}\") \n",
|
||||
" ],\n",
|
||||
" input_variables=[\"question\"],\n",
|
||||
" partial_variables={\"format_instructions\": format_instructions}\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 24,
|
||||
"id": "edd73ae3",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"_input = prompt.format_prompt(question=\"what's the capital of france\")\n",
|
||||
"output = chat_model(_input.to_messages())"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 25,
|
||||
"id": "a3c8b91e",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"{'answer': 'Paris', 'source': 'https://en.wikipedia.org/wiki/Paris'}"
|
||||
]
|
||||
},
|
||||
"execution_count": 25,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"output_parser.parse(output.content)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "9936fa27",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## CommaSeparatedListOutputParser\n",
|
||||
"\n",
|
||||
"Here's another parser strictly less powerful than Pydantic/JSON parsing."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 26,
|
||||
"id": "872246d7",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.output_parsers import CommaSeparatedListOutputParser"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 27,
|
||||
"id": "c3f9aee6",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"output_parser = CommaSeparatedListOutputParser()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 28,
|
||||
"id": "e77871b7",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"format_instructions = output_parser.get_format_instructions()\n",
|
||||
"prompt = PromptTemplate(\n",
|
||||
" template=\"List five {subject}.\\n{format_instructions}\",\n",
|
||||
" input_variables=[\"subject\"],\n",
|
||||
" partial_variables={\"format_instructions\": format_instructions}\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 29,
|
||||
"id": "a71cb5d3",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"model = OpenAI(temperature=0)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 30,
|
||||
"id": "783d7d98",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"_input = prompt.format(subject=\"ice cream flavors\")\n",
|
||||
"output = model(_input)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 31,
|
||||
"id": "fcb81344",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"['Vanilla',\n",
|
||||
" 'Chocolate',\n",
|
||||
" 'Strawberry',\n",
|
||||
" 'Mint Chocolate Chip',\n",
|
||||
" 'Cookies and Cream']"
|
||||
]
|
||||
},
|
||||
"execution_count": 31,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"output_parser.parse(output)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
@@ -1,886 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "43fb16cb",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Getting Started\n",
|
||||
"\n",
|
||||
"Managing your prompts is annoying and tedious, with everyone writing their own slightly different variants of the same ideas. But it shouldn't be this way. \n",
|
||||
"\n",
|
||||
"LangChain provides a standard and flexible way for specifying and managing all your prompts, as well as clear and specific terminology around them. This notebook goes through the core components of working with prompts, showing how to use them as well as explaining what they do.\n",
|
||||
"\n",
|
||||
"This notebook covers how to work with prompts in Python. If you are interested in how to work with serialized versions of prompts and load them from disk, see [this notebook](prompt_serialization.ipynb)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "890aad4d",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### The BasePromptTemplate Interface\n",
|
||||
"\n",
|
||||
"A prompt template is a mechanism for constructing a prompt to pass to the language model given some user input. Below is the interface that all different types of prompt templates should expose.\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"class BasePromptTemplate(ABC):\n",
|
||||
"\n",
|
||||
" input_variables: List[str]\n",
|
||||
" \"\"\"A list of the names of the variables the prompt template expects.\"\"\"\n",
|
||||
"\n",
|
||||
" @abstractmethod\n",
|
||||
" def format(self, **kwargs: Any) -> str:\n",
|
||||
" \"\"\"Format the prompt with the inputs.\n",
|
||||
"\n",
|
||||
" Args:\n",
|
||||
" kwargs: Any arguments to be passed to the prompt template.\n",
|
||||
"\n",
|
||||
" Returns:\n",
|
||||
" A formatted string.\n",
|
||||
"\n",
|
||||
" Example:\n",
|
||||
"\n",
|
||||
" .. code-block:: python\n",
|
||||
"\n",
|
||||
" prompt.format(variable1=\"foo\")\n",
|
||||
" \"\"\"\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"The only two things that define a prompt are:\n",
|
||||
"\n",
|
||||
"1. `input_variables`: The user inputted variables that are needed to format the prompt.\n",
|
||||
"2. `format`: A method which takes in keyword arguments and returns a formatted prompt. The keys are expected to be the input variables\n",
|
||||
" \n",
|
||||
"The rest of the logic of how the prompt is constructed is left up to different implementations. Let's take a look at some below."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "cddb465e",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### PromptTemplate\n",
|
||||
"\n",
|
||||
"This is the most simple type of prompt template, consisting of a string template that takes any number of input variables. The template should be formatted as a Python f-string, although we will support other formats (Jinja, Mako, etc) in the future. \n",
|
||||
"\n",
|
||||
"If you just want to use a hardcoded prompt template, you should use this implementation.\n",
|
||||
"\n",
|
||||
"Let's walk through a few examples."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "094229f4",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts import PromptTemplate"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "ab46bd2a",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'Tell me a joke.'"
|
||||
]
|
||||
},
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# An example prompt with no input variables\n",
|
||||
"no_input_prompt = PromptTemplate(input_variables=[], template=\"Tell me a joke.\")\n",
|
||||
"no_input_prompt.format()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "c3ad0fa8",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'Tell me a funny joke.'"
|
||||
]
|
||||
},
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# An example prompt with one input variable\n",
|
||||
"one_input_prompt = PromptTemplate(input_variables=[\"adjective\"], template=\"Tell me a {adjective} joke.\")\n",
|
||||
"one_input_prompt.format(adjective=\"funny\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "ba577dcf",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'Tell me a funny joke about chickens.'"
|
||||
]
|
||||
},
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# An example prompt with multiple input variables\n",
|
||||
"multiple_input_prompt = PromptTemplate(\n",
|
||||
" input_variables=[\"adjective\", \"content\"], \n",
|
||||
" template=\"Tell me a {adjective} joke about {content}.\"\n",
|
||||
")\n",
|
||||
"multiple_input_prompt.format(adjective=\"funny\", content=\"chickens\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "cc991ad2",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## From Template\n",
|
||||
"You can also easily load a prompt template by just specifying the template, and not worrying about the input variables."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "d0a0756c",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"template = \"Tell me a {adjective} joke about {content}.\"\n",
|
||||
"multiple_input_prompt = PromptTemplate.from_template(template)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"id": "59046640",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"PromptTemplate(input_variables=['adjective', 'content'], output_parser=None, template='Tell me a {adjective} joke about {content}.', template_format='f-string', validate_template=True)"
|
||||
]
|
||||
},
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"multiple_input_prompt"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b2dd6154",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Alternative formats\n",
|
||||
"\n",
|
||||
"This section shows how to use alternative formats besides \"f-string\" to format prompts."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"id": "53b41b6a",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Jinja2\n",
|
||||
"template = \"\"\"\n",
|
||||
"{% for item in items %}\n",
|
||||
"Question: {{ item.question }}\n",
|
||||
"Answer: {{ item.answer }}\n",
|
||||
"{% endfor %}\n",
|
||||
"\"\"\"\n",
|
||||
"items=[{\"question\": \"foo\", \"answer\": \"bar\"},{\"question\": \"1\", \"answer\": \"2\"}]\n",
|
||||
"jinja2_prompt = PromptTemplate(\n",
|
||||
" input_variables=[\"items\"], \n",
|
||||
" template=template,\n",
|
||||
" template_format=\"jinja2\"\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"id": "ba8aabd3",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'\\n\\nQuestion: foo\\nAnswer: bar\\n\\nQuestion: 1\\nAnswer: 2\\n'"
|
||||
]
|
||||
},
|
||||
"execution_count": 10,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"jinja2_prompt.format(items=items)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "1492b49d",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Few Shot Prompts\n",
|
||||
"\n",
|
||||
"A FewShotPromptTemplate is a prompt template that includes some examples. If you have collected some examples of how the task should be done, you can insert them into prompt using this class.\n",
|
||||
"\n",
|
||||
"Examples are datapoints that can be included in the prompt in order to give the model more context what to do. Examples are represented as a dictionary of key-value pairs, with the key being the input (or label) name, and the value being the input (or label) value. \n",
|
||||
"\n",
|
||||
"In addition to the example, we also need to specify how the example should be formatted when it's inserted in the prompt. We can do this using the above `PromptTemplate`!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 11,
|
||||
"id": "3eb36972",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# These are some examples of a pretend task of creating antonyms.\n",
|
||||
"examples = [\n",
|
||||
" {\"input\": \"happy\", \"output\": \"sad\"},\n",
|
||||
" {\"input\": \"tall\", \"output\": \"short\"},\n",
|
||||
"]\n",
|
||||
"# This how we specify how the example should be formatted.\n",
|
||||
"example_prompt = PromptTemplate(\n",
|
||||
" input_variables=[\"input\",\"output\"],\n",
|
||||
" template=\"Input: {input}\\nOutput: {output}\",\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 12,
|
||||
"id": "80a91d96",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts import FewShotPromptTemplate"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 13,
|
||||
"id": "7931e5f2",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: tall\n",
|
||||
"Output: short\n",
|
||||
"\n",
|
||||
"Input: big\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"prompt_from_string_examples = FewShotPromptTemplate(\n",
|
||||
" # These are the examples we want to insert into the prompt.\n",
|
||||
" examples=examples,\n",
|
||||
" # This is how we want to format the examples when we insert them into the prompt.\n",
|
||||
" example_prompt=example_prompt,\n",
|
||||
" # The prefix is some text that goes before the examples in the prompt.\n",
|
||||
" # Usually, this consists of intructions.\n",
|
||||
" prefix=\"Give the antonym of every input\",\n",
|
||||
" # The suffix is some text that goes after the examples in the prompt.\n",
|
||||
" # Usually, this is where the user input will go\n",
|
||||
" suffix=\"Input: {adjective}\\nOutput:\", \n",
|
||||
" # The input variables are the variables that the overall prompt expects.\n",
|
||||
" input_variables=[\"adjective\"],\n",
|
||||
" # The example_separator is the string we will use to join the prefix, examples, and suffix together with.\n",
|
||||
" example_separator=\"\\n\\n\"\n",
|
||||
" \n",
|
||||
")\n",
|
||||
"print(prompt_from_string_examples.format(adjective=\"big\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "874b7575",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Few Shot Prompts with Templates\n",
|
||||
"We can also construct few shot prompt templates where the prefix and suffix themselves are prompt templates"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 14,
|
||||
"id": "e710115f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts import FewShotPromptWithTemplates"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 15,
|
||||
"id": "5bf23a65",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"prefix = PromptTemplate(input_variables=[\"content\"], template=\"This is a test about {content}.\")\n",
|
||||
"suffix = PromptTemplate(input_variables=[\"new_content\"], template=\"Now you try to talk about {new_content}.\")\n",
|
||||
"\n",
|
||||
"prompt = FewShotPromptWithTemplates(\n",
|
||||
" suffix=suffix,\n",
|
||||
" prefix=prefix,\n",
|
||||
" input_variables=[\"content\", \"new_content\"],\n",
|
||||
" examples=examples,\n",
|
||||
" example_prompt=example_prompt,\n",
|
||||
" example_separator=\"\\n\",\n",
|
||||
")\n",
|
||||
"output = prompt.format(content=\"animals\", new_content=\"party\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 16,
|
||||
"id": "d4036351",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"This is a test about animals.\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"Input: tall\n",
|
||||
"Output: short\n",
|
||||
"Now you try to talk about party.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"print(output)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "bf038596",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### ExampleSelector\n",
|
||||
"If you have a large number of examples, you may need to select which ones to include in the prompt. The ExampleSelector is the class responsible for doing so. The base interface is defined as below.\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"class BaseExampleSelector(ABC):\n",
|
||||
" \"\"\"Interface for selecting examples to include in prompts.\"\"\"\n",
|
||||
"\n",
|
||||
" @abstractmethod\n",
|
||||
" def select_examples(self, input_variables: Dict[str, str]) -> List[dict]:\n",
|
||||
" \"\"\"Select which examples to use based on the inputs.\"\"\"\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"The only method it needs to expose is a `select_examples` method. This takes in the input variables and then returns a list of examples. It is up to each specific implementation as to how those examples are selected. Let's take a look at some below."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "861a4d1f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### LengthBased ExampleSelector\n",
|
||||
"\n",
|
||||
"This ExampleSelector selects which examples to use based on length. This is useful when you are worried about constructing a prompt that will go over the length of the context window. For longer inputs, it will select fewer examples to include, while for shorter inputs it will select more.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 17,
|
||||
"id": "7c469c95",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts.example_selector import LengthBasedExampleSelector"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 18,
|
||||
"id": "0ec6d950",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# These are a lot of examples of a pretend task of creating antonyms.\n",
|
||||
"examples = [\n",
|
||||
" {\"input\": \"happy\", \"output\": \"sad\"},\n",
|
||||
" {\"input\": \"tall\", \"output\": \"short\"},\n",
|
||||
" {\"input\": \"energetic\", \"output\": \"lethargic\"},\n",
|
||||
" {\"input\": \"sunny\", \"output\": \"gloomy\"},\n",
|
||||
" {\"input\": \"windy\", \"output\": \"calm\"},\n",
|
||||
"]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 19,
|
||||
"id": "207e55f7",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"example_selector = LengthBasedExampleSelector(\n",
|
||||
" # These are the examples is has available to choose from.\n",
|
||||
" examples=examples, \n",
|
||||
" # This is the PromptTemplate being used to format the examples.\n",
|
||||
" example_prompt=example_prompt, \n",
|
||||
" # This is the maximum length that the formatted examples should be.\n",
|
||||
" # Length is measured by the get_text_length function below.\n",
|
||||
" max_length=25,\n",
|
||||
" # This is the function used to get the length of a string, which is used\n",
|
||||
" # to determine which examples to include. It is commented out because\n",
|
||||
" # it is provided as a default value if none is specified.\n",
|
||||
" # get_text_length: Callable[[str], int] = lambda x: len(re.split(\"\\n| \", x))\n",
|
||||
")\n",
|
||||
"dynamic_prompt = FewShotPromptTemplate(\n",
|
||||
" # We provide an ExampleSelector instead of examples.\n",
|
||||
" example_selector=example_selector,\n",
|
||||
" example_prompt=example_prompt,\n",
|
||||
" prefix=\"Give the antonym of every input\",\n",
|
||||
" suffix=\"Input: {adjective}\\nOutput:\", \n",
|
||||
" input_variables=[\"adjective\"],\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 20,
|
||||
"id": "d00b4385",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: tall\n",
|
||||
"Output: short\n",
|
||||
"\n",
|
||||
"Input: energetic\n",
|
||||
"Output: lethargic\n",
|
||||
"\n",
|
||||
"Input: sunny\n",
|
||||
"Output: gloomy\n",
|
||||
"\n",
|
||||
"Input: windy\n",
|
||||
"Output: calm\n",
|
||||
"\n",
|
||||
"Input: big\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# An example with small input, so it selects all examples.\n",
|
||||
"print(dynamic_prompt.format(adjective=\"big\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 21,
|
||||
"id": "878bcde9",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: big and huge and massive and large and gigantic and tall and much much much much much bigger than everything else\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# An example with long input, so it selects only one example.\n",
|
||||
"long_string = \"big and huge and massive and large and gigantic and tall and much much much much much bigger than everything else\"\n",
|
||||
"print(dynamic_prompt.format(adjective=long_string))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 22,
|
||||
"id": "e4bebcd9",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: tall\n",
|
||||
"Output: short\n",
|
||||
"\n",
|
||||
"Input: energetic\n",
|
||||
"Output: lethargic\n",
|
||||
"\n",
|
||||
"Input: sunny\n",
|
||||
"Output: gloomy\n",
|
||||
"\n",
|
||||
"Input: windy\n",
|
||||
"Output: calm\n",
|
||||
"\n",
|
||||
"Input: big\n",
|
||||
"Output: small\n",
|
||||
"\n",
|
||||
"Input: enthusiastic\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# You can add an example to an example selector as well.\n",
|
||||
"new_example = {\"input\": \"big\", \"output\": \"small\"}\n",
|
||||
"dynamic_prompt.example_selector.add_example(new_example)\n",
|
||||
"print(dynamic_prompt.format(adjective=\"enthusiastic\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "2d007b0a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Similarity ExampleSelector\n",
|
||||
"\n",
|
||||
"The SemanticSimilarityExampleSelector selects examples based on which examples are most similar to the inputs. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 23,
|
||||
"id": "241bfe80",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts.example_selector import SemanticSimilarityExampleSelector\n",
|
||||
"from langchain.vectorstores import Chroma\n",
|
||||
"from langchain.embeddings import OpenAIEmbeddings"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 24,
|
||||
"id": "50d0a701",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Running Chroma using direct local API.\n",
|
||||
"Using DuckDB in-memory for database. Data will be transient.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"example_selector = SemanticSimilarityExampleSelector.from_examples(\n",
|
||||
" # This is the list of examples available to select from.\n",
|
||||
" examples, \n",
|
||||
" # This is the embedding class used to produce embeddings which are used to measure semantic similarity.\n",
|
||||
" OpenAIEmbeddings(), \n",
|
||||
" # This is the VectorStore class that is used to store the embeddings and do a similarity search over.\n",
|
||||
" Chroma, \n",
|
||||
" # This is the number of examples to produce.\n",
|
||||
" k=1\n",
|
||||
")\n",
|
||||
"similar_prompt = FewShotPromptTemplate(\n",
|
||||
" # We provide an ExampleSelector instead of examples.\n",
|
||||
" example_selector=example_selector,\n",
|
||||
" example_prompt=example_prompt,\n",
|
||||
" prefix=\"Give the antonym of every input\",\n",
|
||||
" suffix=\"Input: {adjective}\\nOutput:\", \n",
|
||||
" input_variables=[\"adjective\"],\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 25,
|
||||
"id": "4c8fdf45",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: worried\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Input is a feeling, so should select the happy/sad example\n",
|
||||
"print(similar_prompt.format(adjective=\"worried\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 19,
|
||||
"id": "829af21a",
|
||||
"metadata": {
|
||||
"scrolled": true
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: fat\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Input is a measurement, so should select the tall/short example\n",
|
||||
"print(similar_prompt.format(adjective=\"fat\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 20,
|
||||
"id": "3c16fe23",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: joyful\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# You can add new examples to the SemanticSimilarityExampleSelector as well\n",
|
||||
"similar_prompt.example_selector.add_example({\"input\": \"enthusiastic\", \"output\": \"apathetic\"})\n",
|
||||
"print(similar_prompt.format(adjective=\"joyful\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "bc35afd0",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Maximal Marginal Relevance ExampleSelector\n",
|
||||
"\n",
|
||||
"The MaxMarginalRelevanceExampleSelector selects examples based on a combination of which examples are most similar to the inputs, while also optimizing for diversity. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs, and then iteratively adding them while penalizing them for closeness to already selected examples.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 21,
|
||||
"id": "ac95c968",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts.example_selector import MaxMarginalRelevanceExampleSelector\n",
|
||||
"from langchain.vectorstores import FAISS"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 22,
|
||||
"id": "db579bea",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"example_selector = MaxMarginalRelevanceExampleSelector.from_examples(\n",
|
||||
" # This is the list of examples available to select from.\n",
|
||||
" examples, \n",
|
||||
" # This is the embedding class used to produce embeddings which are used to measure semantic similarity.\n",
|
||||
" OpenAIEmbeddings(), \n",
|
||||
" # This is the VectorStore class that is used to store the embeddings and do a similarity search over.\n",
|
||||
" FAISS, \n",
|
||||
" # This is the number of examples to produce.\n",
|
||||
" k=2\n",
|
||||
")\n",
|
||||
"mmr_prompt = FewShotPromptTemplate(\n",
|
||||
" # We provide an ExampleSelector instead of examples.\n",
|
||||
" example_selector=example_selector,\n",
|
||||
" example_prompt=example_prompt,\n",
|
||||
" prefix=\"Give the antonym of every input\",\n",
|
||||
" suffix=\"Input: {adjective}\\nOutput:\", \n",
|
||||
" input_variables=[\"adjective\"],\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 23,
|
||||
"id": "cd76e344",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: windy\n",
|
||||
"Output: calm\n",
|
||||
"\n",
|
||||
"Input: worried\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Input is a feeling, so should select the happy/sad example as the first one\n",
|
||||
"print(mmr_prompt.format(adjective=\"worried\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 24,
|
||||
"id": "cf82956b",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Give the antonym of every input\n",
|
||||
"\n",
|
||||
"Input: happy\n",
|
||||
"Output: sad\n",
|
||||
"\n",
|
||||
"Input: enthusiastic\n",
|
||||
"Output: apathetic\n",
|
||||
"\n",
|
||||
"Input: worried\n",
|
||||
"Output:\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Let's compare this to what we would just get if we went solely off of similarity\n",
|
||||
"similar_prompt.example_selector.k = 2\n",
|
||||
"print(similar_prompt.format(adjective=\"worried\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "dbc32551",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Serialization\n",
|
||||
"\n",
|
||||
"PromptTemplates and examples can be serialized and loaded from disk, making it easy to share and store prompts. For a detailed walkthrough on how to do that, see [this notebook](prompt_serialization.ipynb)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "1e1e13c6",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Customizability\n",
|
||||
"The above covers all the ways currently supported in LangChain to represent prompts and example selectors. However, due to the simple interface that the base classes (`BasePromptTemplate`, `BaseExampleSelector`) expose, it should be easy to subclass them and write your own implementation in your own codebase. And of course, if you'd like to contribute that back to LangChain, we'd love that :)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "c746d6f4",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
},
|
||||
"vscode": {
|
||||
"interpreter": {
|
||||
"hash": "b1677b440931f40d89ef8be7bf03acb108ce003de0ac9b18e8d43753ea2e7103"
|
||||
}
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
@@ -1,35 +0,0 @@
|
||||
How-To Guides
|
||||
=============
|
||||
|
||||
If you're new to the library, you may want to start with the `Quickstart <./getting_started.html>`_.
|
||||
|
||||
The user guide here shows more advanced workflows and how to use the library in different ways.
|
||||
|
||||
`Custom Prompt Template <./examples/custom_prompt_template.html>`_: How to create and use a custom PromptTemplate, the logic that decides how input variables get formatted into a prompt.
|
||||
|
||||
`Custom Example Selector <./examples/custom_example_selector.html>`_: How to create and use a custom ExampleSelector (the class responsible for choosing which examples to use in a prompt).
|
||||
|
||||
`Few Shot Prompt Templates <./examples/few_shot_examples.html>`_: How to include examples in the prompt.
|
||||
|
||||
`Example Selectors <./examples/example_selectors.html>`_: How to use different types of example selectors.
|
||||
|
||||
`Prompt Serialization <./examples/prompt_serialization.html>`_: A walkthrough of how to serialize prompts to and from disk.
|
||||
|
||||
`Few Shot Prompt Examples <./examples/few_shot_examples.html>`_: Examples of Few Shot Prompt Templates.
|
||||
|
||||
`Partial Prompt Template <./examples/partial.html>`_: How to partial Prompt Templates.
|
||||
|
||||
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
:glob:
|
||||
:hidden:
|
||||
|
||||
./examples/custom_prompt_template.md
|
||||
./examples/custom_example_selector.md
|
||||
./examples/few_shot_examples.ipynb
|
||||
./examples/prompt_serialization.ipynb
|
||||
./examples/few_shot_examples_data.ipynb
|
||||
./examples/example_selectors.ipynb
|
||||
./examples/output_parsers.ipynb
|
@@ -1,75 +0,0 @@
|
||||
# Key Concepts
|
||||
|
||||
## Prompts
|
||||
|
||||
A prompt is the input to a language model. It is a string of text that is used to generate a response from the language model.
|
||||
|
||||
## Prompt Templates
|
||||
|
||||
`PromptTemplates` are a way to create prompts in a reproducible way. They contain a template string, and a set of input variables. The template string can be formatted with the input variables to generate a prompt. The template string often contains instructions to the language model, a few shot examples, and a question to the language model.
|
||||
|
||||
`PromptTemplates` generically have a `format` method that takes in variables and returns a formatted string.
|
||||
The most simple implementation of this is to have a template string with some variables in it, and then format it with the incoming variables.
|
||||
More complex iterations dynamically construct the template string from few shot examples, etc.
|
||||
|
||||
To learn more about `PromptTemplates`, see [Prompt Templates](getting_started.md).
|
||||
|
||||
As an example, consider the following template string:
|
||||
|
||||
```python
|
||||
"""
|
||||
Predict the capital of a country.
|
||||
|
||||
Country: {country}
|
||||
Capital:
|
||||
"""
|
||||
```
|
||||
|
||||
### Input Variables
|
||||
|
||||
Input variables are the variables that are used to fill in the template string. In the example above, the input variable is `country`.
|
||||
|
||||
Given an input variable, the `PromptTemplate` can generate a prompt by filling in the template string with the input variable. For example, if the input variable is `United States`, the template string can be formatted to generate the following prompt:
|
||||
|
||||
```python
|
||||
"""
|
||||
Predict the capital of a country.
|
||||
|
||||
Country: United States
|
||||
Capital:
|
||||
"""
|
||||
```
|
||||
|
||||
## Few Shot Examples
|
||||
|
||||
Few shot examples refer to in-context examples that are provided to a language model as part of a prompt. The examples can be used to help the language model understand the context of the prompt, and as a result generate a better response. Few shot examples can contain both positive and negative examples about the expected response.
|
||||
|
||||
Below, we list out some few shot examples that may be relevant for the task of predicting the capital of a country.
|
||||
|
||||
```
|
||||
Country: United States
|
||||
Capital: Washington, D.C.
|
||||
|
||||
Country: Canada
|
||||
Capital: Ottawa
|
||||
```
|
||||
|
||||
To learn more about how to provide few shot examples, see [Few Shot Examples](examples/few_shot_examples.ipynb).
|
||||
|
||||
<!-- TODO(shreya): Add correct link here. -->
|
||||
|
||||
## Example selection
|
||||
|
||||
If there are multiple examples that are relevant to a prompt, it is important to select the most relevant examples. Generally, the quality of the response from the LLM can be significantly improved by selecting the most relevant examples. This is because the language model will be able to better understand the context of the prompt, and also potentially learn failure modes to avoid.
|
||||
|
||||
To help the user with selecting the most relevant examples, we provide example selectors that select the most relevant based on different criteria, such as length, semantic similarity, etc. The example selector takes in a list of examples and returns a list of selected examples, formatted as a string. The user can also provide their own example selector. To learn more about example selectors, see [Example Selectors](examples/example_selectors.ipynb).
|
||||
|
||||
<!-- TODO(shreya): Add correct link here. -->
|
||||
|
||||
## Serialization
|
||||
|
||||
To make it easy to share `PromptTemplates`, we provide a `serialize` method that returns a JSON string. The JSON string can be saved to a file, and then loaded back into a `PromptTemplate` using the `deserialize` method. This allows users to share `PromptTemplates` with others, and also to save them for later use.
|
||||
|
||||
To learn more about serialization, see [Serialization](examples/prompt_serialization.ipynb).
|
||||
|
||||
<!-- TODO(shreya): Provide correct link. -->
|
32
docs/modules/prompts/output_parsers.rst
Normal file
32
docs/modules/prompts/output_parsers.rst
Normal file
@@ -0,0 +1,32 @@
|
||||
Output Parsers
|
||||
==========================
|
||||
|
||||
.. note::
|
||||
`Conceptual Guide <https://docs.langchain.com/docs/components/prompts/output-parser>`_
|
||||
|
||||
|
||||
Language models output text. But many times you may want to get more structured information than just text back. This is where output parsers come in.
|
||||
|
||||
Output parsers are classes that help structure language model responses. There are two main methods an output parser must implement:
|
||||
|
||||
- ``get_format_instructions() -> str``: A method which returns a string containing instructions for how the output of a language model should be formatted.
|
||||
- ``parse(str) -> Any``: A method which takes in a string (assumed to be the response from a language model) and parses it into some structure.
|
||||
|
||||
And then one optional one:
|
||||
|
||||
- ``parse_with_prompt(str) -> Any``: A method which takes in a string (assumed to be the response from a language model) and a prompt (assumed to the prompt that generated such a response) and parses it into some structure. The prompt is largely provided in the event the OutputParser wants to retry or fix the output in some way, and needs information from the prompt to do so.
|
||||
|
||||
To start, we recommend familiarizing yourself with the Getting Started section
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
./output_parsers/getting_started.md
|
||||
|
||||
After that, we provide deep dives on all the different types of output parsers.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
:glob:
|
||||
|
||||
./output_parsers/examples/*
|
@@ -0,0 +1,127 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "9936fa27",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# CommaSeparatedListOutputParser\n",
|
||||
"\n",
|
||||
"Here's another parser strictly less powerful than Pydantic/JSON parsing."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "872246d7",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.output_parsers import CommaSeparatedListOutputParser\n",
|
||||
"from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate\n",
|
||||
"from langchain.llms import OpenAI\n",
|
||||
"from langchain.chat_models import ChatOpenAI"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "c3f9aee6",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"output_parser = CommaSeparatedListOutputParser()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "e77871b7",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"format_instructions = output_parser.get_format_instructions()\n",
|
||||
"prompt = PromptTemplate(\n",
|
||||
" template=\"List five {subject}.\\n{format_instructions}\",\n",
|
||||
" input_variables=[\"subject\"],\n",
|
||||
" partial_variables={\"format_instructions\": format_instructions}\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "a71cb5d3",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"model = OpenAI(temperature=0)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "783d7d98",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"_input = prompt.format(subject=\"ice cream flavors\")\n",
|
||||
"output = model(_input)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "fcb81344",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"['Vanilla',\n",
|
||||
" 'Chocolate',\n",
|
||||
" 'Strawberry',\n",
|
||||
" 'Mint Chocolate Chip',\n",
|
||||
" 'Cookies and Cream']"
|
||||
]
|
||||
},
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"output_parser.parse(output)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "ca5a23c5",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
@@ -0,0 +1,153 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "4d6c0c86",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# OutputFixingParser\n",
|
||||
"\n",
|
||||
"This output parser wraps another output parser and tries to fix any mmistakes\n",
|
||||
"\n",
|
||||
"The Pydantic guardrail simply tries to parse the LLM response. If it does not parse correctly, then it errors.\n",
|
||||
"\n",
|
||||
"But we can do other things besides throw errors. Specifically, we can pass the misformatted output, along with the formatted instructions, to the model and ask it to fix it.\n",
|
||||
"\n",
|
||||
"For this example, we'll use the above OutputParser. Here's what happens if we pass it a result that does not comply with the schema:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "50048777",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate\n",
|
||||
"from langchain.llms import OpenAI\n",
|
||||
"from langchain.chat_models import ChatOpenAI\n",
|
||||
"from langchain.output_parsers import PydanticOutputParser\n",
|
||||
"from pydantic import BaseModel, Field, validator\n",
|
||||
"from typing import List"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "4f1b563f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"class Actor(BaseModel):\n",
|
||||
" name: str = Field(description=\"name of an actor\")\n",
|
||||
" film_names: List[str] = Field(description=\"list of names of films they starred in\")\n",
|
||||
" \n",
|
||||
"actor_query = \"Generate the filmography for a random actor.\"\n",
|
||||
"\n",
|
||||
"parser = PydanticOutputParser(pydantic_object=Actor)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "73beb20d",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"misformatted = \"{'name': 'Tom Hanks', 'film_names': ['Forrest Gump']}\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "f0e5ba80",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"ename": "OutputParserException",
|
||||
"evalue": "Failed to parse Actor from completion {'name': 'Tom Hanks', 'film_names': ['Forrest Gump']}. Got: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)",
|
||||
"output_type": "error",
|
||||
"traceback": [
|
||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||||
"\u001b[0;31mJSONDecodeError\u001b[0m Traceback (most recent call last)",
|
||||
"File \u001b[0;32m~/workplace/langchain/langchain/output_parsers/pydantic.py:23\u001b[0m, in \u001b[0;36mPydanticOutputParser.parse\u001b[0;34m(self, text)\u001b[0m\n\u001b[1;32m 22\u001b[0m json_str \u001b[38;5;241m=\u001b[39m match\u001b[38;5;241m.\u001b[39mgroup()\n\u001b[0;32m---> 23\u001b[0m json_object \u001b[38;5;241m=\u001b[39m \u001b[43mjson\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mloads\u001b[49m\u001b[43m(\u001b[49m\u001b[43mjson_str\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 24\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mpydantic_object\u001b[38;5;241m.\u001b[39mparse_obj(json_object)\n",
|
||||
"File \u001b[0;32m~/.pyenv/versions/3.9.1/lib/python3.9/json/__init__.py:346\u001b[0m, in \u001b[0;36mloads\u001b[0;34m(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)\u001b[0m\n\u001b[1;32m 343\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m (\u001b[38;5;28mcls\u001b[39m \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m object_hook \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m\n\u001b[1;32m 344\u001b[0m parse_int \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m parse_float \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m\n\u001b[1;32m 345\u001b[0m parse_constant \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m object_pairs_hook \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m kw):\n\u001b[0;32m--> 346\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43m_default_decoder\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdecode\u001b[49m\u001b[43m(\u001b[49m\u001b[43ms\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 347\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mcls\u001b[39m \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n",
|
||||
"File \u001b[0;32m~/.pyenv/versions/3.9.1/lib/python3.9/json/decoder.py:337\u001b[0m, in \u001b[0;36mJSONDecoder.decode\u001b[0;34m(self, s, _w)\u001b[0m\n\u001b[1;32m 333\u001b[0m \u001b[38;5;250m\u001b[39m\u001b[38;5;124;03m\"\"\"Return the Python representation of ``s`` (a ``str`` instance\u001b[39;00m\n\u001b[1;32m 334\u001b[0m \u001b[38;5;124;03mcontaining a JSON document).\u001b[39;00m\n\u001b[1;32m 335\u001b[0m \n\u001b[1;32m 336\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[0;32m--> 337\u001b[0m obj, end \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mraw_decode\u001b[49m\u001b[43m(\u001b[49m\u001b[43ms\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43midx\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m_w\u001b[49m\u001b[43m(\u001b[49m\u001b[43ms\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mend\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 338\u001b[0m end \u001b[38;5;241m=\u001b[39m _w(s, end)\u001b[38;5;241m.\u001b[39mend()\n",
|
||||
"File \u001b[0;32m~/.pyenv/versions/3.9.1/lib/python3.9/json/decoder.py:353\u001b[0m, in \u001b[0;36mJSONDecoder.raw_decode\u001b[0;34m(self, s, idx)\u001b[0m\n\u001b[1;32m 352\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m--> 353\u001b[0m obj, end \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mscan_once\u001b[49m\u001b[43m(\u001b[49m\u001b[43ms\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43midx\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 354\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mStopIteration\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m err:\n",
|
||||
"\u001b[0;31mJSONDecodeError\u001b[0m: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)",
|
||||
"\nDuring handling of the above exception, another exception occurred:\n",
|
||||
"\u001b[0;31mOutputParserException\u001b[0m Traceback (most recent call last)",
|
||||
"Cell \u001b[0;32mIn[6], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mparser\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mparse\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmisformatted\u001b[49m\u001b[43m)\u001b[49m\n",
|
||||
"File \u001b[0;32m~/workplace/langchain/langchain/output_parsers/pydantic.py:29\u001b[0m, in \u001b[0;36mPydanticOutputParser.parse\u001b[0;34m(self, text)\u001b[0m\n\u001b[1;32m 27\u001b[0m name \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mpydantic_object\u001b[38;5;241m.\u001b[39m\u001b[38;5;18m__name__\u001b[39m\n\u001b[1;32m 28\u001b[0m msg \u001b[38;5;241m=\u001b[39m \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mFailed to parse \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mname\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m from completion \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mtext\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m. Got: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00me\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m\n\u001b[0;32m---> 29\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m OutputParserException(msg)\n",
|
||||
"\u001b[0;31mOutputParserException\u001b[0m: Failed to parse Actor from completion {'name': 'Tom Hanks', 'film_names': ['Forrest Gump']}. Got: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"parser.parse(misformatted)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "6c7c82b6",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Now we can construct and use a `OutputFixingParser`. This output parser takes as an argument another output parser but also an LLM with which to try to correct any formatting mistakes."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "39b1a5ce",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.output_parsers import OutputFixingParser\n",
|
||||
"\n",
|
||||
"new_parser = OutputFixingParser.from_llm(parser=parser, llm=ChatOpenAI())"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"id": "0fd96d68",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Actor(name='Tom Hanks', film_names=['Forrest Gump'])"
|
||||
]
|
||||
},
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"new_parser.parse(misformatted)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
163
docs/modules/prompts/output_parsers/examples/pydantic.ipynb
Normal file
163
docs/modules/prompts/output_parsers/examples/pydantic.ipynb
Normal file
@@ -0,0 +1,163 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "a1ae632a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# PydanticOutputParser\n",
|
||||
"This output parser allows users to specify an arbitrary JSON schema and query LLMs for JSON outputs that conform to that schema.\n",
|
||||
"\n",
|
||||
"Keep in mind that large language models are leaky abstractions! You'll have to use an LLM with sufficient capacity to generate well-formed JSON. In the OpenAI family, DaVinci can do reliably but Curie's ability already drops off dramatically. \n",
|
||||
"\n",
|
||||
"Use Pydantic to declare your data model. Pydantic's BaseModel like a Python dataclass, but with actual type checking + coercion."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "b322c447",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate\n",
|
||||
"from langchain.llms import OpenAI\n",
|
||||
"from langchain.chat_models import ChatOpenAI"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "cba6d8e3",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.output_parsers import PydanticOutputParser\n",
|
||||
"from pydantic import BaseModel, Field, validator\n",
|
||||
"from typing import List"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "0a203100",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"model_name = 'text-davinci-003'\n",
|
||||
"temperature = 0.0\n",
|
||||
"model = OpenAI(model_name=model_name, temperature=temperature)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "b3f16168",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Joke(setup='Why did the chicken cross the road?', punchline='To get to the other side!')"
|
||||
]
|
||||
},
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Define your desired data structure.\n",
|
||||
"class Joke(BaseModel):\n",
|
||||
" setup: str = Field(description=\"question to set up a joke\")\n",
|
||||
" punchline: str = Field(description=\"answer to resolve the joke\")\n",
|
||||
" \n",
|
||||
" # You can add custom validation logic easily with Pydantic.\n",
|
||||
" @validator('setup')\n",
|
||||
" def question_ends_with_question_mark(cls, field):\n",
|
||||
" if field[-1] != '?':\n",
|
||||
" raise ValueError(\"Badly formed question!\")\n",
|
||||
" return field\n",
|
||||
"\n",
|
||||
"# And a query intented to prompt a language model to populate the data structure.\n",
|
||||
"joke_query = \"Tell me a joke.\"\n",
|
||||
"\n",
|
||||
"# Set up a parser + inject instructions into the prompt template.\n",
|
||||
"parser = PydanticOutputParser(pydantic_object=Joke)\n",
|
||||
"\n",
|
||||
"prompt = PromptTemplate(\n",
|
||||
" template=\"Answer the user query.\\n{format_instructions}\\n{query}\\n\",\n",
|
||||
" input_variables=[\"query\"],\n",
|
||||
" partial_variables={\"format_instructions\": parser.get_format_instructions()}\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"_input = prompt.format_prompt(query=joke_query)\n",
|
||||
"\n",
|
||||
"output = model(_input.to_string())\n",
|
||||
"\n",
|
||||
"parser.parse(output)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "03049f88",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Actor(name='Tom Hanks', film_names=['Forrest Gump', 'Saving Private Ryan', 'The Green Mile', 'Cast Away', 'Toy Story'])"
|
||||
]
|
||||
},
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Here's another example, but with a compound typed field.\n",
|
||||
"class Actor(BaseModel):\n",
|
||||
" name: str = Field(description=\"name of an actor\")\n",
|
||||
" film_names: List[str] = Field(description=\"list of names of films they starred in\")\n",
|
||||
" \n",
|
||||
"actor_query = \"Generate the filmography for a random actor.\"\n",
|
||||
"\n",
|
||||
"parser = PydanticOutputParser(pydantic_object=Actor)\n",
|
||||
"\n",
|
||||
"prompt = PromptTemplate(\n",
|
||||
" template=\"Answer the user query.\\n{format_instructions}\\n{query}\\n\",\n",
|
||||
" input_variables=[\"query\"],\n",
|
||||
" partial_variables={\"format_instructions\": parser.get_format_instructions()}\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"_input = prompt.format_prompt(query=actor_query)\n",
|
||||
"\n",
|
||||
"output = model(_input.to_string())\n",
|
||||
"\n",
|
||||
"parser.parse(output)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
227
docs/modules/prompts/output_parsers/examples/retry.ipynb
Normal file
227
docs/modules/prompts/output_parsers/examples/retry.ipynb
Normal file
@@ -0,0 +1,227 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "4d6c0c86",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# RetryOutputParser\n",
|
||||
"\n",
|
||||
"While in some cases it is possible to fix any parsing mistakes by only looking at the output, in other cases it can't. An example of this is when the output is not just in the incorrect format, but is partially complete. Consider the below example."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "f28526bd",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate\n",
|
||||
"from langchain.llms import OpenAI\n",
|
||||
"from langchain.chat_models import ChatOpenAI\n",
|
||||
"from langchain.output_parsers import PydanticOutputParser, OutputFixingParser, RetryOutputParser\n",
|
||||
"from pydantic import BaseModel, Field, validator\n",
|
||||
"from typing import List"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "67c5e1ac",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"template = \"\"\"Based on the user question, provide an Action and Action Input for what step should be taken.\n",
|
||||
"{format_instructions}\n",
|
||||
"Question: {query}\n",
|
||||
"Response:\"\"\"\n",
|
||||
"class Action(BaseModel):\n",
|
||||
" action: str = Field(description=\"action to take\")\n",
|
||||
" action_input: str = Field(description=\"input to the action\")\n",
|
||||
" \n",
|
||||
"parser = PydanticOutputParser(pydantic_object=Action)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "007aa87f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"prompt = PromptTemplate(\n",
|
||||
" template=\"Answer the user query.\\n{format_instructions}\\n{query}\\n\",\n",
|
||||
" input_variables=[\"query\"],\n",
|
||||
" partial_variables={\"format_instructions\": parser.get_format_instructions()}\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "10d207ff",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"prompt_value = prompt.format_prompt(query=\"who is leo di caprios gf?\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "68622837",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"bad_response = '{\"action\": \"search\"}'"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "25631465",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"If we try to parse this response as is, we will get an error"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "894967c1",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"ename": "OutputParserException",
|
||||
"evalue": "Failed to parse Action from completion {\"action\": \"search\"}. Got: 1 validation error for Action\naction_input\n field required (type=value_error.missing)",
|
||||
"output_type": "error",
|
||||
"traceback": [
|
||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||||
"\u001b[0;31mValidationError\u001b[0m Traceback (most recent call last)",
|
||||
"File \u001b[0;32m~/workplace/langchain/langchain/output_parsers/pydantic.py:24\u001b[0m, in \u001b[0;36mPydanticOutputParser.parse\u001b[0;34m(self, text)\u001b[0m\n\u001b[1;32m 23\u001b[0m json_object \u001b[38;5;241m=\u001b[39m json\u001b[38;5;241m.\u001b[39mloads(json_str)\n\u001b[0;32m---> 24\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mpydantic_object\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mparse_obj\u001b[49m\u001b[43m(\u001b[49m\u001b[43mjson_object\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 26\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m (json\u001b[38;5;241m.\u001b[39mJSONDecodeError, ValidationError) \u001b[38;5;28;01mas\u001b[39;00m e:\n",
|
||||
"File \u001b[0;32m~/.pyenv/versions/3.9.1/envs/langchain/lib/python3.9/site-packages/pydantic/main.py:527\u001b[0m, in \u001b[0;36mpydantic.main.BaseModel.parse_obj\u001b[0;34m()\u001b[0m\n",
|
||||
"File \u001b[0;32m~/.pyenv/versions/3.9.1/envs/langchain/lib/python3.9/site-packages/pydantic/main.py:342\u001b[0m, in \u001b[0;36mpydantic.main.BaseModel.__init__\u001b[0;34m()\u001b[0m\n",
|
||||
"\u001b[0;31mValidationError\u001b[0m: 1 validation error for Action\naction_input\n field required (type=value_error.missing)",
|
||||
"\nDuring handling of the above exception, another exception occurred:\n",
|
||||
"\u001b[0;31mOutputParserException\u001b[0m Traceback (most recent call last)",
|
||||
"Cell \u001b[0;32mIn[6], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mparser\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mparse\u001b[49m\u001b[43m(\u001b[49m\u001b[43mbad_response\u001b[49m\u001b[43m)\u001b[49m\n",
|
||||
"File \u001b[0;32m~/workplace/langchain/langchain/output_parsers/pydantic.py:29\u001b[0m, in \u001b[0;36mPydanticOutputParser.parse\u001b[0;34m(self, text)\u001b[0m\n\u001b[1;32m 27\u001b[0m name \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mpydantic_object\u001b[38;5;241m.\u001b[39m\u001b[38;5;18m__name__\u001b[39m\n\u001b[1;32m 28\u001b[0m msg \u001b[38;5;241m=\u001b[39m \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mFailed to parse \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mname\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m from completion \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mtext\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m. Got: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00me\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m\n\u001b[0;32m---> 29\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m OutputParserException(msg)\n",
|
||||
"\u001b[0;31mOutputParserException\u001b[0m: Failed to parse Action from completion {\"action\": \"search\"}. Got: 1 validation error for Action\naction_input\n field required (type=value_error.missing)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"parser.parse(bad_response)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "f6b64696",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"If we try to use the `OutputFixingParser` to fix this error, it will be confused - namely, it doesn't know what to actually put for action input."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "78b2b40d",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"fix_parser = OutputFixingParser.from_llm(parser=parser, llm=ChatOpenAI())"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"id": "4fe1301d",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Action(action='search', action_input='')"
|
||||
]
|
||||
},
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"fix_parser.parse(bad_response)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "9bd9ea7d",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Instead, we can use the RetryOutputParser, which passes in the prompt (as well as the original output) to try again to get a better response."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"id": "7e8a8a28",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.output_parsers import RetryWithErrorOutputParser"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 15,
|
||||
"id": "5c86e141",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"retry_parser = RetryWithErrorOutputParser.from_llm(parser=parser, llm=OpenAI(temperature=0))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 16,
|
||||
"id": "9c04f731",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Action(action='search', action_input='who is leo di caprios gf?')"
|
||||
]
|
||||
},
|
||||
"execution_count": 16,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"retry_parser.parse_with_prompt(bad_response, prompt_value)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
209
docs/modules/prompts/output_parsers/examples/structured.ipynb
Normal file
209
docs/modules/prompts/output_parsers/examples/structured.ipynb
Normal file
@@ -0,0 +1,209 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "91871002",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Structured Output Parser\n",
|
||||
"\n",
|
||||
"While the Pydantic/JSON parser is more powerful, we initially experimented data structures having text fields only."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "b492997a",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.output_parsers import StructuredOutputParser, ResponseSchema\n",
|
||||
"from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate\n",
|
||||
"from langchain.llms import OpenAI\n",
|
||||
"from langchain.chat_models import ChatOpenAI"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "09473dce",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Here we define the response schema we want to receive."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "432ac44a",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"response_schemas = [\n",
|
||||
" ResponseSchema(name=\"answer\", description=\"answer to the user's question\"),\n",
|
||||
" ResponseSchema(name=\"source\", description=\"source used to answer the user's question, should be a website.\")\n",
|
||||
"]\n",
|
||||
"output_parser = StructuredOutputParser.from_response_schemas(response_schemas)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "7b92ce96",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"We now get a string that contains instructions for how the response should be formatted, and we then insert that into our prompt."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "593cfc25",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"format_instructions = output_parser.get_format_instructions()\n",
|
||||
"prompt = PromptTemplate(\n",
|
||||
" template=\"answer the users question as best as possible.\\n{format_instructions}\\n{question}\",\n",
|
||||
" input_variables=[\"question\"],\n",
|
||||
" partial_variables={\"format_instructions\": format_instructions}\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "0943e783",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"We can now use this to format a prompt to send to the language model, and then parse the returned result."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "106f1ba6",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"model = OpenAI(temperature=0)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "86d9d24f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"_input = prompt.format_prompt(question=\"what's the capital of france\")\n",
|
||||
"output = model(_input.to_string())"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"id": "956bdc99",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"{'answer': 'Paris', 'source': 'https://en.wikipedia.org/wiki/Paris'}"
|
||||
]
|
||||
},
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"output_parser.parse(output)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "da639285",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"And here's an example of using this in a chat model"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"id": "8f483d7d",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"chat_model = ChatOpenAI(temperature=0)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"id": "f761cbf1",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"prompt = ChatPromptTemplate(\n",
|
||||
" messages=[\n",
|
||||
" HumanMessagePromptTemplate.from_template(\"answer the users question as best as possible.\\n{format_instructions}\\n{question}\") \n",
|
||||
" ],\n",
|
||||
" input_variables=[\"question\"],\n",
|
||||
" partial_variables={\"format_instructions\": format_instructions}\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 11,
|
||||
"id": "edd73ae3",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"_input = prompt.format_prompt(question=\"what's the capital of france\")\n",
|
||||
"output = chat_model(_input.to_messages())"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 12,
|
||||
"id": "a3c8b91e",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"{'answer': 'Paris', 'source': 'https://en.wikipedia.org/wiki/Paris'}"
|
||||
]
|
||||
},
|
||||
"execution_count": 12,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"output_parser.parse(output.content)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
163
docs/modules/prompts/output_parsers/getting_started.ipynb
Normal file
163
docs/modules/prompts/output_parsers/getting_started.ipynb
Normal file
@@ -0,0 +1,163 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "084ee2f0",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Output Parsers\n",
|
||||
"\n",
|
||||
"Language models output text. But many times you may want to get more structured information than just text back. This is where output parsers come in.\n",
|
||||
"\n",
|
||||
"Output parsers are classes that help structure language model responses. There are two main methods an output parser must implement:\n",
|
||||
"\n",
|
||||
"- `get_format_instructions() -> str`: A method which returns a string containing instructions for how the output of a language model should be formatted.\n",
|
||||
"- `parse(str) -> Any`: A method which takes in a string (assumed to be the response from a language model) and parses it into some structure.\n",
|
||||
"\n",
|
||||
"And then one optional one:\n",
|
||||
"\n",
|
||||
"- `parse_with_prompt(str) -> Any`: A method which takes in a string (assumed to be the response from a language model) and a prompt (assumed to the prompt that generated such a response) and parses it into some structure. The prompt is largely provided in the event the OutputParser wants to retry or fix the output in some way, and needs information from the prompt to do so.\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"Below we go over the main type of output parser, the `PydanticOutputParser`. See the `examples` folder for other options."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "5f0c8a33",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate\n",
|
||||
"from langchain.llms import OpenAI\n",
|
||||
"from langchain.chat_models import ChatOpenAI\n",
|
||||
"\n",
|
||||
"from langchain.output_parsers import PydanticOutputParser\n",
|
||||
"from pydantic import BaseModel, Field, validator\n",
|
||||
"from typing import List"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "0a203100",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"model_name = 'text-davinci-003'\n",
|
||||
"temperature = 0.0\n",
|
||||
"model = OpenAI(model_name=model_name, temperature=temperature)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "fd3cbfc5",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Define your desired data structure.\n",
|
||||
"class Joke(BaseModel):\n",
|
||||
" setup: str = Field(description=\"question to set up a joke\")\n",
|
||||
" punchline: str = Field(description=\"answer to resolve the joke\")\n",
|
||||
" \n",
|
||||
" # You can add custom validation logic easily with Pydantic.\n",
|
||||
" @validator('setup')\n",
|
||||
" def question_ends_with_question_mark(cls, field):\n",
|
||||
" if field[-1] != '?':\n",
|
||||
" raise ValueError(\"Badly formed question!\")\n",
|
||||
" return field"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "e03e1576",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Set up a parser + inject instructions into the prompt template.\n",
|
||||
"parser = PydanticOutputParser(pydantic_object=Joke)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"id": "5ec3fa44",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"prompt = PromptTemplate(\n",
|
||||
" template=\"Answer the user query.\\n{format_instructions}\\n{query}\\n\",\n",
|
||||
" input_variables=[\"query\"],\n",
|
||||
" partial_variables={\"format_instructions\": parser.get_format_instructions()}\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"id": "00139255",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# And a query intented to prompt a language model to populate the data structure.\n",
|
||||
"joke_query = \"Tell me a joke.\"\n",
|
||||
"_input = prompt.format_prompt(query=joke_query)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"id": "f1aac756",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"output = model(_input.to_string())"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"id": "b3f16168",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Joke(setup='Why did the chicken cross the road?', punchline='To get to the other side!')"
|
||||
]
|
||||
},
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"parser.parse(output)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
8
docs/modules/prompts/output_parsers/how_to_guides.rst
Normal file
8
docs/modules/prompts/output_parsers/how_to_guides.rst
Normal file
@@ -0,0 +1,8 @@
|
||||
How-To Guides
|
||||
=============
|
||||
|
||||
If you're new to the library, you may want to start with the `Quickstart <./getting_started.html>`_.
|
||||
|
||||
The user guide here shows different types of output parsers.
|
||||
|
||||
|
30
docs/modules/prompts/prompt_templates.rst
Normal file
30
docs/modules/prompts/prompt_templates.rst
Normal file
@@ -0,0 +1,30 @@
|
||||
Prompt Templates
|
||||
==========================
|
||||
|
||||
.. note::
|
||||
`Conceptual Guide <https://docs.langchain.com/docs/components/prompts/prompt-template>`_
|
||||
|
||||
|
||||
Language models take text as input - that text is commonly referred to as a prompt.
|
||||
Typically this is not simply a hardcoded string but rather a combination of a template, some examples, and user input.
|
||||
LangChain provides several classes and functions to make constructing and working with prompts easy.
|
||||
|
||||
The following sections of documentation are provided:
|
||||
|
||||
- `Getting Started <./prompt_templates/getting_started.html>`_: An overview of all the functionality LangChain provides for working with and constructing prompts.
|
||||
|
||||
- `How-To Guides <./prompt_templates/how_to_guides.html>`_: A collection of how-to guides. These highlight how to accomplish various objectives with our prompt class.
|
||||
|
||||
- `Reference <../../reference/prompts.html>`_: API reference documentation for all prompt classes.
|
||||
|
||||
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
:caption: Prompt Templates
|
||||
:name: Prompt Templates
|
||||
:hidden:
|
||||
|
||||
./prompt_templates/getting_started.md
|
||||
./prompt_templates/how_to_guides.rst
|
||||
Reference<../../reference/prompts.rst>
|
@@ -5,7 +5,7 @@
|
||||
"id": "c75efab3",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Create a custom prompt template\n",
|
||||
"# How to create a custom prompt template\n",
|
||||
"\n",
|
||||
"Let's suppose we want the LLM to generate English language explanations of a function given its name. To achieve this task, we will create a custom prompt template that takes in the function name as input, and formats the prompt template to provide the source code of the function.\n",
|
||||
"\n",
|
||||
@@ -161,7 +161,7 @@
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.10.9"
|
||||
"version": "3.9.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
@@ -5,7 +5,7 @@
|
||||
"id": "f8b01b97",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Provide few shot examples to a prompt\n",
|
||||
"# How to create a prompt template that uses few shot examples\n",
|
||||
"\n",
|
||||
"In this tutorial, we'll learn how to create a prompt template that uses few shot examples.\n",
|
||||
"\n",
|
@@ -5,7 +5,7 @@
|
||||
"id": "9355a547",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Partial Prompt Templates\n",
|
||||
"# How to work with partial Prompt Templates\n",
|
||||
"\n",
|
||||
"A prompt template is a class with a `.format` method which takes in a key-value map and returns a string (a prompt) to pass to the language model. Like other methods, it can make sense to \"partial\" a prompt template - eg pass in a subset of the required values, as to create a new prompt template which expects only the remaining subset of values.\n",
|
||||
"\n",
|
@@ -5,7 +5,7 @@
|
||||
"id": "43fb16cb",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Prompt Serialization\n",
|
||||
"# How to serialize prompts\n",
|
||||
"\n",
|
||||
"It is often preferrable to store prompts not as python code but as files. This can make it easy to share, store, and version prompts. This notebook covers how to do that in LangChain, walking through all the different types of prompts and the different serialization options.\n",
|
||||
"\n",
|
||||
@@ -649,7 +649,7 @@
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.2"
|
||||
"version": "3.9.1"
|
||||
},
|
||||
"vscode": {
|
||||
"interpreter": {
|
@@ -236,6 +236,6 @@ print(dynamic_prompt.format(input=long_string))
|
||||
```
|
||||
|
||||
<!-- TODO(shreya): Add correct link here. -->
|
||||
LangChain comes with a few example selectors that you can use. For more details on how to use them, see [Example Selectors](./examples/example_selectors.ipynb).
|
||||
LangChain comes with a few example selectors that you can use. For more details on how to use them, see [Example Selectors](../example_selectors.rst).
|
||||
|
||||
You can create custom example selectors that select examples based on any criteria you want. For more details on how to do this, see [Creating a custom example selector](examples/custom_example_selector.ipynb).
|
||||
You can create custom example selectors that select examples based on any criteria you want. For more details on how to do this, see [Creating a custom example selector](prompt_templates/examples/custom_example_selector.ipynb).
|
13
docs/modules/prompts/prompt_templates/how_to_guides.rst
Normal file
13
docs/modules/prompts/prompt_templates/how_to_guides.rst
Normal file
@@ -0,0 +1,13 @@
|
||||
How-To Guides
|
||||
=============
|
||||
|
||||
If you're new to the library, you may want to start with the `Quickstart <./getting_started.html>`_.
|
||||
|
||||
The user guide here shows more advanced workflows and how to use the library in different ways.
|
||||
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
:glob:
|
||||
|
||||
./examples/*
|
Reference in New Issue
Block a user