{
"cells": [
{
"cell_type": "raw",
"id": "df7d42b9-58a6-434c-a2d7-0b61142f6d3e",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 3\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# How to handle cases where no queries are generated\n",
"\n",
"Sometimes, a query analysis technique may allow for any number of queries to be generated - including no queries! In this case, our overall chain will need to inspect the result of the query analysis before deciding whether to call the retriever or not.\n",
"\n",
"We will use mock data for this example."
]
},
{
"cell_type": "markdown",
"id": "a4079b57-4369-49c9-b2ad-c809b5408d7e",
"metadata": {},
"source": [
"## Setup\n",
"#### Install dependencies"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e168ef5c-e54e-49a6-8552-5502854a6f01",
"metadata": {},
"outputs": [],
"source": [
"# %pip install -qU langchain langchain-community langchain-openai langchain-chroma"
]
},
{
"cell_type": "markdown",
"id": "79d66a45-a05c-4d22-b011-b1cdbdfc8f9c",
"metadata": {},
"source": [
"#### Set environment variables\n",
"\n",
"We'll use OpenAI in this example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "40e2979e-a818-4b96-ac25-039336f94319",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# Optional, uncomment to trace runs with LangSmith. Sign up here: https://smith.langchain.com.\n",
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "c20b48b8-16d7-4089-bc17-f2d240b3935a",
"metadata": {},
"source": [
"### Create Index\n",
"\n",
"We will create a vectorstore over fake information."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "1f621694",
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_chroma import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"texts = [\"Harrison worked at Kensho\"]\n",
"embeddings = OpenAIEmbeddings(model=\"text-embedding-3-small\")\n",
"vectorstore = Chroma.from_texts(\n",
"    texts,\n",
"    embeddings,\n",
")\n",
"retriever = vectorstore.as_retriever()"
]
},
{
"cell_type": "markdown",
"id": "57396e23-c192-4d97-846b-5eacea4d6b8d",
"metadata": {},
"source": [
"## Query analysis\n",
"\n",
"We will use function calling to structure the output. However, we will configure the LLM such that it doesn't NEED to call the function representing a search query (should it decide not to). We will also then use a prompt to do query analysis that explicitly lays out when it should and shouldn't make a search."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "0b51dd76-820d-41a4-98c8-893f6fe0d1ea",
"metadata": {},
"outputs": [],
"source": [
"from typing import Optional\n",
"\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"class Search(BaseModel):\n",
"    \"\"\"Search over a database of job records.\"\"\"\n",
"\n",
"    query: str = Field(\n",
"        ...,\n",
"        description=\"Similarity search query applied to job record.\",\n",
"    )"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "783c03c3-8c72-4f88-9cf4-5829ce6745d6",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"system = \"\"\"You have the ability to issue search queries to get information to help answer user questions.\n",
"\n",
"You do not NEED to look things up. If you don't need to, then just respond normally.\"\"\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
"    [\n",
"        (\"system\", system),\n",
"        (\"human\", \"{question}\"),\n",
"    ]\n",
")\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"structured_llm = llm.bind_tools([Search])\n",
"query_analyzer = {\"question\": RunnablePassthrough()} | prompt | structured_llm"
]
},
{
"cell_type": "markdown",
"id": "b9564078",
"metadata": {},
"source": [
"We can see that by invoking this we get a message that sometimes - but not always - includes a tool call."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "bc1d3863",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_ZnoVX4j9Mn8wgChaORyd1cvq', 'function': {'arguments': '{\"query\":\"Harrison\"}', 'name': 'Search'}, 'type': 'function'}]})"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke(\"where did Harrison Work\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "af62af17-4f90-4dbd-a8b4-dfff51f1db95",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='Hello! How can I assist you today?')"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke(\"hi!\")"
]
},
{
"cell_type": "markdown",
"id": "c7c65b2f-7881-45fc-a47b-a4eaaf48245f",
"metadata": {},
"source": [
"## Retrieval with query analysis\n",
"\n",
"So how would we include this in a chain? Let's look at an example below."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "1e047d87",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.output_parsers.openai_tools import PydanticToolsParser\n",
"from langchain_core.runnables import chain\n",
"\n",
"output_parser = PydanticToolsParser(tools=[Search])"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "8dac7866",
"metadata": {},
"outputs": [],
"source": [
"@chain\n",
"def custom_chain(question):\n",
"    response = query_analyzer.invoke(question)\n",
"    if \"tool_calls\" in response.additional_kwargs:\n",
"        query = output_parser.invoke(response)\n",
"        docs = retriever.invoke(query[0].query)\n",
"        # Could add more logic - like another LLM call - here\n",
"        return docs\n",
"    else:\n",
"        return response"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "232ad8a7-7990-4066-9228-d35a555f7293",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Number of requested results 4 is greater than number of elements in index 1, updating n_results = 1\n"
]
},
{
"data": {
"text/plain": [
"[Document(page_content='Harrison worked at Kensho')]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"custom_chain.invoke(\"where did Harrison Work\")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "28e14ba5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='Hello! How can I assist you today?')"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"custom_chain.invoke(\"hi!\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "33338d4f",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}