mirror of
https://github.com/hwchase17/langchain.git
synced 2025-10-23 19:44:05 +00:00
613 lines
17 KiB
Plaintext
613 lines
17 KiB
Plaintext
{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "134a0785",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Chat Index\n",
|
||
"\n",
|
||
"This notebook goes over how to set up a chain to chat with an index. The only difference between this chain and the [RetrievalQAChain](./vector_db_qa.ipynb) is that this allows for passing in of a chat history which can be used to allow for follow up questions."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 1,
|
||
"id": "70c4e529",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
|
||
"from langchain.vectorstores import Chroma\n",
|
||
"from langchain.text_splitter import CharacterTextSplitter\n",
|
||
"from langchain.llms import OpenAI\n",
|
||
"from langchain.chains import ConversationalRetrievalChain"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "cdff94be",
|
||
"metadata": {},
|
||
"source": [
|
||
"Load in documents. You can replace this with a loader for whatever type of data you want"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 2,
|
||
"id": "01c46e92",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"from langchain.document_loaders import TextLoader\n",
|
||
"loader = TextLoader(\"../../state_of_the_union.txt\")\n",
|
||
"documents = loader.load()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "e9be4779",
|
||
"metadata": {},
|
||
"source": [
|
||
"If you had multiple loaders that you wanted to combine, you do something like:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 3,
|
||
"id": "433363a5",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"# loaders = [....]\n",
|
||
"# docs = []\n",
|
||
"# for loader in loaders:\n",
|
||
"# docs.extend(loader.load())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "239475d2",
|
||
"metadata": {},
|
||
"source": [
|
||
"We now split the documents, create embeddings for them, and put them in a vectorstore. This allows us to do semantic search over them."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"id": "a8930cf7",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Running Chroma using direct local API.\n",
|
||
"Using DuckDB in-memory for database. Data will be transient.\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
|
||
"documents = text_splitter.split_documents(documents)\n",
|
||
"\n",
|
||
"embeddings = OpenAIEmbeddings()\n",
|
||
"vectorstore = Chroma.from_documents(documents, embeddings)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "3c96b118",
|
||
"metadata": {},
|
||
"source": [
|
||
"We now initialize the ConversationalRetrievalChain"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 5,
|
||
"id": "7b4110f3",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "3872432d",
|
||
"metadata": {},
|
||
"source": [
|
||
"Here's an example of asking a question with no chat history"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 6,
|
||
"id": "7fe3e730",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"chat_history = []\n",
|
||
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
|
||
"result = qa({\"question\": query, \"chat_history\": chat_history})"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 7,
|
||
"id": "bfff9cc8",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"\" The president said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He also said that she is a consensus builder and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans.\""
|
||
]
|
||
},
|
||
"execution_count": 7,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"result[\"answer\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "9e46edf7",
|
||
"metadata": {},
|
||
"source": [
|
||
"Here's an example of asking a question with some chat history"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"id": "00b4cf00",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"chat_history = [(query, result[\"answer\"])]\n",
|
||
"query = \"Did he mention who she suceeded\"\n",
|
||
"result = qa({\"question\": query, \"chat_history\": chat_history})"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 9,
|
||
"id": "f01828d1",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"' Justice Stephen Breyer'"
|
||
]
|
||
},
|
||
"execution_count": 9,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"result['answer']"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "0eaadf0f",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Return Source Documents\n",
|
||
"You can also easily return source documents from the ConversationalRetrievalChain. This is useful for when you want to inspect what documents were returned."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 11,
|
||
"id": "562769c6",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), return_source_documents=True)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 12,
|
||
"id": "ea478300",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"chat_history = []\n",
|
||
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
|
||
"result = qa({\"question\": query, \"chat_history\": chat_history})"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 13,
|
||
"id": "4cb75b4e",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', lookup_str='', metadata={'source': '../../state_of_the_union.txt'}, lookup_index=0)"
|
||
]
|
||
},
|
||
"execution_count": 13,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"result['source_documents'][0]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "4f49beab",
|
||
"metadata": {},
|
||
"source": [
|
||
"## ConversationalRetrievalChain with `search_distance`\n",
|
||
"If you are using a vector store that supports filtering by search distance, you can add a threshold value parameter."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 14,
|
||
"id": "5ed8d612",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"vectordbkwargs = {\"search_distance\": 0.9}"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 15,
|
||
"id": "6a7b3459",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), return_source_documents=True)\n",
|
||
"chat_history = []\n",
|
||
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
|
||
"result = qa({\"question\": query, \"chat_history\": chat_history, \"vectordbkwargs\": vectordbkwargs})"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "99b96dae",
|
||
"metadata": {},
|
||
"source": [
|
||
"## ConversationalRetrievalChain with `map_reduce`\n",
|
||
"We can also use different types of combine document chains with the ConversationalRetrievalChain chain."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 16,
|
||
"id": "e53a9d66",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"from langchain.chains import LLMChain\n",
|
||
"from langchain.chains.question_answering import load_qa_chain\n",
|
||
"from langchain.chains.chat_index.prompts import CONDENSE_QUESTION_PROMPT"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 19,
|
||
"id": "bf205e35",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"llm = OpenAI(temperature=0)\n",
|
||
"question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)\n",
|
||
"doc_chain = load_qa_chain(llm, chain_type=\"map_reduce\")\n",
|
||
"\n",
|
||
"chain = ConversationalRetrievalChain(\n",
|
||
" retriever=vectorstore.as_retriever(),\n",
|
||
" question_generator=question_generator,\n",
|
||
" combine_docs_chain=doc_chain,\n",
|
||
")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 20,
|
||
"id": "78155887",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"chat_history = []\n",
|
||
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
|
||
"result = chain({\"question\": query, \"chat_history\": chat_history})"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 21,
|
||
"id": "e54b5fa2",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"\" The president said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, from a family of public school educators and police officers, a consensus builder, and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans.\""
|
||
]
|
||
},
|
||
"execution_count": 21,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"result['answer']"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "a2fe6b14",
|
||
"metadata": {},
|
||
"source": [
|
||
"## ConversationalRetrievalChain with Question Answering with sources\n",
|
||
"\n",
|
||
"You can also use this chain with the question answering with sources chain."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 22,
|
||
"id": "d1058fd2",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"from langchain.chains.qa_with_sources import load_qa_with_sources_chain"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 23,
|
||
"id": "a6594482",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"llm = OpenAI(temperature=0)\n",
|
||
"question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)\n",
|
||
"doc_chain = load_qa_with_sources_chain(llm, chain_type=\"map_reduce\")\n",
|
||
"\n",
|
||
"chain = ConversationalRetrievalChain(\n",
|
||
" retriever=vectorstore.as_retriever(),\n",
|
||
" question_generator=question_generator,\n",
|
||
" combine_docs_chain=doc_chain,\n",
|
||
")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 24,
|
||
"id": "e2badd21",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"chat_history = []\n",
|
||
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
|
||
"result = chain({\"question\": query, \"chat_history\": chat_history})"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 25,
|
||
"id": "edb31fe5",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"\" The president said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, from a family of public school educators and police officers, a consensus builder, and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \\nSOURCES: ../../state_of_the_union.txt\""
|
||
]
|
||
},
|
||
"execution_count": 25,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"result['answer']"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "2324cdc6-98bf-4708-b8cd-02a98b1e5b67",
|
||
"metadata": {},
|
||
"source": [
|
||
"## ConversationalRetrievalChain with streaming to `stdout`\n",
|
||
"\n",
|
||
"Output from the chain will be streamed to `stdout` token by token in this example."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 26,
|
||
"id": "2efacec3-2690-4b05-8de3-a32fd2ac3911",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"from langchain.chains.llm import LLMChain\n",
|
||
"from langchain.callbacks.base import CallbackManager\n",
|
||
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
|
||
"from langchain.chains.chat_index.prompts import CONDENSE_QUESTION_PROMPT, QA_PROMPT\n",
|
||
"from langchain.chains.question_answering import load_qa_chain\n",
|
||
"\n",
|
||
"# Construct a ChatVectorDBChain with a streaming llm for combine docs\n",
|
||
"# and a separate, non-streaming llm for question generation\n",
|
||
"llm = OpenAI(temperature=0)\n",
|
||
"streaming_llm = OpenAI(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)\n",
|
||
"\n",
|
||
"question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)\n",
|
||
"doc_chain = load_qa_chain(streaming_llm, chain_type=\"stuff\", prompt=QA_PROMPT)\n",
|
||
"\n",
|
||
"qa = ConversationalRetrievalChain(\n",
|
||
" retriever=vectorstore.as_retriever(), combine_docs_chain=doc_chain, question_generator=question_generator)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 27,
|
||
"id": "fd6d43f4-7428-44a4-81bc-26fe88a98762",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
" The president said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He also said that she is a consensus builder and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans."
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"chat_history = []\n",
|
||
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
|
||
"result = qa({\"question\": query, \"chat_history\": chat_history})"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 28,
|
||
"id": "5ab38978-f3e8-4fa7-808c-c79dec48379a",
|
||
"metadata": {
|
||
"tags": []
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
" Justice Stephen Breyer"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"chat_history = [(query, result[\"answer\"])]\n",
|
||
"query = \"Did he mention who she suceeded\"\n",
|
||
"result = qa({\"question\": query, \"chat_history\": chat_history})\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "f793d56b",
|
||
"metadata": {},
|
||
"source": [
|
||
"## get_chat_history Function\n",
|
||
"You can also specify a `get_chat_history` function, which can be used to format the chat_history string."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 29,
|
||
"id": "a7ba9d8c",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"def get_chat_history(inputs) -> str:\n",
|
||
" res = []\n",
|
||
" for human, ai in inputs:\n",
|
||
" res.append(f\"Human:{human}\\nAI:{ai}\")\n",
|
||
" return \"\\n\".join(res)\n",
|
||
"qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore, get_chat_history=get_chat_history)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 30,
|
||
"id": "a3e33c0d",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"chat_history = []\n",
|
||
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
|
||
"result = qa({\"question\": query, \"chat_history\": chat_history})"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 31,
|
||
"id": "936dc62f",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"\" The president said that Ketanji Brown Jackson is one of the nation's top legal minds, a former top litigator in private practice, a former federal public defender, and from a family of public school educators and police officers. He also said that she is a consensus builder and has received a broad range of support from the Fraternal Order of Police to former judges appointed by Democrats and Republicans.\""
|
||
]
|
||
},
|
||
"execution_count": 31,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"result['answer']"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "b8c26901",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": []
|
||
}
|
||
],
|
||
"metadata": {
|
||
"kernelspec": {
|
||
"display_name": "Python 3 (ipykernel)",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.9.1"
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 5
|
||
}
|