addition to docs at 'Store and reference chat history' (#8910)

- Description: Adds an example showing how to pass a custom template to
ConversationalRetrievalChain. Instead of CONDENSE_QUESTION_PROMPT, any
prompt can be passed via the condense_question_prompt argument. See Use
cases -> QA over Documents -> How to -> Store and reference chat history.
  - Issue: #8864,
  - Dependencies: NA,
  - Tag maintainer: @hinthornw,
  - Twitter handle:

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>

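The snippets on this page rely on a few setup imports that sit above this excerpt. A minimal sketch of that import block, using the same module paths as the rest of the page (`OpenAIEmbeddings`, `Chroma`, `CharacterTextSplitter`, `OpenAI`, and `ConversationalRetrievalChain` are all used below):
```python
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain.llms import OpenAI
from langchain.chains import ConversationalRetrievalChain
```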
Load in documents. You can replace this with a loader for whatever type of data you want
```python
from langchain.document_loaders import TextLoader
loader = TextLoader("../../state_of_the_union.txt")
documents = loader.load()
```
If you have multiple loaders that you want to combine, you can do something like this:
```python
# loaders = [....]
# docs = []
# for loader in loaders:
#     docs.extend(loader.load())
```
We now split the documents, create embeddings for them, and put them in a vectorstore. This allows us to do semantic search over them.
```python
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
documents = text_splitter.split_documents(documents)

embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(documents, embeddings)
```
We can now create a memory object, which is necessary to track the inputs/outputs and hold a conversation.
```python
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
```
We now initialize the `ConversationalRetrievalChain`
```python
qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), memory=memory)
```
```python
query = "What did the president say about Ketanji Brown Jackson"
result = qa({"question": query})
```
```python
result["answer"]
```
```python
query = "Did he mention who she succeeded"
result = qa({"question": query})
```
```python
result['answer']
```
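Because the chain was constructed with a `ConversationBufferMemory`, you can check what it has recorded so far. A minimal sketch, assuming the `memory` object created above (its `chat_memory.messages` list holds the accumulated human/AI messages):
```python
# Print the conversation the memory object has accumulated so far
for message in memory.chat_memory.messages:
    print(f"{message.type}: {message.content}")
```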
In the above example, we used a Memory object to track chat history. We can also just pass it in explicitly. In order to do this, we need to initialize a chain without any memory object.
```python
qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever())
```
Here's an example of asking a question with no chat history
```python
chat_history = []
query = "What did the president say about Ketanji Brown Jackson"
result = qa({"question": query, "chat_history": chat_history})
```
```python
result["answer"]
```
Here's an example of asking a question with some chat history
```python
chat_history = [(query, result["answer"])]
query = "Did he mention who she succeeded"
result = qa({"question": query, "chat_history": chat_history})
```
```python
result['answer']
```
This chain has two steps. First, it condenses the current question and the chat history into a standalone question. This is necessary to create a standalone query to use for retrieval. After that, it performs retrieval and answers the question using retrieval augmented generation with a separate model. Part of the power of the declarative nature of LangChain is that you can easily use a separate language model for each call. This lets you use a cheaper, faster model for the simpler task of condensing the question and a more expensive model for answering it. Here is an example of doing so.
```python
from langchain.chat_models import ChatOpenAI
```
```python
qa = ConversationalRetrievalChain.from_llm(
ChatOpenAI(temperature=0, model="gpt-4"),
@ -168,36 +152,90 @@ qa = ConversationalRetrievalChain.from_llm(
)
```
```python
chat_history = []
query = "What did the president say about Ketanji Brown Jackson"
result = qa({"question": query, "chat_history": chat_history})
```
```python
chat_history = [(query, result["answer"])]
query = "Did he mention who she succeeded"
result = qa({"question": query, "chat_history": chat_history})
```
## Using a custom prompt for condensing the question
By default, `ConversationalRetrievalChain` uses `CONDENSE_QUESTION_PROMPT` to condense the question and chat history into a standalone question. Here is how that prompt is defined:
```python
from langchain.prompts.prompt import PromptTemplate
_template = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.
Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:"""
CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template(_template)
```
Instead, you can pass any custom template via the `condense_question_prompt` argument, for example to add information to the question or to give the LLM extra instructions. Here is an example:
```python
from langchain.prompts.prompt import PromptTemplate
```
```python
custom_template = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question. At the end of standalone question add this 'Answer the question in German language.' If you do not know the answer reply with 'I am sorry'.
Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:"""
```
```python
CUSTOM_QUESTION_PROMPT = PromptTemplate.from_template(custom_template)
```
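Before wiring the template into a chain, you can render it directly to preview exactly what the condensing LLM will receive. This is only a sanity check; the chat history string below is an illustrative placeholder:
```python
# Render the custom prompt with sample inputs
print(
    CUSTOM_QUESTION_PROMPT.format(
        chat_history="Human: What did the president say about Ketanji Brown Jackson\nAssistant: He praised her.",
        question="Did he mention who she succeeded",
    )
)
```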
```python
model = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.3)
embeddings = OpenAIEmbeddings()
# Assumes an existing persisted Chroma index; replace the persist_directory path with your own,
# or reuse the `vectorstore` built earlier on this page.
vectordb = Chroma(embedding_function=embeddings, persist_directory="./chroma_db")
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
qa = ConversationalRetrievalChain.from_llm(
    model,
    vectordb.as_retriever(),
    condense_question_prompt=CUSTOM_QUESTION_PROMPT,
    memory=memory,
)
```
```python
query = "What did the president say about Ketanji Brown Jackson"
result = qa({"question": query})
```
```python
query = "Did he mention who she succeeded"
result = qa({"question": query})
```
## Return Source Documents
You can also easily return source documents from the `ConversationalRetrievalChain`. This is useful when you want to inspect which documents were returned.
```python
qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), return_source_documents=True)
```
```python
chat_history = []
query = "What did the president say about Ketanji Brown Jackson"
result = qa({"question": query, "chat_history": chat_history})
```
```python
result['source_documents'][0]
```
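For a quick overview of everything that was retrieved, rather than a single document, you can loop over the list. This sketch assumes each document carries a `source` entry in its metadata, which `TextLoader` sets to the file path:
```python
# Print where each retrieved chunk came from plus a short preview of its content
for doc in result["source_documents"]:
    print(doc.metadata.get("source"), "-", doc.page_content[:100])
```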
## ConversationalRetrievalChain with `search_distance`
If you are using a vector store that supports filtering by search distance, you can add a threshold value parameter.
```python
vectordbkwargs = {"search_distance": 0.9}
```
```python
qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), return_source_documents=True)
chat_history = []
query = "What did the president say about Ketanji Brown Jackson"
result = qa({"question": query, "chat_history": chat_history, "vectordbkwargs": vectordbkwargs})
```
## ConversationalRetrievalChain with `map_reduce`
We can also use different types of combine document chains with the ConversationalRetrievalChain chain.
```python
from langchain.chains import LLMChain
from langchain.chains.question_answering import load_qa_chain
from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT
```
```python
llm = OpenAI(temperature=0)
question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)
doc_chain = load_qa_chain(llm, chain_type="map_reduce")

chain = ConversationalRetrievalChain(
    retriever=vectorstore.as_retriever(),
    question_generator=question_generator,
    combine_docs_chain=doc_chain,
)
```
```python
chat_history = []
query = "What did the president say about Ketanji Brown Jackson"
result = chain({"question": query, "chat_history": chat_history})
```
```python
result['answer']
```
## ConversationalRetrievalChain with Question Answering with sources
You can also use this chain with the question answering with sources chain.
```python
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
```
```python
llm = OpenAI(temperature=0)
question_generator = LLMChain(llm=llm, prompt=CONDENSE_QUESTION_PROMPT)
doc_chain = load_qa_with_sources_chain(llm, chain_type="map_reduce")

chain = ConversationalRetrievalChain(
    retriever=vectorstore.as_retriever(),
    question_generator=question_generator,
    combine_docs_chain=doc_chain,
)
```
```python
chat_history = []
query = "What did the president say about Ketanji Brown Jackson"
result = chain({"question": query, "chat_history": chat_history})
```
```python
result['answer']
```
## ConversationalRetrievalChain with streaming to `stdout`
Output from the chain will be streamed to `stdout` token by token in this example.
```python
from langchain.chains.llm import LLMChain
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains.conversational_retrieval.prompts import CONDENSE_QUESTION_PROMPT, QA_PROMPT
from langchain.chains.question_answering import load_qa_chain

# A streaming LLM answers the question; a separate, non-streaming LLM condenses it
streaming_llm = OpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)
question_generator = LLMChain(llm=OpenAI(temperature=0), prompt=CONDENSE_QUESTION_PROMPT)
doc_chain = load_qa_chain(streaming_llm, chain_type="stuff", prompt=QA_PROMPT)

qa = ConversationalRetrievalChain(
    retriever=vectorstore.as_retriever(), combine_docs_chain=doc_chain, question_generator=question_generator)
```
```python
chat_history = []
query = "What did the president say about Ketanji Brown Jackson"
result = qa({"question": query, "chat_history": chat_history})
```
```python
chat_history = [(query, result["answer"])]
query = "Did he mention who she succeeded"
result = qa({"question": query, "chat_history": chat_history})
```
## get_chat_history Function
You can also specify a `get_chat_history` function, which can be used to format the chat_history string.
```python
def get_chat_history(inputs) -> str:
    res = []
    for human, ai in inputs:
        res.append(f"Human:{human}\nAI:{ai}")
    return "\n".join(res)

qa = ConversationalRetrievalChain.from_llm(OpenAI(temperature=0), vectorstore.as_retriever(), get_chat_history=get_chat_history)
```
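To see the string this formatter produces, you can call it on a sample history. The answers below are illustrative placeholders; the input format matches the `(human, ai)` tuples used for `chat_history` throughout this page:
```python
# Preview the formatted chat history string
sample_history = [
    ("What did the president say about Ketanji Brown Jackson", "He praised her record."),
    ("Did he mention who she succeeded", "Yes, Justice Breyer."),
]
print(get_chat_history(sample_history))
```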
```python
chat_history = []
query = "What did the president say about Ketanji Brown Jackson"
result = qa({"question": query, "chat_history": chat_history})
```
```python
result['answer']
```