Harrison/update memory docs (#8384)

Co-authored-by: Bagatur <baskaryan@gmail.com>
This commit is contained in:
Harrison Chase 2023-07-27 17:18:19 -07:00 committed by GitHub
parent d7e6770de8
commit 25b8cc7e3d
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
19 changed files with 208 additions and 215 deletions

View File

@ -0,0 +1,17 @@
---
sidebar_position: 1
---
# Chat Messages
:::info
Head to [Integrations](/docs/integrations/memory/) for documentation on built-in memory integrations with 3rd-party databases and tools.
:::
One of the core utility classes underpinning most (if not all) memory modules is the `ChatMessageHistory` class.
This is a super lightweight wrapper which exposes convenience methods for saving Human messages, AI messages, and then fetching them all.
You may want to use this class directly if you are managing memory outside of a chain.
import GetStarted from "@snippets/modules/memory/chat_messages/get_started.mdx"
<GetStarted/>

View File

@ -1,34 +1,62 @@
---
sidebar_position: 3
---
# Memory
🚧 _Docs under construction_ 🚧
Most LLM applications have a conversational interface. An essential component of a conversation is being able to refer to information introduced earlier in the conversation.
At bare minimum, a conversational system should be able to access some window of past messages directly.
A more complex system will need to have a world model that it is constantly updating, which allows it to do things like maintain information about entities and their relationships.
:::info
Head to [Integrations](/docs/integrations/memory/) for documentation on built-in memory integrations with 3rd-party tools.
:::
We call this ability to store information about past interactions "memory".
LangChain provides a lot of utilities for adding memory to a system.
These utilities can be used by themselves or incorporated seamlessly into a chain.
By default, Chains and Agents are stateless,
meaning that they treat each incoming query independently (like the underlying LLMs and chat models themselves).
In some applications, like chatbots, it is essential
to remember previous interactions, both in the short and long-term.
The **Memory** class does exactly that.
A memory system needs to support two basic actions: reading and writing.
Recall that every chain defines some core execution logic that expects certain inputs.
Some of these inputs come directly from the user, but some of these inputs can come from memory.
A chain will interact with its memory system twice in a given run.
1. AFTER receiving the initial user inputs but BEFORE executing the core logic, a chain will READ from its memory system and augment the user inputs.
2. AFTER executing the core logic but BEFORE returning the answer, a chain will WRITE the inputs and outputs of the current run to memory, so that they can be referred to in future runs.
LangChain provides memory components in two forms.
First, LangChain provides helper utilities for managing and manipulating previous chat messages.
These are designed to be modular and useful regardless of how they are used.
Secondly, LangChain provides easy ways to incorporate these utilities into chains.
![memory-diagram](/img/memory_diagram.png)
## Building memory into a system
The two core design decisions in any memory system are:
- How state is stored
- How state is queried
### Storing: List of chat messages
Underlying any memory is a history of all chat interactions.
Even if these are not all used directly, they need to be stored in some form.
One of the key parts of the LangChain memory module is a series of integrations for storing these chat messages,
from in-memory lists to persistent databases.
- [Chat message storage](/docs/modules/memory/chat_messages/): How to work with Chat Messages, and the various integrations offered
### Querying: Data structures and algorithms on top of chat messages
Keeping a list of chat messages is fairly straight-forward.
What is less straight-forward are the data structures and algorithms built on top of chat messages that serve a view of those messages that is most useful.
A very simply memory system might just return the most recent messages each run. A slightly more complex memory system might return a succinct summary of the past K messages.
An even more sophisticated system might extract entities from stored messages and only return information about entities referenced in the current run.
Each application can have different requirements for how memory is queried. The memory module should make it easy to both get started with simple memory systems and write your own custom systems if needed.
- [Memory types](/docs/modules/memory/types/): The various data structures and algorithms that make up the memory types LangChain supports
## Get started
Memory involves keeping a concept of state around throughout a user's interactions with an language model. A user's interactions with a language model are captured in the concept of ChatMessages, so this boils down to ingesting, capturing, transforming and extracting knowledge from a sequence of chat messages. There are many different ways to do this, each of which exists as its own memory type.
In general, for each type of memory there are two ways to understanding using memory. These are the standalone functions which extract information from a sequence of messages, and then there is the way you can use this type of memory in a chain.
Memory can return multiple pieces of information (for example, the most recent N messages and a summary of all previous messages). The returned information can either be a string or a list of messages.
Let's take a look at what Memory actually looks like in LangChain.
Here we'll cover the basics of interacting with an arbitrary memory class.
import GetStarted from "@snippets/modules/memory/get_started.mdx"
<GetStarted/>
## Next steps
And that's it for getting started!
Please see the other sections for walkthroughs of more advanced topics,
like custom memory, multiple memories, and more.

View File

@ -4,6 +4,6 @@ This notebook shows how to use `ConversationBufferMemory`. This memory allows fo
We can first extract it as a string.
import Example from "@snippets/modules/memory/how_to/buffer.mdx"
import Example from "@snippets/modules/memory/types/buffer.mdx"
<Example/>

View File

@ -4,6 +4,6 @@
Let's first explore the basic functionality of this type of memory.
import Example from "@snippets/modules/memory/how_to/buffer_window.mdx"
import Example from "@snippets/modules/memory/types/buffer_window.mdx"
<Example/>

View File

@ -4,6 +4,6 @@ Entity Memory remembers given facts about specific entities in a conversation. I
Let's first walk through using this functionality.
import Example from "@snippets/modules/memory/how_to/entity_summary_memory.mdx"
import Example from "@snippets/modules/memory/types/entity_summary_memory.mdx"
<Example/>

View File

@ -0,0 +1,8 @@
---
sidebar_position: 2
---
# Memory Types
There are many different types of memory.
Each have their own parameters, their own return types, and are useful in different scenarios.
Please see their individual page for more detail on each one.

View File

@ -4,6 +4,6 @@ Conversation summary memory summarizes the conversation as it happens and stores
Let's first explore the basic functionality of this type of memory.
import Example from "@snippets/modules/memory/how_to/summary.mdx"
import Example from "@snippets/modules/memory/types/summary.mdx"
<Example/>

View File

@ -6,6 +6,6 @@ This differs from most of the other Memory classes in that it doesn't explicitly
In this case, the "docs" are previous conversation snippets. This can be useful to refer to relevant pieces of information that the AI was told earlier in the conversation.
import Example from "@snippets/modules/memory/how_to/vectorstore_retriever_memory.mdx"
import Example from "@snippets/modules/memory/types/vectorstore_retriever_memory.mdx"
<Example/>

Binary file not shown.

After

Width:  |  Height:  |  Size: 111 KiB

View File

@ -0,0 +1,23 @@
```python
from langchain.memory import ChatMessageHistory
history = ChatMessageHistory()
history.add_user_message("hi!")
history.add_ai_message("whats up?")
```
```python
history.messages
```
<CodeOutputBlock lang="python">
```
[HumanMessage(content='hi!', additional_kwargs={}),
AIMessage(content='whats up?', additional_kwargs={})]
```
</CodeOutputBlock>

View File

@ -1,55 +1,25 @@
We will walk through the simplest form of memory: "buffer" memory, which just involves keeping a buffer of all prior messages. We will show how to use the modular utility functions here, then show how it can be used in a chain (both returning a string as well as a list of messages).
## ChatMessageHistory
One of the core utility classes underpinning most (if not all) memory modules is the `ChatMessageHistory` class. This is a super lightweight wrapper which exposes convenience methods for saving Human messages, AI messages, and then fetching them all.
You may want to use this class directly if you are managing memory outside of a chain.
```python
from langchain.memory import ChatMessageHistory
history = ChatMessageHistory()
history.add_user_message("hi!")
history.add_ai_message("whats up?")
```
```python
history.messages
```
<CodeOutputBlock lang="python">
```
[HumanMessage(content='hi!', additional_kwargs={}),
AIMessage(content='whats up?', additional_kwargs={})]
```
</CodeOutputBlock>
## ConversationBufferMemory
We now show how to use this simple concept in a chain. We first showcase `ConversationBufferMemory` which is just a wrapper around ChatMessageHistory that extracts the messages in a variable.
We can first extract it as a string.
Let's take a look at how to use ConversationBufferMemory in chains.
ConversationBufferMemory is an extremely simple form of memory that just keeps a list of chat messages in a buffer
and passes those into the prompt template.
```python
from langchain.memory import ConversationBufferMemory
```
```python
memory = ConversationBufferMemory()
memory.chat_memory.add_user_message("hi!")
memory.chat_memory.add_ai_message("whats up?")
```
When using memory in a chain, there are a few key concepts to understand.
Note that here we cover general concepts that are useful for most types of memory.
Each individual memory type may very well have its own parameters and concepts that are necessary to understand.
### What variables get returned from memory
Before going into the chain, various variables are read from memory.
This have specific names which need to align with the variables the chain expects.
You can see what these variables are by calling `memory.load_memory_variables({})`.
Note that the empty dictionary that we pass in is just a placeholder for real variables.
If the memory type you are using is dependent upon the input variables, you may need to pass some in.
```python
memory.load_memory_variables({})
@ -58,199 +28,146 @@ memory.load_memory_variables({})
<CodeOutputBlock lang="python">
```
{'history': 'Human: hi!\nAI: whats up?'}
{'history': "Human: hi!\nAI: whats up?"}
```
</CodeOutputBlock>
We can also get the history as a list of messages
In this case, you can see that `load_memory_variables` returns a single key, `history`.
This means that your chain (and likely your prompt) should expect and input named `history`.
You can usually control this variable through parameters on the memory class.
For example, if you want the memory variables to be returned in the key `chat_history` you can do:
```python
memory = ConversationBufferMemory(memory_key="chat_history")
memory.chat_memory.add_user_message("hi!")
memory.chat_memory.add_ai_message("whats up?")
```
<CodeOutputBlock lang="python">
```
{'chat_history': "Human: hi!\nAI: whats up?"}
```
</CodeOutputBlock>
The parameter name to control these keys may vary per memory type, but it's important to understand that (1) this is controllable, (2) how to control it.
### Whether memory is a string or a list of messages
One of the most common types of memory involves returning a list of chat messages.
These can either be returned as a single string, all concatenated together (useful when they will be passed in LLMs)
or a list of ChatMessages (useful when passed into ChatModels).
By default, they are returned as a single string.
In order to return as a list of messages, you can set `return_messages=True`
```python
memory = ConversationBufferMemory(return_messages=True)
memory.chat_memory.add_user_message("hi!")
memory.chat_memory.add_ai_message("whats up?")
```
```python
memory.load_memory_variables({})
```
<CodeOutputBlock lang="python">
```
{'history': [HumanMessage(content='hi!', additional_kwargs={}),
AIMessage(content='whats up?', additional_kwargs={})]}
{'history': [HumanMessage(content='hi!', additional_kwargs={}, example=False),
AIMessage(content='whats up?', additional_kwargs={}, example=False)]}
```
</CodeOutputBlock>
## Using in a chain
Finally, let's take a look at using this in a chain (setting `verbose=True` so we can see the prompt).
### What keys are saved to memory
Often times chains take in or return multiple input/output keys.
In these cases, how can we know which keys we want to save to the chat message history?
This is generally controllable by `input_key` and `output_key` parameters on the memory types.
These default to None - and if there is only one input/output key it is known to just use that.
However, if there are multiple input/output keys then you MUST specify the name of which one to use
### End to end example
Finally, let's take a look at using this in a chain.
We'll use an LLMChain, and show working with both an LLM and a ChatModel.
#### Using an LLM
```python
from langchain.llms import OpenAI
from langchain.chains import ConversationChain
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory
llm = OpenAI(temperature=0)
conversation = ConversationChain(
# Notice that "chat_history" is present in the prompt template
template = """You are a nice chatbot having a conversation with a human.
Previous conversation:
{chat_history}
New human question: {question}
Response:"""
prompt = PromptTemplate.from_template(template)
# Notice that we need to align the `memory_key`
memory = ConversationBufferMemory(memory_key="chat_history")
conversation = LLMChain(
llm=llm,
prompt=prompt,
verbose=True,
memory=ConversationBufferMemory()
memory=memory
)
```
```python
conversation.predict(input="Hi there!")
```
<CodeOutputBlock lang="python">
# Notice that we just pass in the `question` variables - `chat_history` gets populated by memory
conversation({"question": "hi"})
```
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Current conversation:
Human: Hi there!
AI:
> Finished chain.
" Hi there! It's nice to meet you. How can I help you today?"
```
</CodeOutputBlock>
#### Using a ChatModel
```python
conversation.predict(input="I'm doing well! Just having a conversation with an AI.")
```
<CodeOutputBlock lang="python">
```
from langchain.chat_models import ChatOpenAI
from langchain.prompts import (
ChatPromptTemplate,
MessagesPlaceholder,
SystemMessagePromptTemplate,
HumanMessagePromptTemplate,
)
from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Current conversation:
Human: Hi there!
AI: Hi there! It's nice to meet you. How can I help you today?
Human: I'm doing well! Just having a conversation with an AI.
AI:
> Finished chain.
" That's great! It's always nice to have a conversation with someone new. What would you like to talk about?"
```
</CodeOutputBlock>
```python
conversation.predict(input="Tell me about yourself.")
```
<CodeOutputBlock lang="python">
```
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Current conversation:
Human: Hi there!
AI: Hi there! It's nice to meet you. How can I help you today?
Human: I'm doing well! Just having a conversation with an AI.
AI: That's great! It's always nice to have a conversation with someone new. What would you like to talk about?
Human: Tell me about yourself.
AI:
> Finished chain.
" Sure! I'm an AI created to help people with their everyday tasks. I'm programmed to understand natural language and provide helpful information. I'm also constantly learning and updating my knowledge base so I can provide more accurate and helpful answers."
```
</CodeOutputBlock>
## Saving Message History
You may often have to save messages, and then load them to use again. This can be done easily by first converting the messages to normal python dictionaries, saving those (as json or something) and then loading those. Here is an example of doing that.
```python
import json
from langchain.memory import ChatMessageHistory
from langchain.schema import messages_from_dict, messages_to_dict
history = ChatMessageHistory()
history.add_user_message("hi!")
history.add_ai_message("whats up?")
llm = ChatOpenAI()
prompt = ChatPromptTemplate(
messages=[
SystemMessagePromptTemplate.from_template(
"You are a nice chatbot having a conversation with a human."
),
# The `variable_name` here is what must align with memory
MessagesPlaceholder(variable_name="chat_history"),
HumanMessagePromptTemplate.from_template("{question}")
]
)
# Notice that we `return_messages=True` to fit into the MessagesPlaceholder
# Notice that `"chat_history"` aligns with the MessagesPlaceholder name.
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
conversation = LLMChain(
llm=llm,
prompt=prompt,
verbose=True,
memory=memory
)
```
```python
dicts = messages_to_dict(history.messages)
# Notice that we just pass in the `question` variables - `chat_history` gets populated by memory
conversation({"question": "hi"})
```
```python
dicts
```
<CodeOutputBlock lang="python">
```
[{'type': 'human', 'data': {'content': 'hi!', 'additional_kwargs': {}}},
{'type': 'ai', 'data': {'content': 'whats up?', 'additional_kwargs': {}}}]
```
</CodeOutputBlock>
```python
new_messages = messages_from_dict(dicts)
```
```python
new_messages
```
<CodeOutputBlock lang="python">
```
[HumanMessage(content='hi!', additional_kwargs={}),
AIMessage(content='whats up?', additional_kwargs={})]
```
</CodeOutputBlock>
And that's it for the getting started! There are plenty of different types of memory, check out our examples to see them all