docs: Upgrade examples with RunnableWithMessageHistory to langgraph memory (#26855)

This PR updates the documentation examples that previously used
RunnableWithMessageHistory to show how to achieve the same
functionality with langgraph memory.
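
For reference, the langgraph memory pattern these guides converge on looks
roughly like the following (a minimal sketch, assuming langgraph's in-memory
checkpointer and an OpenAI chat model; the model name and thread id here are
illustrative, not taken from this PR):

.. code-block:: python

    from langchain_core.messages import HumanMessage
    from langchain_openai import ChatOpenAI
    from langgraph.checkpoint.memory import MemorySaver
    from langgraph.graph import START, MessagesState, StateGraph

    model = ChatOpenAI(model="gpt-4o-mini")

    def call_model(state: MessagesState):
        # The checkpointer persists state["messages"] between invocations,
        # replacing the external chat-history store that
        # RunnableWithMessageHistory required.
        return {"messages": [model.invoke(state["messages"])]}

    workflow = StateGraph(state_schema=MessagesState)
    workflow.add_node("model", call_model)
    workflow.add_edge(START, "model")

    # In-memory checkpointer; each thread_id keys a separate chat history.
    app = workflow.compile(checkpointer=MemorySaver())

    config = {"configurable": {"thread_id": "abc123"}}
    app.invoke({"messages": [HumanMessage("hi, I'm Bob")]}, config)
    app.invoke({"messages": [HumanMessage("what's my name?")]}, config)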

Some of the underlying PRs (not all of them):

- docs[patch]: update chatbot tutorial and migration guide (#26780)
- docs[patch]: update chatbot memory how-to (#26790)
- docs[patch]: update chatbot tools how-to (#26816)
- docs: update chat history in rag how-to (#26821)
- docs: update trim messages notebook (#26793)
- docs: clean up imports in how to guide for rag qa with chat history
(#26825)
- docs[patch]: update conversational rag tutorial (#26814)

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Vadym Barda <vadym@langchain.dev>
Co-authored-by: mercyspirit <ziying.qiu@gmail.com>
Co-authored-by: aqiu7 <aqiu7@gatech.edu>
Co-authored-by: John <43506685+Coniferish@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Subhrajyoty Roy <subhrajyotyroy@gmail.com>
Co-authored-by: Rajendra Kadam <raj.725@outlook.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
Co-authored-by: Devin Gaffney <itsme@devingaffney.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Author: Eugene Yurtsev
Date: 2024-09-27 16:04:30 -04:00
Committed by: GitHub
Parent: 44eddd39d6
Commit: de0b48c41a
42 changed files with 2326 additions and 2418 deletions


@@ -581,12 +581,38 @@ def trim_messages(
) -> list[BaseMessage]:
"""Trim messages to be below a token count.
trim_messages can be used to reduce the size of a chat history to a specified token
count or a specified message count.

In either case, if the trimmed chat history is passed directly back into a chat
model, the resulting chat history should usually satisfy the following properties:

1. The resulting chat history should be valid. Most chat models expect that chat
history starts with either (1) a `HumanMessage` or (2) a `SystemMessage` followed
by a `HumanMessage`. To achieve this, set `start_on="human"`.
In addition, a `ToolMessage` can generally only appear after an `AIMessage`
that involved a tool call.
Please see the following link for more information about messages:
https://python.langchain.com/docs/concepts/#messages
2. It includes recent messages and drops old messages in the chat history.
To achieve this, set `strategy="last"`.
3. Usually, the new chat history should include the `SystemMessage` if it
was present in the original chat history, since the `SystemMessage` contains
special instructions for the chat model. If present, the `SystemMessage` is
almost always the first message in the history. To achieve this, set
`include_system=True`.

**Note** The examples below show how to configure `trim_messages` to achieve
behavior consistent with the above properties.

Args:
messages: Sequence of Message-like objects to trim.
max_tokens: Max token count of trimmed messages.
token_counter: Function or LLM for counting tokens in a BaseMessage or a list of
BaseMessage. If a BaseLanguageModel is passed in then
BaseLanguageModel.get_num_tokens_from_messages() will be used.
Set to `len` to count the number of **messages** in the chat history.
strategy: Strategy for trimming.
- "first": Keep the first <= max_tokens tokens of the messages.
- "last": Keep the last <= max_tokens tokens of the messages.
@@ -633,11 +659,97 @@ def trim_messages(
``strategy`` is specified.
Example:
Trim chat history based on token count, keeping the SystemMessage if
present, and ensuring that the chat history starts with a HumanMessage
(or a SystemMessage followed by a HumanMessage).
.. code-block:: python
from typing import List
from langchain_core.messages import (
AIMessage,
BaseMessage,
HumanMessage,
SystemMessage,
trim_messages,
)
from langchain_openai import ChatOpenAI
messages = [
SystemMessage("you're a good assistant, you always respond with a joke."),
HumanMessage("i wonder why it's called langchain"),
AIMessage(
'Well, I guess they thought "WordRope" and "SentenceString" just didn\'t have the same ring to it!'
),
HumanMessage("and who is harrison chasing anyways"),
AIMessage(
"Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!"
),
HumanMessage("what do you call a speechless parrot"),
]
trim_messages(
messages,
max_tokens=45,
strategy="last",
token_counter=ChatOpenAI(model="gpt-4o"),
# Most chat models expect that chat history starts with either:
# (1) a HumanMessage or
# (2) a SystemMessage followed by a HumanMessage
start_on="human",
# Usually, we want to keep the SystemMessage
# if it's present in the original history.
# The SystemMessage has special instructions for the model.
include_system=True,
allow_partial=False,
)
.. code-block:: python
[
SystemMessage(content="you're a good assistant, you always respond with a joke."),
HumanMessage(content='what do you call a speechless parrot'),
]
Trim chat history based on message count, keeping the SystemMessage if
present, and ensuring that the chat history starts with a HumanMessage
(or a SystemMessage followed by a HumanMessage).
.. code-block:: python
trim_messages(
messages,
# When `len` is passed in as the token counter function,
# max_tokens will count the number of messages in the chat history.
max_tokens=4,
strategy="last",
token_counter=len,
# Most chat models expect that chat history starts with either:
# (1) a HumanMessage or
# (2) a SystemMessage followed by a HumanMessage
start_on="human",
# Usually, we want to keep the SystemMessage
# if it's present in the original history.
# The SystemMessage has special instructions for the model.
include_system=True,
allow_partial=False,
)
.. code-block:: python
[
SystemMessage(content="you're a good assistant, you always respond with a joke."),
HumanMessage(content='and who is harrison chasing anyways'),
AIMessage(content="Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!"),
HumanMessage(content='what do you call a speechless parrot'),
]
Trim chat history using a custom token counter function that counts the
number of tokens in each message.
.. code-block:: python
messages = [
SystemMessage("This is a 4 token text. The full message is 10 tokens."),
@@ -670,18 +782,6 @@ def trim_messages(
count += default_msg_prefix_len + len(msg.content) * default_content_len + default_msg_suffix_len
return count
First 30 tokens, not allowing partial messages:
.. code-block:: python
trim_messages(messages, max_tokens=30, token_counter=dummy_token_counter, strategy="first")
.. code-block:: python
[
SystemMessage("This is a 4 token text. The full message is 10 tokens."),
HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="first"),
]
First 30 tokens, allowing partial messages:
.. code-block:: python
@@ -700,108 +800,6 @@ def trim_messages(
HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="first"),
AIMessage([{"type": "text", "text": "This is the FIRST 4 token block."}], id="second"),
]
First 30 tokens, allowing partial messages, ending on a HumanMessage:
.. code-block:: python
trim_messages(
messages,
max_tokens=30,
token_counter=dummy_token_counter,
strategy="first"
allow_partial=True,
end_on="human",
)
.. code-block:: python
[
SystemMessage("This is a 4 token text. The full message is 10 tokens."),
HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="first"),
]
Last 30 tokens, including system message, not allowing partial messages:
.. code-block:: python
trim_messages(messages, max_tokens=30, include_system=True, token_counter=dummy_token_counter, strategy="last")
.. code-block:: python
[
SystemMessage("This is a 4 token text. The full message is 10 tokens."),
HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="third"),
AIMessage("This is a 4 token text. The full message is 10 tokens.", id="fourth"),
]
Last 40 tokens, including system message, allowing partial messages:
.. code-block:: python
trim_messages(
messages,
max_tokens=40,
token_counter=dummy_token_counter,
strategy="last",
allow_partial=True,
include_system=True
)
.. code-block:: python
[
SystemMessage("This is a 4 token text. The full message is 10 tokens."),
AIMessage(
[{"type": "text", "text": "This is the FIRST 4 token block."},],
id="second",
),
HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="third"),
AIMessage("This is a 4 token text. The full message is 10 tokens.", id="fourth"),
]
Last 30 tokens, including system message, allowing partial messages, ending on a HumanMessage:
.. code-block:: python
trim_messages(
messages,
max_tokens=30,
token_counter=dummy_token_counter,
strategy="last",
end_on="human",
include_system=True,
allow_partial=True,
)
.. code-block:: python
[
SystemMessage("This is a 4 token text. The full message is 10 tokens."),
AIMessage(
[{"type": "text", "text": "This is the FIRST 4 token block."},],
id="second",
),
HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="third"),
]
Last 40 tokens, including system message, allowing partial messages, starting on a HumanMessage:
.. code-block:: python
trim_messages(
messages,
max_tokens=40,
token_counter=dummy_token_counter,
strategy="last",
include_system=True,
allow_partial=True,
start_on="human"
)
.. code-block:: python
[
SystemMessage("This is a 4 token text. The full message is 10 tokens."),
HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="third"),
AIMessage("This is a 4 token text. The full message is 10 tokens.", id="fourth"),
]
""" # noqa: E501
if start_on and strategy == "first":