mirror of https://github.com/hwchase17/langchain.git, synced 2025-09-12 12:59:07 +00:00
docs: Upgrade examples with RunnableWithMessageHistory to langgraph memory (#26855)
This PR updates the documentation examples that used RunnableWithMessageHistory to show how to achieve the same implementation with langgraph memory.

Some of the underlying PRs (not all of them):

- docs[patch]: update chatbot tutorial and migration guide (#26780)
- docs[patch]: update chatbot memory how-to (#26790)
- docs[patch]: update chatbot tools how-to (#26816)
- docs: update chat history in rag how-to (#26821)
- docs: update trim messages notebook (#26793)
- docs: clean up imports in how to guide for rag qa with chat history (#26825)
- docs[patch]: update conversational rag tutorial (#26814)

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Vadym Barda <vadym@langchain.dev>
Co-authored-by: mercyspirit <ziying.qiu@gmail.com>
Co-authored-by: aqiu7 <aqiu7@gatech.edu>
Co-authored-by: John <43506685+Coniferish@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Subhrajyoty Roy <subhrajyotyroy@gmail.com>
Co-authored-by: Rajendra Kadam <raj.725@outlook.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
Co-authored-by: Devin Gaffney <itsme@devingaffney.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
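For orientation, here is a minimal sketch of the langgraph memory pattern the updated examples move to in place of RunnableWithMessageHistory (the model choice and thread id are illustrative, not taken from any one guide):

    from langchain_core.messages import HumanMessage
    from langchain_openai import ChatOpenAI
    from langgraph.checkpoint.memory import MemorySaver
    from langgraph.graph import START, MessagesState, StateGraph

    model = ChatOpenAI(model="gpt-4o-mini")

    def call_model(state: MessagesState):
        # MessagesState accumulates the conversation; the checkpointer
        # persists it across invocations of the compiled graph.
        return {"messages": [model.invoke(state["messages"])]}

    builder = StateGraph(MessagesState)
    builder.add_node("model", call_model)
    builder.add_edge(START, "model")
    app = builder.compile(checkpointer=MemorySaver())

    # Each thread_id keys an independent conversation history.
    config = {"configurable": {"thread_id": "example-thread"}}
    app.invoke({"messages": [HumanMessage("hi! I'm Bob")]}, config)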
@@ -581,12 +581,38 @@ def trim_messages(
) -> list[BaseMessage]:
    """Trim messages to be below a token count.

    trim_messages can be used to reduce the size of a chat history to a specified
    token count or a specified message count.

    In either case, if the trimmed chat history is passed directly back into a chat
    model, the resulting chat history should usually satisfy the following properties:

    1. The resulting chat history should be valid. Most chat models expect that chat
       history starts with either (1) a `HumanMessage` or (2) a `SystemMessage` followed
       by a `HumanMessage`. To achieve this, set `start_on="human"`.
       In addition, a `ToolMessage` can generally only appear after an `AIMessage`
       that made the corresponding tool call, as shown in the sketch after this list.
       See the following link for more information about messages:
       https://python.langchain.com/docs/concepts/#messages
    2. It includes recent messages and drops old messages in the chat history.
       To achieve this, set `strategy="last"`.
    3. Usually, the new chat history should include the `SystemMessage` if it was
       present in the original chat history, since the `SystemMessage` contains
       special instructions for the chat model. The `SystemMessage` is almost always
       the first message in the history if present. To achieve this, set
       `include_system=True`.
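
    A hedged sketch of the tool-message ordering constraint in point 1; the tool
    name and call id below are illustrative, not part of this function's API:

    .. code-block:: python

        from langchain_core.messages import AIMessage, HumanMessage, ToolMessage

        # Valid ordering: the ToolMessage responds to a tool call made by the
        # immediately preceding AIMessage.
        valid_history = [
            HumanMessage("what's the weather in sf?"),
            AIMessage(
                "",
                tool_calls=[{"name": "get_weather", "args": {"city": "sf"}, "id": "call_1"}],
            ),
            ToolMessage("60 degrees and foggy", tool_call_id="call_1"),
        ]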

    **Note:** The examples below show how to configure `trim_messages` to achieve
    behavior consistent with the above properties.

    Args:
        messages: Sequence of Message-like objects to trim.
        max_tokens: Max token count of trimmed messages.
        token_counter: Function or LLM for counting tokens in a BaseMessage or a list
            of BaseMessage. If a BaseLanguageModel is passed in, then
            BaseLanguageModel.get_num_tokens_from_messages() will be used.
            Set to `len` to count the number of **messages** in the chat history.
        strategy: Strategy for trimming.
            - "first": Keep the first <= n_count tokens of the messages.
            - "last": Keep the last <= n_count tokens of the messages.
@@ -633,11 +659,97 @@ def trim_messages(
        ``strategy`` is specified.

    Example:
        Trim chat history based on token count, keeping the SystemMessage if
        present, and ensuring that the chat history starts with a HumanMessage
        (or a SystemMessage followed by a HumanMessage).

        .. code-block:: python

            from typing import List

            from langchain_core.messages import (
                AIMessage,
                HumanMessage,
                BaseMessage,
                SystemMessage,
                trim_messages,
            )
            from langchain_openai import ChatOpenAI

            messages = [
                SystemMessage("you're a good assistant, you always respond with a joke."),
                HumanMessage("i wonder why it's called langchain"),
                AIMessage(
                    'Well, I guess they thought "WordRope" and "SentenceString" just didn\'t have the same ring to it!'
                ),
                HumanMessage("and who is harrison chasing anyways"),
                AIMessage(
                    "Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!"
                ),
                HumanMessage("what do you call a speechless parrot"),
            ]

            trim_messages(
                messages,
                max_tokens=45,
                strategy="last",
                token_counter=ChatOpenAI(model="gpt-4o"),
                # Most chat models expect that chat history starts with either:
                # (1) a HumanMessage or
                # (2) a SystemMessage followed by a HumanMessage
                start_on="human",
                # Usually, we want to keep the SystemMessage
                # if it's present in the original history.
                # The SystemMessage has special instructions for the model.
                include_system=True,
                allow_partial=False,
            )

        .. code-block:: python

            [
                SystemMessage(content="you're a good assistant, you always respond with a joke."),
                HumanMessage(content='what do you call a speechless parrot'),
            ]

        Trim chat history based on message count, keeping the SystemMessage if
        present, and ensuring that the chat history starts with a HumanMessage
        (or a SystemMessage followed by a HumanMessage).

        .. code-block:: python

            trim_messages(
                messages,
                # When `len` is passed in as the token counter function,
                # max_tokens counts the number of messages in the chat history.
                max_tokens=4,
                strategy="last",
                token_counter=len,
                # Most chat models expect that chat history starts with either:
                # (1) a HumanMessage or
                # (2) a SystemMessage followed by a HumanMessage
                start_on="human",
                # Usually, we want to keep the SystemMessage
                # if it's present in the original history.
                # The SystemMessage has special instructions for the model.
                include_system=True,
                allow_partial=False,
            )

        .. code-block:: python

            [
                SystemMessage(content="you're a good assistant, you always respond with a joke."),
                HumanMessage(content='and who is harrison chasing anyways'),
                AIMessage(content="Hmmm let me think.\n\nWhy, he's probably chasing after the last cup of coffee in the office!"),
                HumanMessage(content='what do you call a speechless parrot'),
            ]

        Trim chat history using a custom token counter function that counts the
        number of tokens in each message.

        .. code-block:: python

            messages = [
                SystemMessage("This is a 4 token text. The full message is 10 tokens."),
@@ -670,18 +782,6 @@ def trim_messages(
                    count += default_msg_prefix_len + len(msg.content) * default_content_len + default_msg_suffix_len
                return count

        First 30 tokens, not allowing partial messages:

        .. code-block:: python

            trim_messages(messages, max_tokens=30, token_counter=dummy_token_counter, strategy="first")

        .. code-block:: python

            [
                SystemMessage("This is a 4 token text. The full message is 10 tokens."),
                HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="first"),
            ]

        First 30 tokens, allowing partial messages:

        .. code-block:: python
@@ -700,108 +800,6 @@ def trim_messages(
                HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="first"),
                AIMessage([{"type": "text", "text": "This is the FIRST 4 token block."}], id="second"),
            ]

        First 30 tokens, allowing partial messages, ending on a HumanMessage:

        .. code-block:: python

            trim_messages(
                messages,
                max_tokens=30,
                token_counter=dummy_token_counter,
                strategy="first",
                allow_partial=True,
                end_on="human",
            )

        .. code-block:: python

            [
                SystemMessage("This is a 4 token text. The full message is 10 tokens."),
                HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="first"),
            ]

        Last 30 tokens, including system message, not allowing partial messages:

        .. code-block:: python

            trim_messages(messages, max_tokens=30, include_system=True, token_counter=dummy_token_counter, strategy="last")

        .. code-block:: python

            [
                SystemMessage("This is a 4 token text. The full message is 10 tokens."),
                HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="third"),
                AIMessage("This is a 4 token text. The full message is 10 tokens.", id="fourth"),
            ]

        Last 40 tokens, including system message, allowing partial messages:

        .. code-block:: python

            trim_messages(
                messages,
                max_tokens=40,
                token_counter=dummy_token_counter,
                strategy="last",
                allow_partial=True,
                include_system=True,
            )

        .. code-block:: python

            [
                SystemMessage("This is a 4 token text. The full message is 10 tokens."),
                AIMessage(
                    [{"type": "text", "text": "This is the FIRST 4 token block."}],
                    id="second",
                ),
                HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="third"),
                AIMessage("This is a 4 token text. The full message is 10 tokens.", id="fourth"),
            ]

        Last 30 tokens, including system message, allowing partial messages, ending on a HumanMessage:

        .. code-block:: python

            trim_messages(
                messages,
                max_tokens=30,
                token_counter=dummy_token_counter,
                strategy="last",
                end_on="human",
                include_system=True,
                allow_partial=True,
            )

        .. code-block:: python

            [
                SystemMessage("This is a 4 token text. The full message is 10 tokens."),
                AIMessage(
                    [{"type": "text", "text": "This is the FIRST 4 token block."}],
                    id="second",
                ),
                HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="third"),
            ]

        Last 40 tokens, including system message, allowing partial messages, starting on a HumanMessage:

        .. code-block:: python

            trim_messages(
                messages,
                max_tokens=40,
                token_counter=dummy_token_counter,
                strategy="last",
                include_system=True,
                allow_partial=True,
                start_on="human",
            )

        .. code-block:: python

            [
                SystemMessage("This is a 4 token text. The full message is 10 tokens."),
                HumanMessage("This is a 4 token text. The full message is 10 tokens.", id="third"),
                AIMessage("This is a 4 token text. The full message is 10 tokens.", id="fourth"),
            ]
    """  # noqa: E501

    if start_on and strategy == "first":