docs[patch]: update chatbot memory how-to (#26790)

2025-08-14 07:07:34 +00:00 · 2024-09-24 10:53:39 -04:00 · 2024-09-24 10:53:39 -04:00 · e8ce5cde99
commit e8ce5cde99
parent a7aad27cba
1 changed files with 239 additions and 483 deletions
--- a/docs/docs/how_to/chatbots_memory.ipynb
+++ b/docs/docs/how_to/chatbots_memory.ipynb
@ -23,6 +23,14 @@
    "\n",
    "We'll go into more detail on a few techniques below!\n",
    "\n",
+    ":::{.callout-note}\n",
+    "\n",
+    "This how-to guide previously built a chatbot using [RunnableWithMessageHistory](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.history.RunnableWithMessageHistory.html). You can access this version of the tutorial in the [v0.2 docs](https://python.langchain.com/v0.2/docs/how_to/chatbots_memory/).\n",
+    "\n",
+    "The LangGraph implementation offers a number of advantages over `RunnableWithMessageHistory`, including the ability to persist arbitrary components of an application's state (instead of only messages).\n",
+    "\n",
+    ":::\n",
+    "\n",
    "## Setup\n",
    "\n",
    "You'll need to install a few packages, and have your OpenAI API key set as an environment variable named `OPENAI_API_KEY`:"
@ -33,15 +41,6 @@
   "execution_count": 1,
   "metadata": {},
   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "\u001b[33mWARNING: You are using pip version 22.0.4; however, version 23.3.2 is available.\n",
-      "You should consider upgrading via the '/Users/jacoblee/.pyenv/versions/3.10.5/bin/python -m pip install --upgrade pip' command.\u001b[0m\u001b[33m\n",
-      "\u001b[0mNote: you may need to restart the kernel to use updated packages.\n"
-     ]
-    },
    {
     "data": {
      "text/plain": [
@ -54,12 +53,13 @@
    }
   ],
   "source": [
-    "%pip install --upgrade --quiet langchain langchain-openai\n",
+    "%pip install --upgrade --quiet langchain langchain-openai langgraph\n",
    "\n",
-    "# Set env var OPENAI_API_KEY or load from a .env file:\n",
-    "import dotenv\n",
+    "import getpass\n",
+    "import os\n",
    "\n",
-    "dotenv.load_dotenv()"
+    "if not os.environ.get(\"OPENAI_API_KEY\"):\n",
+    "    os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")"
   ]
  },
  {
@ -71,13 +71,13 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_openai import ChatOpenAI\n",
    "\n",
-    "chat = ChatOpenAI(model=\"gpt-4o-mini\")"
+    "model = ChatOpenAI(model=\"gpt-4o-mini\")"
   ]
  },
  {
@ -98,34 +98,33 @@
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      "I said \"J'adore la programmation,\" which means \"I love programming\" in French.\n"
+      "I translated the sentence \"I love programming\" into French, which is \"J'adore la programmation.\"\n"
     ]
    }
   ],
   "source": [
-    "from langchain_core.prompts import ChatPromptTemplate\n",
+    "from langchain_core.messages import AIMessage, HumanMessage, SystemMessage\n",
+    "from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n",
    "\n",
    "prompt = ChatPromptTemplate.from_messages(\n",
    "    [\n",
-    "        (\n",
-    "            \"system\",\n",
-    "            \"You are a helpful assistant. Answer all questions to the best of your ability.\",\n",
+    "        SystemMessage(\n",
+    "            content=\"You are a helpful assistant. Answer all questions to the best of your ability.\"\n",
    "        ),\n",
-    "        (\"placeholder\", \"{messages}\"),\n",
+    "        MessagesPlaceholder(variable_name=\"messages\"),\n",
    "    ]\n",
    ")\n",
    "\n",
-    "chain = prompt | chat\n",
+    "chain = prompt | model\n",
    "\n",
    "ai_msg = chain.invoke(\n",
    "    {\n",
    "        \"messages\": [\n",
-    "            (\n",
-    "                \"human\",\n",
-    "                \"Translate this sentence from English to French: I love programming.\",\n",
+    "            HumanMessage(\n",
+    "                content=\"Translate this sentence from English to French: I love programming.\"\n",
    "            ),\n",
-    "            (\"ai\", \"J'adore la programmation.\"),\n",
-    "            (\"human\", \"What did you just say?\"),\n",
+    "            AIMessage(content=\"J'adore la programmation.\"),\n",
+    "            HumanMessage(content=\"What did you just say?\"),\n",
    "        ],\n",
    "    }\n",
    ")\n",
@ -136,51 +135,57 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "We can see that by passing the previous conversation into a chain, it can use it as context to answer questions. This is the basic concept underpinning chatbot memory - the rest of the guide will demonstrate convenient techniques for passing or reformatting messages.\n",
-    "\n",
-    "## Chat history\n",
-    "\n",
-    "It's perfectly fine to store and pass messages directly as an array, but we can use LangChain's built-in [message history class](https://python.langchain.com/api_reference/langchain/index.html#module-langchain.memory) to store and load messages as well. Instances of this class are responsible for storing and loading chat messages from persistent storage. LangChain integrates with many providers - you can see a [list of integrations here](/docs/integrations/memory) - but for this demo we will use an ephemeral demo class.\n",
-    "\n",
-    "Here's an example of the API:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "[HumanMessage(content='Translate this sentence from English to French: I love programming.'),\n",
-       " AIMessage(content=\"J'adore la programmation.\")]"
-      ]
-     },
-     "execution_count": 4,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "from langchain_community.chat_message_histories import ChatMessageHistory\n",
-    "\n",
-    "demo_ephemeral_chat_history = ChatMessageHistory()\n",
-    "\n",
-    "demo_ephemeral_chat_history.add_user_message(\n",
-    "    \"Translate this sentence from English to French: I love programming.\"\n",
-    ")\n",
-    "\n",
-    "demo_ephemeral_chat_history.add_ai_message(\"J'adore la programmation.\")\n",
-    "\n",
-    "demo_ephemeral_chat_history.messages"
+    "We can see that by passing the previous conversation into a chain, it can use it as context to answer questions. This is the basic concept underpinning chatbot memory - the rest of the guide will demonstrate convenient techniques for passing or reformatting messages."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "We can use it directly to store conversation turns for our chain:"
+    "## Automatic history management\n",
+    "\n",
+    "The previous examples pass messages to the chain (and model) explicitly. This is a completely acceptable approach, but it does require external management of new messages. LangChain also provides a way to build applications that have memory using LangGraph's [persistence](https://langchain-ai.github.io/langgraph/concepts/persistence/). You can [enable persistence](https://langchain-ai.github.io/langgraph/how-tos/persistence/) in LangGraph applications by providing a `checkpointer` when compiling the graph."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langgraph.checkpoint.memory import MemorySaver\n",
+    "from langgraph.graph import START, MessagesState, StateGraph\n",
+    "\n",
+    "workflow = StateGraph(state_schema=MessagesState)\n",
+    "\n",
+    "\n",
+    "# Define the function that calls the model\n",
+    "def call_model(state: MessagesState):\n",
+    "    system_prompt = (\n",
+    "        \"You are a helpful assistant. \"\n",
+    "        \"Answer all questions to the best of your ability.\"\n",
+    "    )\n",
+    "    messages = [SystemMessage(content=system_prompt)] + state[\"messages\"]\n",
+    "    response = model.invoke(messages)\n",
+    "    return {\"messages\": response}\n",
+    "\n",
+    "\n",
+    "# Define the node and edge\n",
+    "workflow.add_node(\"model\", call_model)\n",
+    "workflow.add_edge(START, \"model\")\n",
+    "\n",
+    "# Add simple in-memory checkpointer\n",
+    "# highlight-start\n",
+    "memory = MemorySaver()\n",
+    "app = workflow.compile(checkpointer=memory)\n",
+    "# highlight-end"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    " We'll pass the latest input to the conversation here and let the LangGraph keep track of the conversation history using the checkpointer:"
   ]
  },
  {
@ -191,7 +196,8 @@
    {
     "data": {
      "text/plain": [
-       "AIMessage(content='You just asked me to translate the sentence \"I love programming\" from English to French.', response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 61, 'total_tokens': 79}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5cbb21c2-9c30-4031-8ea8-bfc497989535-0', usage_metadata={'input_tokens': 61, 'output_tokens': 18, 'total_tokens': 79})"
+       "{'messages': [HumanMessage(content='Translate this sentence from English to French: I love programming.', additional_kwargs={}, response_metadata={}, id='200f88bb-936a-4877-990c-8b4112d82cfe'),\n",
+       "  AIMessage(content='The translation of \"I love programming\" in French is \"J\\'aime programmer.\"', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 16, 'prompt_tokens': 39, 'total_tokens': 55, 'completion_tokens_details': {'reasoning_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_1bb46167f9', 'finish_reason': 'stop', 'logprobs': None}, id='run-d4ebcdcf-9a60-4471-ad8d-96169f614ada-0', usage_metadata={'input_tokens': 39, 'output_tokens': 16, 'total_tokens': 55})]}"
      ]
     },
     "execution_count": 5,
@ -200,159 +206,35 @@
    }
   ],
   "source": [
-    "demo_ephemeral_chat_history = ChatMessageHistory()\n",
-    "\n",
-    "input1 = \"Translate this sentence from English to French: I love programming.\"\n",
-    "\n",
-    "demo_ephemeral_chat_history.add_user_message(input1)\n",
-    "\n",
-    "response = chain.invoke(\n",
-    "    {\n",
-    "        \"messages\": demo_ephemeral_chat_history.messages,\n",
-    "    }\n",
-    ")\n",
-    "\n",
-    "demo_ephemeral_chat_history.add_ai_message(response)\n",
-    "\n",
-    "input2 = \"What did I just ask you?\"\n",
-    "\n",
-    "demo_ephemeral_chat_history.add_user_message(input2)\n",
-    "\n",
-    "chain.invoke(\n",
-    "    {\n",
-    "        \"messages\": demo_ephemeral_chat_history.messages,\n",
-    "    }\n",
+    "app.invoke(\n",
+    "    {\"messages\": [HumanMessage(content=\"Translate to French: I love programming.\")]},\n",
+    "    config={\"configurable\": {\"thread_id\": \"1\"}},\n",
    ")"
   ]
  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Automatic history management\n",
-    "\n",
-    "The previous examples pass messages to the chain explicitly. This is a completely acceptable approach, but it does require external management of new messages. LangChain also includes an wrapper for LCEL chains that can handle this process automatically called `RunnableWithMessageHistory`.\n",
-    "\n",
-    "To show how it works, let's slightly modify the above prompt to take a final `input` variable that populates a `HumanMessage` template after the chat history. This means that we will expect a `chat_history` parameter that contains all messages BEFORE the current messages instead of all messages:"
-   ]
-  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
-   "outputs": [],
-   "source": [
-    "prompt = ChatPromptTemplate.from_messages(\n",
-    "    [\n",
-    "        (\n",
-    "            \"system\",\n",
-    "            \"You are a helpful assistant. Answer all questions to the best of your ability.\",\n",
-    "        ),\n",
-    "        (\"placeholder\", \"{chat_history}\"),\n",
-    "        (\"human\", \"{input}\"),\n",
-    "    ]\n",
-    ")\n",
-    "\n",
-    "chain = prompt | chat"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    " We'll pass the latest input to the conversation here and let the `RunnableWithMessageHistory` class wrap our chain and do the work of appending that `input` variable to the chat history.\n",
-    " \n",
-    " Next, let's declare our wrapped chain:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from langchain_core.runnables.history import RunnableWithMessageHistory\n",
-    "\n",
-    "demo_ephemeral_chat_history_for_chain = ChatMessageHistory()\n",
-    "\n",
-    "chain_with_message_history = RunnableWithMessageHistory(\n",
-    "    chain,\n",
-    "    lambda session_id: demo_ephemeral_chat_history_for_chain,\n",
-    "    input_messages_key=\"input\",\n",
-    "    history_messages_key=\"chat_history\",\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "This class takes a few parameters in addition to the chain that we want to wrap:\n",
-    "\n",
-    "- A factory function that returns a message history for a given session id. This allows your chain to handle multiple users at once by loading different messages for different conversations.\n",
-    "- An `input_messages_key` that specifies which part of the input should be tracked and stored in the chat history. In this example, we want to track the string passed in as `input`.\n",
-    "- A `history_messages_key` that specifies what the previous messages should be injected into the prompt as. Our prompt has a `MessagesPlaceholder` named `chat_history`, so we specify this property to match.\n",
-    "- (For chains with multiple outputs) an `output_messages_key` which specifies which output to store as history. This is the inverse of `input_messages_key`.\n",
-    "\n",
-    "We can invoke this new chain as normal, with an additional `configurable` field that specifies the particular `session_id` to pass to the factory function. This is unused for the demo, but in real-world chains, you'll want to return a chat history corresponding to the passed session:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "metadata": {},
   "outputs": [
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Parent run dc4e2f79-4bcd-4a36-9506-55ace9040588 not found for run 34b5773e-3ced-46a6-8daf-4d464c15c940. Treating as a root run.\n"
-     ]
-    },
    {
     "data": {
      "text/plain": [
-       "AIMessage(content='\"J\\'adore la programmation.\"', response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 39, 'total_tokens': 48}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-648b0822-b0bb-47a2-8e7d-7d34744be8f2-0', usage_metadata={'input_tokens': 39, 'output_tokens': 9, 'total_tokens': 48})"
+       "{'messages': [HumanMessage(content='Translate this sentence from English to French: I love programming.', additional_kwargs={}, response_metadata={}, id='200f88bb-936a-4877-990c-8b4112d82cfe'),\n",
+       "  AIMessage(content='The translation of \"I love programming\" in French is \"J\\'aime programmer.\"', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 16, 'prompt_tokens': 39, 'total_tokens': 55, 'completion_tokens_details': {'reasoning_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_1bb46167f9', 'finish_reason': 'stop', 'logprobs': None}, id='run-d4ebcdcf-9a60-4471-ad8d-96169f614ada-0', usage_metadata={'input_tokens': 39, 'output_tokens': 16, 'total_tokens': 55}),\n",
+       "  HumanMessage(content='What did I just ask you?', additional_kwargs={}, response_metadata={}, id='df32f0a6-38fe-418a-98fe-7a5f17d0b812'),\n",
+       "  AIMessage(content='You asked me to translate the sentence \"I love programming\" from English to French.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 70, 'total_tokens': 87, 'completion_tokens_details': {'reasoning_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_1bb46167f9', 'finish_reason': 'stop', 'logprobs': None}, id='run-1ee8ad67-d7f0-4bb9-adff-e632be6e2825-0', usage_metadata={'input_tokens': 70, 'output_tokens': 17, 'total_tokens': 87})]}"
      ]
     },
-     "execution_count": 8,
+     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "chain_with_message_history.invoke(\n",
-    "    {\"input\": \"Translate this sentence from English to French: I love programming.\"},\n",
-    "    {\"configurable\": {\"session_id\": \"unused\"}},\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Parent run cc14b9d8-c59e-40db-a523-d6ab3fc2fa4f not found for run 5b75e25c-131e-46ee-9982-68569db04330. Treating as a root run.\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "AIMessage(content='You asked me to translate the sentence \"I love programming\" from English to French.', response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 63, 'total_tokens': 80}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5950435c-1dc2-43a6-836f-f989fd62c95e-0', usage_metadata={'input_tokens': 63, 'output_tokens': 17, 'total_tokens': 80})"
-      ]
-     },
-     "execution_count": 9,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "chain_with_message_history.invoke(\n",
-    "    {\"input\": \"What did I just ask you?\"}, {\"configurable\": {\"session_id\": \"unused\"}}\n",
+    "app.invoke(\n",
+    "    {\"messages\": [HumanMessage(content=\"What did I just ask you?\")]},\n",
+    "    config={\"configurable\": {\"thread_id\": \"1\"}},\n",
    ")"
   ]
  },
@ -366,80 +248,44 @@
    "\n",
    "### Trimming messages\n",
    "\n",
-    "LLMs and chat models have limited context windows, and even if you're not directly hitting limits, you may want to limit the amount of distraction the model has to deal with. One solution is trim the historic messages before passing them to the model. Let's use an example history with some preloaded messages:"
+    "LLMs and chat models have limited context windows, and even if you're not directly hitting limits, you may want to limit the amount of distraction the model has to deal with. One solution is trim the history messages before passing them to the model. Let's use an example history with the `app` we declared above:"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 21,
+   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "[HumanMessage(content=\"Hey there! I'm Nemo.\"),\n",
-       " AIMessage(content='Hello!'),\n",
-       " HumanMessage(content='How are you today?'),\n",
-       " AIMessage(content='Fine thanks!')]"
+       "{'messages': [HumanMessage(content=\"Hey there! I'm Nemo.\", additional_kwargs={}, response_metadata={}, id='99321048-3390-4da6-919b-4ad933c4913b'),\n",
+       "  AIMessage(content='Hello!', additional_kwargs={}, response_metadata={}, id='1c3eaf4a-b698-4bc6-a7a6-549290c3fc7e'),\n",
+       "  HumanMessage(content='How are you today?', additional_kwargs={}, response_metadata={}, id='6f96db9d-ac30-4b4a-9ebc-bc11ae87646b'),\n",
+       "  AIMessage(content='Fine thanks!', additional_kwargs={}, response_metadata={}, id='e783fbb6-2892-42ea-9859-ae449e4cfdf6'),\n",
+       "  HumanMessage(content=\"What's my name?\", additional_kwargs={}, response_metadata={}, id='854065c4-09a0-4c2a-9f2c-eb7182dcc9d5'),\n",
+       "  AIMessage(content='Your name is Nemo.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 5, 'prompt_tokens': 63, 'total_tokens': 68, 'completion_tokens_details': {'reasoning_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_1bb46167f9', 'finish_reason': 'stop', 'logprobs': None}, id='run-eed15b83-b215-47a3-b374-404d6a05ab94-0', usage_metadata={'input_tokens': 63, 'output_tokens': 5, 'total_tokens': 68})]}"
      ]
     },
-     "execution_count": 21,
+     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "demo_ephemeral_chat_history = ChatMessageHistory()\n",
+    "demo_ephemeral_chat_history = [\n",
+    "    HumanMessage(content=\"Hey there! I'm Nemo.\"),\n",
+    "    AIMessage(content=\"Hello!\"),\n",
+    "    HumanMessage(content=\"How are you today?\"),\n",
+    "    AIMessage(content=\"Fine thanks!\"),\n",
+    "]\n",
    "\n",
-    "demo_ephemeral_chat_history.add_user_message(\"Hey there! I'm Nemo.\")\n",
-    "demo_ephemeral_chat_history.add_ai_message(\"Hello!\")\n",
-    "demo_ephemeral_chat_history.add_user_message(\"How are you today?\")\n",
-    "demo_ephemeral_chat_history.add_ai_message(\"Fine thanks!\")\n",
-    "\n",
-    "demo_ephemeral_chat_history.messages"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Let's use this message history with the `RunnableWithMessageHistory` chain we declared above:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 22,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Parent run 7ff2d8ec-65e2-4f67-8961-e498e2c4a591 not found for run 3881e990-6596-4326-84f6-2b76949e0657. Treating as a root run.\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "AIMessage(content='Your name is Nemo.', response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 66, 'total_tokens': 72}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-f8aabef8-631a-4238-a39b-701e881fbe47-0', usage_metadata={'input_tokens': 66, 'output_tokens': 6, 'total_tokens': 72})"
-      ]
-     },
-     "execution_count": 22,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "chain_with_message_history = RunnableWithMessageHistory(\n",
-    "    chain,\n",
-    "    lambda session_id: demo_ephemeral_chat_history,\n",
-    "    input_messages_key=\"input\",\n",
-    "    history_messages_key=\"chat_history\",\n",
-    ")\n",
-    "\n",
-    "chain_with_message_history.invoke(\n",
-    "    {\"input\": \"What's my name?\"},\n",
-    "    {\"configurable\": {\"session_id\": \"unused\"}},\n",
+    "app.invoke(\n",
+    "    {\n",
+    "        \"messages\": demo_ephemeral_chat_history\n",
+    "        + [HumanMessage(content=\"What's my name?\")]\n",
+    "    },\n",
+    "    config={\"configurable\": {\"thread_id\": \"2\"}},\n",
    ")"
   ]
  },
@ -447,35 +293,88 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "We can see the chain remembers the preloaded name.\n",
+    "We can see the app remembers the preloaded name.\n",
    "\n",
-    "But let's say we have a very small context window, and we want to trim the number of messages passed to the chain to only the 2 most recent ones. We can use the built in [trim_messages](/docs/how_to/trim_messages/) util to trim messages based on their token count before they reach our prompt. In this case we'll count each message as 1 \"token\" and keep only the last two messages:"
+    "But let's say we have a very small context window, and we want to trim the number of messages passed to the model to only the 2 most recent ones. We can use the built in [trim_messages](/docs/how_to/trim_messages/) util to trim messages based on their token count before they reach our prompt. In this case we'll count each message as 1 \"token\" and keep only the last two messages:"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 23,
+   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
-    "from operator import itemgetter\n",
-    "\n",
    "from langchain_core.messages import trim_messages\n",
-    "from langchain_core.runnables import RunnablePassthrough\n",
+    "from langgraph.checkpoint.memory import MemorySaver\n",
+    "from langgraph.graph import START, MessagesState, StateGraph\n",
    "\n",
+    "# Define trimmer\n",
+    "# highlight-start\n",
+    "# count each message as 1 \"token\" (token_counter=len) and keep only the last two messages\n",
    "trimmer = trim_messages(strategy=\"last\", max_tokens=2, token_counter=len)\n",
+    "# highlight-end\n",
    "\n",
-    "chain_with_trimming = (\n",
-    "    RunnablePassthrough.assign(chat_history=itemgetter(\"chat_history\") | trimmer)\n",
-    "    | prompt\n",
-    "    | chat\n",
-    ")\n",
+    "workflow = StateGraph(state_schema=MessagesState)\n",
    "\n",
-    "chain_with_trimmed_history = RunnableWithMessageHistory(\n",
-    "    chain_with_trimming,\n",
-    "    lambda session_id: demo_ephemeral_chat_history,\n",
-    "    input_messages_key=\"input\",\n",
-    "    history_messages_key=\"chat_history\",\n",
+    "\n",
+    "# Define the function that calls the model\n",
+    "def call_model(state: MessagesState):\n",
+    "    # highlight-start\n",
+    "    trimmed_messages = trimmer.invoke(state[\"messages\"])\n",
+    "    system_prompt = (\n",
+    "        \"You are a helpful assistant. \"\n",
+    "        \"Answer all questions to the best of your ability.\"\n",
+    "    )\n",
+    "    messages = [SystemMessage(content=system_prompt)] + trimmed_messages\n",
+    "    # highlight-end\n",
+    "    response = model.invoke(messages)\n",
+    "    return {\"messages\": response}\n",
+    "\n",
+    "\n",
+    "# Define the node and edge\n",
+    "workflow.add_node(\"model\", call_model)\n",
+    "workflow.add_edge(START, \"model\")\n",
+    "\n",
+    "# Add simple in-memory checkpointer\n",
+    "memory = MemorySaver()\n",
+    "app = workflow.compile(checkpointer=memory)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's call this new app and check the response"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'messages': [HumanMessage(content=\"Hey there! I'm Nemo.\", additional_kwargs={}, response_metadata={}, id='99321048-3390-4da6-919b-4ad933c4913b'),\n",
+       "  AIMessage(content='Hello!', additional_kwargs={}, response_metadata={}, id='1c3eaf4a-b698-4bc6-a7a6-549290c3fc7e'),\n",
+       "  HumanMessage(content='How are you today?', additional_kwargs={}, response_metadata={}, id='6f96db9d-ac30-4b4a-9ebc-bc11ae87646b'),\n",
+       "  AIMessage(content='Fine thanks!', additional_kwargs={}, response_metadata={}, id='e783fbb6-2892-42ea-9859-ae449e4cfdf6'),\n",
+       "  HumanMessage(content='What is my name?', additional_kwargs={}, response_metadata={}, id='c8ba5e90-89cb-4b34-ad4c-11c0478422d8'),\n",
+       "  AIMessage(content=\"I'm sorry, but I don't know your name. How can I assist you today?\", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 39, 'total_tokens': 56, 'completion_tokens_details': {'reasoning_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_1bb46167f9', 'finish_reason': 'stop', 'logprobs': None}, id='run-aa86d3f8-898e-4146-aa3c-2c424934b0f5-0', usage_metadata={'input_tokens': 39, 'output_tokens': 17, 'total_tokens': 56})]}"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "app.invoke(\n",
+    "    {\n",
+    "        \"messages\": demo_ephemeral_chat_history\n",
+    "        + [HumanMessage(content=\"What is my name?\")]\n",
+    "    },\n",
+    "    config={\"configurable\": {\"thread_id\": \"3\"}},\n",
    ")"
   ]
  },
@ -483,101 +382,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Let's call this new chain and check the messages afterwards:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 24,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Parent run 775cde65-8d22-4c44-80bb-f0b9811c32ca not found for run 5cf71d0e-4663-41cd-8dbe-e9752689cfac. Treating as a root run.\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "AIMessage(content='P. Sherman is a fictional character from the animated movie \"Finding Nemo\" who lives at 42 Wallaby Way, Sydney.', response_metadata={'token_usage': {'completion_tokens': 27, 'prompt_tokens': 53, 'total_tokens': 80}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5642ef3a-fdbe-43cf-a575-d1785976a1b9-0', usage_metadata={'input_tokens': 53, 'output_tokens': 27, 'total_tokens': 80})"
-      ]
-     },
-     "execution_count": 24,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "chain_with_trimmed_history.invoke(\n",
-    "    {\"input\": \"Where does P. Sherman live?\"},\n",
-    "    {\"configurable\": {\"session_id\": \"unused\"}},\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 25,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "[HumanMessage(content=\"Hey there! I'm Nemo.\"),\n",
-       " AIMessage(content='Hello!'),\n",
-       " HumanMessage(content='How are you today?'),\n",
-       " AIMessage(content='Fine thanks!'),\n",
-       " HumanMessage(content=\"What's my name?\"),\n",
-       " AIMessage(content='Your name is Nemo.', response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 66, 'total_tokens': 72}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-f8aabef8-631a-4238-a39b-701e881fbe47-0', usage_metadata={'input_tokens': 66, 'output_tokens': 6, 'total_tokens': 72}),\n",
-       " HumanMessage(content='Where does P. Sherman live?'),\n",
-       " AIMessage(content='P. Sherman is a fictional character from the animated movie \"Finding Nemo\" who lives at 42 Wallaby Way, Sydney.', response_metadata={'token_usage': {'completion_tokens': 27, 'prompt_tokens': 53, 'total_tokens': 80}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5642ef3a-fdbe-43cf-a575-d1785976a1b9-0', usage_metadata={'input_tokens': 53, 'output_tokens': 27, 'total_tokens': 80})]"
-      ]
-     },
-     "execution_count": 25,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "demo_ephemeral_chat_history.messages"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "And we can see that our history has removed the two oldest messages while still adding the most recent conversation at the end. The next time the chain is called, `trim_messages` will be called again, and only the two most recent messages will be passed to the model. In this case, this means that the model will forget the name we gave it the next time we invoke it:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 27,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Parent run fde7123f-6fd3-421a-a3fc-2fb37dead119 not found for run 061a4563-2394-470d-a3ed-9bf1388ca431. Treating as a root run.\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "AIMessage(content=\"I'm sorry, but I don't have access to your personal information, so I don't know your name. How else may I assist you today?\", response_metadata={'token_usage': {'completion_tokens': 31, 'prompt_tokens': 74, 'total_tokens': 105}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-0ab03495-1f7c-4151-9070-56d2d1c565ff-0', usage_metadata={'input_tokens': 74, 'output_tokens': 31, 'total_tokens': 105})"
-      ]
-     },
-     "execution_count": 27,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "chain_with_trimmed_history.invoke(\n",
-    "    {\"input\": \"What is my name?\"},\n",
-    "    {\"configurable\": {\"session_id\": \"unused\"}},\n",
-    ")"
+    "We can see that `trim_messages` was called and only the two most recent messages will be passed to the model. In this case, this means that the model forgot the name we gave it."
   ]
  },
  {
@ -593,114 +398,82 @@
   "source": [
    "### Summary memory\n",
    "\n",
-    "We can use this same pattern in other ways too. For example, we could use an additional LLM call to generate a summary of the conversation before calling our chain. Let's recreate our chat history and chatbot chain:"
+    "We can use this same pattern in other ways too. For example, we could use an additional LLM call to generate a summary of the conversation before calling our app. Let's recreate our chat history:"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 17,
+   "execution_count": 10,
   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "[HumanMessage(content=\"Hey there! I'm Nemo.\"),\n",
-       " AIMessage(content='Hello!'),\n",
-       " HumanMessage(content='How are you today?'),\n",
-       " AIMessage(content='Fine thanks!')]"
-      ]
-     },
-     "execution_count": 17,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
+   "outputs": [],
   "source": [
-    "demo_ephemeral_chat_history = ChatMessageHistory()\n",
-    "\n",
-    "demo_ephemeral_chat_history.add_user_message(\"Hey there! I'm Nemo.\")\n",
-    "demo_ephemeral_chat_history.add_ai_message(\"Hello!\")\n",
-    "demo_ephemeral_chat_history.add_user_message(\"How are you today?\")\n",
-    "demo_ephemeral_chat_history.add_ai_message(\"Fine thanks!\")\n",
-    "\n",
-    "demo_ephemeral_chat_history.messages"
+    "demo_ephemeral_chat_history = [\n",
+    "    HumanMessage(content=\"Hey there! I'm Nemo.\"),\n",
+    "    AIMessage(content=\"Hello!\"),\n",
+    "    HumanMessage(content=\"How are you today?\"),\n",
+    "    AIMessage(content=\"Fine thanks!\"),\n",
+    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "We'll slightly modify the prompt to make the LLM aware that will receive a condensed summary instead of a chat history:"
+    "And now, let's update the model-calling function to distill previous interactions into a summary:"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 18,
+   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
-    "prompt = ChatPromptTemplate.from_messages(\n",
-    "    [\n",
-    "        (\n",
-    "            \"system\",\n",
-    "            \"You are a helpful assistant. Answer all questions to the best of your ability. The provided chat history includes facts about the user you are speaking with.\",\n",
-    "        ),\n",
-    "        (\"placeholder\", \"{chat_history}\"),\n",
-    "        (\"user\", \"{input}\"),\n",
-    "    ]\n",
-    ")\n",
+    "from langchain_core.messages import HumanMessage, RemoveMessage\n",
+    "from langgraph.checkpoint.memory import MemorySaver\n",
+    "from langgraph.graph import START, MessagesState, StateGraph\n",
    "\n",
-    "chain = prompt | chat\n",
+    "workflow = StateGraph(state_schema=MessagesState)\n",
    "\n",
-    "chain_with_message_history = RunnableWithMessageHistory(\n",
-    "    chain,\n",
-    "    lambda session_id: demo_ephemeral_chat_history,\n",
-    "    input_messages_key=\"input\",\n",
-    "    history_messages_key=\"chat_history\",\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "And now, let's create a function that will distill previous interactions into a summary. We can add this one to the front of the chain too:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 19,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "def summarize_messages(chain_input):\n",
-    "    stored_messages = demo_ephemeral_chat_history.messages\n",
-    "    if len(stored_messages) == 0:\n",
-    "        return False\n",
-    "    summarization_prompt = ChatPromptTemplate.from_messages(\n",
-    "        [\n",
-    "            (\"placeholder\", \"{chat_history}\"),\n",
-    "            (\n",
-    "                \"user\",\n",
-    "                \"Distill the above chat messages into a single summary message. Include as many specific details as you can.\",\n",
-    "            ),\n",
-    "        ]\n",
+    "\n",
+    "# Define the function that calls the model\n",
+    "def call_model(state: MessagesState):\n",
+    "    system_prompt = (\n",
+    "        \"You are a helpful assistant. \"\n",
+    "        \"Answer all questions to the best of your ability. \"\n",
+    "        \"The provided chat history includes a summary of the earlier conversation.\"\n",
    "    )\n",
-    "    summarization_chain = summarization_prompt | chat\n",
+    "    system_message = SystemMessage(content=system_prompt)\n",
+    "    # Summarize the messages\n",
+    "    if len(state[\"messages\"]) > 1:\n",
+    "        *message_history, last_human_message = state[\"messages\"]\n",
+    "        # Invoke the model to generate conversation summary\n",
+    "        summary_prompt = (\n",
+    "            \"Distill the above chat messages into a single summary message. \"\n",
+    "            \"Include as many specific details as you can.\"\n",
+    "        )\n",
+    "        summary_message = model.invoke(\n",
+    "            message_history + [HumanMessage(content=summary_prompt)]\n",
+    "        )\n",
+    "        # Delete messages that we no longer want to show up\n",
+    "        delete_messages = [RemoveMessage(id=m.id) for m in state[\"messages\"]]\n",
+    "        # Re-add user message\n",
+    "        human_message = HumanMessage(content=last_human_message.content)\n",
+    "        # Call the model with summary & response\n",
+    "        response = model.invoke([system_message, summary_message, human_message])\n",
+    "        message_updates = [summary_message, human_message, response] + delete_messages\n",
+    "    else:\n",
+    "        message_updates = model.invoke([system_message] + state[\"messages\"])\n",
    "\n",
-    "    summary_message = summarization_chain.invoke({\"chat_history\": stored_messages})\n",
-    "\n",
-    "    demo_ephemeral_chat_history.clear()\n",
-    "\n",
-    "    demo_ephemeral_chat_history.add_message(summary_message)\n",
-    "\n",
-    "    return True\n",
+    "    return {\"messages\": message_updates}\n",
    "\n",
    "\n",
-    "chain_with_summarization = (\n",
-    "    RunnablePassthrough.assign(messages_summarized=summarize_messages)\n",
-    "    | chain_with_message_history\n",
-    ")"
+    "# Define the node and edge\n",
+    "workflow.add_node(\"model\", call_model)\n",
+    "workflow.add_edge(START, \"model\")\n",
+    "\n",
+    "# Add simple in-memory checkpointer\n",
+    "memory = MemorySaver()\n",
+    "app = workflow.compile(checkpointer=memory)"
   ]
  },
  {
@ -712,54 +485,37 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 20,
+   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "AIMessage(content='You introduced yourself as Nemo. How can I assist you today, Nemo?')"
+       "{'messages': [AIMessage(content='Nemo greeted me, and I responded positively, indicating that I am doing fine.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 60, 'total_tokens': 77, 'completion_tokens_details': {'reasoning_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_1bb46167f9', 'finish_reason': 'stop', 'logprobs': None}, id='run-94df0e9f-6b1c-4e68-858c-5b23058b16d8-0', usage_metadata={'input_tokens': 60, 'output_tokens': 17, 'total_tokens': 77}),\n",
+       "  HumanMessage(content='What did I say my name was?', additional_kwargs={}, response_metadata={}, id='d3f57f56-dd1a-45f9-add2-146f54c1180c'),\n",
+       "  AIMessage(content='You mentioned that your name is Nemo.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 8, 'prompt_tokens': 68, 'total_tokens': 76, 'completion_tokens_details': {'reasoning_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_1bb46167f9', 'finish_reason': 'stop', 'logprobs': None}, id='run-ea144209-5d37-4bb5-8529-be235626fc74-0', usage_metadata={'input_tokens': 68, 'output_tokens': 8, 'total_tokens': 76})]}"
      ]
     },
-     "execution_count": 20,
+     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "chain_with_summarization.invoke(\n",
-    "    {\"input\": \"What did I say my name was?\"},\n",
-    "    {\"configurable\": {\"session_id\": \"unused\"}},\n",
+    "app.invoke(\n",
+    "    {\n",
+    "        \"messages\": demo_ephemeral_chat_history\n",
+    "        + [HumanMessage(\"What did I say my name was?\")]\n",
+    "    },\n",
+    "    config={\"configurable\": {\"thread_id\": \"4\"}},\n",
    ")"
   ]
  },
-  {
-   "cell_type": "code",
-   "execution_count": 21,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "[AIMessage(content='The conversation is between Nemo and an AI. Nemo introduces himself and the AI responds with a greeting. Nemo then asks the AI how it is doing, and the AI responds that it is fine.'),\n",
-       " HumanMessage(content='What did I say my name was?'),\n",
-       " AIMessage(content='You introduced yourself as Nemo. How can I assist you today, Nemo?')]"
-      ]
-     },
-     "execution_count": 21,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "demo_ephemeral_chat_history.messages"
-   ]
-  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Note that invoking the chain again will generate another summary generated from the initial summary plus new messages and so on. You could also design a hybrid approach where a certain number of messages are retained in chat history while others are summarized."
+    "Note that invoking the app again will generate another summary generated from the initial summary plus new messages and so on. You could also design a hybrid approach where a certain number of messages are retained in chat history while others are summarized."
   ]
  }
 ],
@ -779,7 +535,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.11.9"
+   "version": "3.12.3"
  }
 },
 "nbformat": 4,