docs: update chat history in rag how-to (#26821)

Update the how-to guide on adding chat history to RAG
Eugene Yurtsev 2024-09-24 13:50:11 -04:00 committed by GitHub
parent 2b38a4ee55
commit 15d49d3df2


@@ -7,6 +7,15 @@
"source": [
"# How to add chat history\n",
"\n",
"\n",
":::{.callout-note}\n",
"\n",
"This tutorial previously built a chatbot using [RunnableWithMessageHistory](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.history.RunnableWithMessageHistory.html). You can access this version of the tutorial in the [v0.2 docs](https://python.langchain.com/v0.2/docs/how_to/qa_chat_history_how_to/).\n",
"\n",
"The LangGraph implementation offers a number of advantages over `RunnableWithMessageHistory`, including the ability to persist arbitrary components of an application's state (instead of only messages).\n",
"\n",
":::\n",
"\n",
"In many Q&A applications we want to allow the user to have a back-and-forth conversation, meaning the application needs some sort of \"memory\" of past questions and answers, and some logic for incorporating those into its current thinking.\n",
"\n",
"In this guide we focus on **adding logic for incorporating historical messages.**\n",
@@ -29,7 +38,7 @@
"\n",
"### Dependencies\n",
"\n",
"We'll use OpenAI embeddings and a Chroma vector store in this walkthrough, but everything shown here works with any [Embeddings](/docs/concepts#embedding-models), and [VectorStore](/docs/concepts#vectorstores) or [Retriever](/docs/concepts#retrievers). \n",
"We'll use OpenAI embeddings and an InMemory vector store in this walkthrough, but everything shown here works with any [Embeddings](/docs/concepts#embedding-models), and [VectorStore](/docs/concepts#vectorstores) or [Retriever](/docs/concepts#retrievers). \n",
"\n",
"We'll use the following packages:"
]
@@ -42,7 +51,7 @@
"outputs": [],
"source": [
"%%capture --no-stderr\n",
"%pip install --upgrade --quiet langchain langchain-community langchain-chroma beautifulsoup4"
"%pip install --upgrade --quiet langchain langchain-community beautifulsoup4"
]
},
{
@@ -64,11 +73,7 @@
"import os\n",
"\n",
"if not os.environ.get(\"OPENAI_API_KEY\"):\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# import dotenv\n",
"\n",
"# dotenv.load_dotenv()"
" os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()"
]
},
{
@@ -155,7 +160,7 @@
"id": "15f8ad59-19de-42e3-85a8-3ba95ee0bd43",
"metadata": {},
"source": [
"For the retriever, we will use [WebBaseLoader](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.web_base.WebBaseLoader.html) to load the content of a web page. Here we instantiate a `Chroma` vectorstore and then use its [.as_retriever](https://python.langchain.com/api_reference/core/vectorstores/langchain_core.vectorstores.VectorStore.html#langchain_core.vectorstores.VectorStore.as_retriever) method to build a retriever that can be incorporated into [LCEL](/docs/concepts/#langchain-expression-language) chains."
"For the retriever, we will use [WebBaseLoader](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.web_base.WebBaseLoader.html) to load the content of a web page. Here we instantiate a `InMemoryVectorStore` vectorstore and then use its [.as_retriever](https://python.langchain.com/api_reference/core/vectorstores/langchain_core.vectorstores.VectorStore.html#langchain_core.vectorstores.VectorStore.as_retriever) method to build a retriever that can be incorporated into [LCEL](/docs/concepts/#langchain-expression-language) chains."
]
},
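The retriever returned by `.as_retriever` can also be configured at construction time. A small sketch, applicable once the `vectorstore` below is built (the search parameters are illustrative, not part of this tutorial):

```python
# Tune retrieval: similarity search returning the top 4 chunks.
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 4},
)
```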
{
@@ -163,16 +168,24 @@
"execution_count": 5,
"id": "820244ae-74b4-4593-b392-822979dd91b8",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"USER_AGENT environment variable not set, consider setting it to identify your requests.\n"
]
}
],
"source": [
"import bs4\n",
"from langchain.chains import create_retrieval_chain\n",
"from langchain.chains.combine_documents import create_stuff_documents_chain\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.document_loaders import WebBaseLoader\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"from langchain_core.vectorstores import InMemoryVectorStore\n",
"from langchain_openai import OpenAIEmbeddings\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
"\n",
@@ -188,7 +201,8 @@
"\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)\n",
"splits = text_splitter.split_documents(docs)\n",
"vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())\n",
"vectorstore = InMemoryVectorStore(embedding=OpenAIEmbeddings())\n",
"vectorstore.add_documents(splits)\n",
"retriever = vectorstore.as_retriever()"
]
},
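The `USER_AGENT` warning in the cell's output comes from `WebBaseLoader`; it can be silenced by identifying your requests before the loader runs. A minimal sketch (the agent string is illustrative):

```python
import os

# WebBaseLoader sends this header with its HTTP requests; any descriptive string works.
os.environ["USER_AGENT"] = "my-rag-tutorial/0.1"
```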
@@ -288,8 +302,8 @@
" (\"human\", \"{input}\"),\n",
" ]\n",
")\n",
"question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)\n",
"\n",
"question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)\n",
"rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)"
]
},
@@ -298,20 +312,17 @@
"id": "53a662c2-f38b-45f9-95c4-66de15637614",
"metadata": {},
"source": [
"### Adding chat history\n",
"### Stateful Management of chat history\n",
"\n",
"To manage the chat history, we will need:\n",
"We have added application logic for incorporating chat history, but we are still manually plumbing it through our application. In production, the Q&A application we usually persist the chat history into a database, and be able to read and update it appropriately.\n",
"\n",
"1. An object for storing the chat history;\n",
"2. An object that wraps our chain and manages updates to the chat history.\n",
"[LangGraph](https://langchain-ai.github.io/langgraph/) implements a built-in [persistence layer](https://langchain-ai.github.io/langgraph/concepts/persistence/), making it ideal for chat applications that support multiple conversational turns.\n",
"\n",
"For these we will use [BaseChatMessageHistory](https://python.langchain.com/api_reference/core/chat_history/langchain_core.chat_history.BaseChatMessageHistory.html) and [RunnableWithMessageHistory](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.history.RunnableWithMessageHistory.html). The latter is a wrapper for an LCEL chain and a `BaseChatMessageHistory` that handles injecting chat history into inputs and updating it after each invocation.\n",
"Wrapping our chat model in a minimal LangGraph application allows us to automatically persist the message history, simplifying the development of multi-turn applications.\n",
"\n",
"For a detailed walkthrough of how to use these classes together to create a stateful conversational chain, head to the [How to add message history (memory)](/docs/how_to/message_history/) LCEL how-to guide.\n",
"LangGraph comes with a simple [in-memory checkpointer](https://langchain-ai.github.io/langgraph/reference/checkpoints/#memorysaver), which we use below. See its documentation for more detail, including how to use different persistence backends (e.g., SQLite or Postgres).\n",
"\n",
"Below, we implement a simple example of the second option, in which chat histories are stored in a simple dict. LangChain manages memory integrations with [Redis](/docs/integrations/memory/redis_chat_message_history/) and other technologies to provide for more robust persistence.\n",
"\n",
"Instances of `RunnableWithMessageHistory` manage the chat history for you. They accept a config with a key (`\"session_id\"` by default) that specifies what conversation history to fetch and prepend to the input, and append the output to the same conversation history. Below is an example:"
"For a detailed walkthrough of how to manage message history, head to the How to add message history (memory) guide."
]
},
{
@@ -321,26 +332,48 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.chat_message_histories import ChatMessageHistory\n",
"from langchain_core.chat_history import BaseChatMessageHistory\n",
"from langchain_core.runnables.history import RunnableWithMessageHistory\n",
"from typing import Sequence\n",
"\n",
"store = {}\n",
"from langchain_core.messages import AIMessage, BaseMessage, HumanMessage\n",
"from langgraph.checkpoint.memory import MemorySaver\n",
"from langgraph.graph import START, StateGraph\n",
"from langgraph.graph.message import add_messages\n",
"from typing_extensions import Annotated, TypedDict\n",
"\n",
"\n",
"def get_session_history(session_id: str) -> BaseChatMessageHistory:\n",
" if session_id not in store:\n",
" store[session_id] = ChatMessageHistory()\n",
" return store[session_id]\n",
"# We define a dict representing the state of the application.\n",
"# This state has the same input and output keys as `rag_chain`.\n",
"class State(TypedDict):\n",
" input: str\n",
" chat_history: Annotated[Sequence[BaseMessage], add_messages]\n",
" context: str\n",
" answer: str\n",
"\n",
"\n",
"conversational_rag_chain = RunnableWithMessageHistory(\n",
" rag_chain,\n",
" get_session_history,\n",
" input_messages_key=\"input\",\n",
" history_messages_key=\"chat_history\",\n",
" output_messages_key=\"answer\",\n",
")"
"# We then define a simple node that runs the `rag_chain`.\n",
"# The `return` values of the node update the graph state, so here we just\n",
"# update the chat history with the input message and response.\n",
"def call_model(state: State):\n",
" response = rag_chain.invoke(state)\n",
" return {\n",
" \"chat_history\": [\n",
" HumanMessage(state[\"input\"]),\n",
" AIMessage(response[\"answer\"]),\n",
" ],\n",
" \"context\": response[\"context\"],\n",
" \"answer\": response[\"answer\"],\n",
" }\n",
"\n",
"\n",
"# Our graph consists only of one node:\n",
"workflow = StateGraph(state_schema=State)\n",
"workflow.add_edge(START, \"model\")\n",
"workflow.add_node(\"model\", call_model)\n",
"\n",
"# Finally, we compile the graph with a checkpointer object.\n",
"# This persists the state, in this case in memory.\n",
"memory = MemorySaver()\n",
"app = workflow.compile(checkpointer=memory)"
]
},
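`MemorySaver` keeps checkpoints only for the lifetime of the process. To persist conversations across restarts, the same graph can be compiled against a database-backed checkpointer instead. A sketch using the SQLite backend (assumes the separate `langgraph-checkpoint-sqlite` package is installed; the interface may vary across langgraph versions):

```python
import sqlite3

from langgraph.checkpoint.sqlite import SqliteSaver

# Store checkpoints in a local SQLite file instead of process memory.
conn = sqlite3.connect("checkpoints.sqlite", check_same_thread=False)
app = workflow.compile(checkpointer=SqliteSaver(conn))
```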
{
@@ -350,23 +383,21 @@
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Task decomposition involves breaking down a complex task into smaller and simpler steps to make it more manageable and easier to accomplish. This process can be done using techniques like Chain of Thought (CoT) or Tree of Thoughts to guide the model in breaking down tasks effectively. Task decomposition can be facilitated by providing simple prompts to a language model, task-specific instructions, or human inputs.'"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
"name": "stdout",
"output_type": "stream",
"text": [
"Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. This process helps agents or models tackle difficult tasks by dividing them into more manageable subtasks. Task decomposition can be achieved through methods like Chain of Thought or Tree of Thoughts, which guide the model in thinking step by step or exploring multiple reasoning possibilities at each step.\n"
]
}
],
"source": [
"conversational_rag_chain.invoke(\n",
"config = {\"configurable\": {\"thread_id\": \"abc123\"}}\n",
"\n",
"result = app.invoke(\n",
" {\"input\": \"What is Task Decomposition?\"},\n",
" config={\n",
" \"configurable\": {\"session_id\": \"abc123\"}\n",
" }, # constructs a key \"abc123\" in `store`.\n",
")[\"answer\"]"
" config=config,\n",
")\n",
"print(result[\"answer\"])"
]
},
{
@@ -376,21 +407,19 @@
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Task decomposition can be achieved through various methods, including using techniques like Chain of Thought (CoT) or Tree of Thoughts to guide the model in breaking down tasks effectively. Common ways of task decomposition include providing simple prompts to a language model, task-specific instructions, or human inputs to break down complex tasks into smaller and more manageable steps. Additionally, task decomposition can involve utilizing resources like internet access for information gathering, long-term memory management, and GPT-3.5 powered agents for delegation of simple tasks.'"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
"name": "stdout",
"output_type": "stream",
"text": [
"One way of task decomposition is by using a Language Model (LLM) with simple prompting, such as providing instructions like \"Steps for XYZ\" or \"What are the subgoals for achieving XYZ?\" This method guides the LLM to break down the task into smaller components for easier processing and execution.\n"
]
}
],
"source": [
"conversational_rag_chain.invoke(\n",
" {\"input\": \"What are common ways of doing it?\"},\n",
" config={\"configurable\": {\"session_id\": \"abc123\"}},\n",
")[\"answer\"]"
"result = app.invoke(\n",
" {\"input\": \"What is one way of doing it?\"},\n",
" config=config,\n",
")\n",
"print(result[\"answer\"])"
]
},
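Because the checkpointer keys conversations by `thread_id`, invoking the app with a different ID starts a clean history. A quick sketch (the second thread ID is illustrative):

```python
# A fresh thread has no memory of the "abc123" conversation, so the
# pronoun "it" in the follow-up question loses its referent.
result = app.invoke(
    {"input": "What is one way of doing it?"},
    config={"configurable": {"thread_id": "def234"}},
)
print(result["answer"])
```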
{
@@ -398,7 +427,7 @@
"id": "3ab59258-84bc-4904-880e-2ebfebbca563",
"metadata": {},
"source": [
"The conversation history can be inspected in the `store` dict:"
"The conversation history can be inspected via the state of the application:"
]
},
{
@@ -411,27 +440,25 @@
"name": "stdout",
"output_type": "stream",
"text": [
"User: What is Task Decomposition?\n",
"================================\u001b[1m Human Message \u001b[0m=================================\n",
"\n",
"AI: Task decomposition involves breaking down a complex task into smaller and simpler steps to make it more manageable and easier to accomplish. This process can be done using techniques like Chain of Thought (CoT) or Tree of Thoughts to guide the model in breaking down tasks effectively. Task decomposition can be facilitated by providing simple prompts to a language model, task-specific instructions, or human inputs.\n",
"What is Task Decomposition?\n",
"==================================\u001b[1m Ai Message \u001b[0m==================================\n",
"\n",
"User: What are common ways of doing it?\n",
"Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. This process helps agents or models tackle difficult tasks by dividing them into more manageable subtasks. Task decomposition can be achieved through methods like Chain of Thought or Tree of Thoughts, which guide the model in thinking step by step or exploring multiple reasoning possibilities at each step.\n",
"================================\u001b[1m Human Message \u001b[0m=================================\n",
"\n",
"AI: Task decomposition can be achieved through various methods, including using techniques like Chain of Thought (CoT) or Tree of Thoughts to guide the model in breaking down tasks effectively. Common ways of task decomposition include providing simple prompts to a language model, task-specific instructions, or human inputs to break down complex tasks into smaller and more manageable steps. Additionally, task decomposition can involve utilizing resources like internet access for information gathering, long-term memory management, and GPT-3.5 powered agents for delegation of simple tasks.\n",
"\n"
"What is one way of doing it?\n",
"==================================\u001b[1m Ai Message \u001b[0m==================================\n",
"\n",
"One way of task decomposition is by using a Language Model (LLM) with simple prompting, such as providing instructions like \"Steps for XYZ\" or \"What are the subgoals for achieving XYZ?\" This method guides the LLM to break down the task into smaller components for easier processing and execution.\n"
]
}
],
"source": [
"from langchain_core.messages import AIMessage\n",
"\n",
"for message in store[\"abc123\"].messages:\n",
" if isinstance(message, AIMessage):\n",
" prefix = \"AI\"\n",
" else:\n",
" prefix = \"User\"\n",
"\n",
" print(f\"{prefix}: {message.content}\\n\")"
"chat_history = app.get_state(config).values[\"chat_history\"]\n",
"for message in chat_history:\n",
" message.pretty_print()"
]
},
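Because the graph persists the entire `State`, the latest snapshot also carries the most recent `input`, retrieved `context`, and `answer`, not only the messages. A quick sketch:

```python
# The checkpointer stores the full application state, not just chat messages.
snapshot = app.get_state(config)
print(sorted(snapshot.values.keys()))  # ['answer', 'chat_history', 'context', 'input']
```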
{
@@ -459,17 +486,24 @@
"metadata": {},
"outputs": [],
"source": [
"from typing import Sequence\n",
"\n",
"import bs4\n",
"from langchain.chains import create_history_aware_retriever, create_retrieval_chain\n",
"from langchain.chains.combine_documents import create_stuff_documents_chain\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.chat_message_histories import ChatMessageHistory\n",
"from langchain_community.document_loaders import WebBaseLoader\n",
"from langchain_core.chat_history import BaseChatMessageHistory\n",
"from langchain_core.messages import AIMessage, BaseMessage, HumanMessage\n",
"from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n",
"from langchain_core.runnables.history import RunnableWithMessageHistory\n",
"from langchain_core.vectorstores import InMemoryVectorStore\n",
"from langchain_openai import ChatOpenAI, OpenAIEmbeddings\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
"from langgraph.checkpoint.memory import MemorySaver\n",
"from langgraph.graph import START, StateGraph\n",
"from langgraph.graph.message import add_messages\n",
"from typing_extensions import Annotated, TypedDict\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo\", temperature=0)\n",
"\n",
@@ -487,7 +521,9 @@
"\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)\n",
"splits = text_splitter.split_documents(docs)\n",
"vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())\n",
"\n",
"vectorstore = InMemoryVectorStore(embedding=OpenAIEmbeddings())\n",
"vectorstore.add_documents(documents=splits)\n",
"retriever = vectorstore.as_retriever()\n",
"\n",
"\n",
@@ -534,22 +570,41 @@
"\n",
"\n",
"### Statefully manage chat history ###\n",
"store = {}\n",
"\n",
"\n",
"def get_session_history(session_id: str) -> BaseChatMessageHistory:\n",
" if session_id not in store:\n",
" store[session_id] = ChatMessageHistory()\n",
" return store[session_id]\n",
"# We define a dict representing the state of the application.\n",
"# This state has the same input and output keys as `rag_chain`.\n",
"class State(TypedDict):\n",
" input: str\n",
" chat_history: Annotated[Sequence[BaseMessage], add_messages]\n",
" context: str\n",
" answer: str\n",
"\n",
"\n",
"conversational_rag_chain = RunnableWithMessageHistory(\n",
" rag_chain,\n",
" get_session_history,\n",
" input_messages_key=\"input\",\n",
" history_messages_key=\"chat_history\",\n",
" output_messages_key=\"answer\",\n",
")"
"# We then define a simple node that runs the `rag_chain`.\n",
"# The `return` values of the node update the graph state, so here we just\n",
"# update the chat history with the input message and response.\n",
"def call_model(state: State):\n",
" response = rag_chain.invoke(state)\n",
" return {\n",
" \"chat_history\": [\n",
" HumanMessage(state[\"input\"]),\n",
" AIMessage(response[\"answer\"]),\n",
" ],\n",
" \"context\": response[\"context\"],\n",
" \"answer\": response[\"answer\"],\n",
" }\n",
"\n",
"\n",
"# Our graph consists only of one node:\n",
"workflow = StateGraph(state_schema=State)\n",
"workflow.add_edge(START, \"model\")\n",
"workflow.add_node(\"model\", call_model)\n",
"\n",
"# Finally, we compile the graph with a checkpointer object.\n",
"# This persists the state, in this case in memory.\n",
"memory = MemorySaver()\n",
"app = workflow.compile(checkpointer=memory)"
]
},
{
@@ -559,23 +614,21 @@
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Task decomposition involves breaking down a complex task into smaller and simpler steps to make it more manageable. Techniques like Chain of Thought (CoT) and Tree of Thoughts help in decomposing hard tasks into multiple manageable tasks by instructing models to think step by step and explore multiple reasoning possibilities at each step. Task decomposition can be achieved through various methods such as using prompting techniques, task-specific instructions, or human inputs.'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
"name": "stdout",
"output_type": "stream",
"text": [
"Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. This process helps agents or models tackle difficult tasks by dividing them into more manageable subtasks. Different methods like Chain of Thought and Tree of Thoughts are used to guide the decomposition process, enabling a step-by-step approach to problem-solving.\n"
]
}
],
"source": [
"conversational_rag_chain.invoke(\n",
"config = {\"configurable\": {\"thread_id\": \"abc123\"}}\n",
"\n",
"result = app.invoke(\n",
" {\"input\": \"What is Task Decomposition?\"},\n",
" config={\n",
" \"configurable\": {\"session_id\": \"abc123\"}\n",
" }, # constructs a key \"abc123\" in `store`.\n",
")[\"answer\"]"
" config=config,\n",
")\n",
"print(result[\"answer\"])"
]
},
{
@@ -585,21 +638,19 @@
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Task decomposition can be done in common ways such as using prompting techniques like Chain of Thought (CoT) or Tree of Thoughts, which instruct models to think step by step and explore multiple reasoning possibilities at each step. Another way is to provide task-specific instructions, such as asking to \"Write a story outline\" for writing a novel, to guide the decomposition process. Additionally, task decomposition can also involve human inputs to break down complex tasks into smaller and simpler steps.'"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
"name": "stdout",
"output_type": "stream",
"text": [
"One way of task decomposition is by using Large Language Models (LLMs) with simple prompting, such as providing instructions like \"Steps for XYZ\" or \"What are the subgoals for achieving XYZ?\" This method leverages the capabilities of LLMs to break down tasks into smaller components, making them easier to manage and solve.\n"
]
}
],
"source": [
"conversational_rag_chain.invoke(\n",
" {\"input\": \"What are common ways of doing it?\"},\n",
" config={\"configurable\": {\"session_id\": \"abc123\"}},\n",
")[\"answer\"]"
"result = app.invoke(\n",
" {\"input\": \"What is one way of doing it?\"},\n",
" config=config,\n",
")\n",
"print(result[\"answer\"])"
]
},
{
@@ -672,22 +723,11 @@
"id": "52ae46d9-43f7-481b-96d5-df750be3ad65",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Error in LangChainTracer.on_tool_end callback: TracerException(\"Found chain run at ID 5cd28d13-88dd-4eac-a465-3770ac27eff6, but expected {'tool'} run.\")\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_TbhPPPN05GKi36HLeaN4QM90', 'function': {'arguments': '{\"query\":\"Task Decomposition\"}', 'name': 'blog_post_retriever'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 19, 'prompt_tokens': 68, 'total_tokens': 87}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-2e60d910-879a-4a2a-b1e9-6a6c5c7d7ebc-0', tool_calls=[{'name': 'blog_post_retriever', 'args': {'query': 'Task Decomposition'}, 'id': 'call_TbhPPPN05GKi36HLeaN4QM90'}])]}}\n",
"----\n",
"{'tools': {'messages': [ToolMessage(content='Fig. 1. Overview of a LLM-powered autonomous agent system.\\nComponent One: Planning#\\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\\nTask Decomposition#\\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the models thinking process.\\n\\nFig. 1. Overview of a LLM-powered autonomous agent system.\\nComponent One: Planning#\\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\\nTask Decomposition#\\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the models thinking process.\\n\\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.\\n\\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.', name='blog_post_retriever', tool_call_id='call_TbhPPPN05GKi36HLeaN4QM90')]}}\n",
"----\n",
"{'agent': {'messages': [AIMessage(content='Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. This approach helps in transforming big tasks into multiple manageable tasks, making it easier for autonomous agents to handle and interpret the thinking process. One common method for task decomposition is the Chain of Thought (CoT) technique, where models are instructed to \"think step by step\" to decompose hard tasks. Another extension of CoT is the Tree of Thoughts, which explores multiple reasoning possibilities at each step by creating a tree structure of multiple thoughts per step. Task decomposition can be facilitated through various methods such as using simple prompts, task-specific instructions, or human inputs.', response_metadata={'token_usage': {'completion_tokens': 130, 'prompt_tokens': 636, 'total_tokens': 766}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-3ef17638-65df-4030-a7fe-795e6da91c69-0')]}}\n",
"{'agent': {'messages': [AIMessage(content='Task decomposition is a problem-solving strategy that involves breaking down a complex task or problem into smaller, more manageable subtasks. By decomposing a task, individuals can better understand the components of the task, allocate resources effectively, and solve the problem more efficiently. This approach allows for a systematic and organized way of approaching complex tasks by dividing them into smaller, more achievable steps.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 75, 'prompt_tokens': 68, 'total_tokens': 143, 'completion_tokens_details': {'reasoning_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-01d17f40-c853-4e16-96bd-1e231e2486b5-0', usage_metadata={'input_tokens': 68, 'output_tokens': 75, 'total_tokens': 143})]}}\n",
"----\n"
]
}
@@ -748,7 +788,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"{'agent': {'messages': [AIMessage(content='Hello Bob! How can I assist you today?', response_metadata={'token_usage': {'completion_tokens': 11, 'prompt_tokens': 67, 'total_tokens': 78}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-1cd17562-18aa-4839-b41b-403b17a0fc20-0')]}}\n",
"{'agent': {'messages': [AIMessage(content='Hello Bob! How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 11, 'prompt_tokens': 67, 'total_tokens': 78, 'completion_tokens_details': {'reasoning_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-e41bbdf4-da73-43e3-b980-f0d258c4713d-0', usage_metadata={'input_tokens': 67, 'output_tokens': 11, 'total_tokens': 78})]}}\n",
"----\n"
]
}
@@ -777,22 +817,15 @@
"id": "e2c570ae-dd91-402c-8693-ae746de63b16",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Error in LangChainTracer.on_tool_end callback: TracerException(\"Found chain run at ID c54381c0-c5d9-495a-91a0-aca4ae755663, but expected {'tool'} run.\")\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_rg7zKTE5e0ICxVSslJ1u9LMg', 'function': {'arguments': '{\"query\":\"Task Decomposition\"}', 'name': 'blog_post_retriever'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 19, 'prompt_tokens': 91, 'total_tokens': 110}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-122bf097-7ff1-49aa-b430-e362b51354ad-0', tool_calls=[{'name': 'blog_post_retriever', 'args': {'query': 'Task Decomposition'}, 'id': 'call_rg7zKTE5e0ICxVSslJ1u9LMg'}])]}}\n",
"{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_ygtIVKtuMQEsY95j31BvhzzN', 'function': {'arguments': '{\"query\":\"Task Decomposition\"}', 'name': 'blog_post_retriever'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 19, 'prompt_tokens': 91, 'total_tokens': 110, 'completion_tokens_details': {'reasoning_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-61b7e948-e450-4902-b21c-66db5df816fc-0', tool_calls=[{'name': 'blog_post_retriever', 'args': {'query': 'Task Decomposition'}, 'id': 'call_ygtIVKtuMQEsY95j31BvhzzN', 'type': 'tool_call'}], usage_metadata={'input_tokens': 91, 'output_tokens': 19, 'total_tokens': 110})]}}\n",
"----\n",
"{'tools': {'messages': [ToolMessage(content='Fig. 1. Overview of a LLM-powered autonomous agent system.\\nComponent One: Planning#\\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\\nTask Decomposition#\\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the models thinking process.\\n\\nFig. 1. Overview of a LLM-powered autonomous agent system.\\nComponent One: Planning#\\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\\nTask Decomposition#\\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the models thinking process.\\n\\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.\\n\\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.', name='blog_post_retriever', tool_call_id='call_rg7zKTE5e0ICxVSslJ1u9LMg')]}}\n",
"{'tools': {'messages': [ToolMessage(content='Fig. 1. Overview of a LLM-powered autonomous agent system.\\nComponent One: Planning#\\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\\nTask Decomposition#\\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the models thinking process.\\n\\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.\\n\\n(3) Task execution: Expert models execute on the specific tasks and log results.\\nInstruction:\\n\\nWith the input and the inference results, the AI assistant needs to describe the process and results. The previous stages can be formed as - User Input: {{ User Input }}, Task Planning: {{ Tasks }}, Model Selection: {{ Model Assignment }}, Task Execution: {{ Predictions }}. You must first answer the user\\'s request in a straightforward manner. Then describe the task process and show your analysis and model inference results to the user in the first person. If inference results contain a file path, must tell the user the complete file path.\\n\\nFig. 11. Illustration of how HuggingGPT works. (Image source: Shen et al. 2023)\\nThe system comprises of 4 stages:\\n(1) Task planning: LLM works as the brain and parses the user requests into multiple tasks. There are four attributes associated with each task: task type, ID, dependencies, and arguments. They use few-shot examples to guide LLM to do task parsing and planning.\\nInstruction:', name='blog_post_retriever', tool_call_id='call_ygtIVKtuMQEsY95j31BvhzzN')]}}\n",
"----\n",
"{'agent': {'messages': [AIMessage(content='Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. This approach helps in managing and solving intricate problems by dividing them into more manageable components. By decomposing tasks, agents or models can better understand the steps involved and plan their actions accordingly. Techniques like Chain of Thought (CoT) and Tree of Thoughts are examples of methods that enhance model performance on complex tasks by breaking them down into smaller steps.', response_metadata={'token_usage': {'completion_tokens': 87, 'prompt_tokens': 659, 'total_tokens': 746}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-b9166386-83e5-4b82-9a4b-590e5fa76671-0')]}}\n",
"{'agent': {'messages': [AIMessage(content='Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. This approach helps autonomous agents or models to handle challenging tasks by dividing them into more manageable subtasks. One common method for task decomposition is the Chain of Thought (CoT) technique, where models are prompted to think step by step to decompose difficult tasks.\\n\\nAnother extension of CoT is the Tree of Thoughts, which explores multiple reasoning possibilities at each step by creating a tree structure of multiple thoughts per step. Task decomposition can be facilitated by providing simple prompts to language models, using task-specific instructions, or incorporating human inputs.\\n\\nOverall, task decomposition plays a crucial role in enabling autonomous agents to plan and execute complex tasks effectively by breaking them down into smaller, more manageable components.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 153, 'prompt_tokens': 611, 'total_tokens': 764, 'completion_tokens_details': {'reasoning_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-68aed524-fdf4-4d34-8546-dfb02f2a03cd-0', usage_metadata={'input_tokens': 611, 'output_tokens': 153, 'total_tokens': 764})]}}\n",
"----\n"
]
}
@@ -827,24 +860,11 @@
"name": "stdout",
"output_type": "stream",
"text": [
"{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_6kbxTU5CDWLmF9mrvR7bWSkI', 'function': {'arguments': '{\"query\":\"Common ways of task decomposition\"}', 'name': 'blog_post_retriever'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 21, 'prompt_tokens': 769, 'total_tokens': 790}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-2d2c8327-35cd-484a-b8fd-52436657c2d8-0', tool_calls=[{'name': 'blog_post_retriever', 'args': {'query': 'Common ways of task decomposition'}, 'id': 'call_6kbxTU5CDWLmF9mrvR7bWSkI'}])]}}\n",
"----\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Error in LangChainTracer.on_tool_end callback: TracerException(\"Found chain run at ID 29553415-e0f4-41a9-8921-ba489e377f68, but expected {'tool'} run.\")\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'tools': {'messages': [ToolMessage(content='Fig. 1. Overview of a LLM-powered autonomous agent system.\\nComponent One: Planning#\\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\\nTask Decomposition#\\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the models thinking process.\\n\\nFig. 1. Overview of a LLM-powered autonomous agent system.\\nComponent One: Planning#\\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\\nTask Decomposition#\\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the models thinking process.\\n\\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.\\n\\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.', name='blog_post_retriever', tool_call_id='call_6kbxTU5CDWLmF9mrvR7bWSkI')]}}\n",
"{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_QOoWDqK4Bopi8P9HzGmnHAd5', 'function': {'arguments': '{\"query\":\"common ways of task decomposition\"}', 'name': 'blog_post_retriever'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 21, 'prompt_tokens': 787, 'total_tokens': 808, 'completion_tokens_details': {'reasoning_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-096ddff3-9505-4b2f-ae87-c5af6924dd00-0', tool_calls=[{'name': 'blog_post_retriever', 'args': {'query': 'common ways of task decomposition'}, 'id': 'call_QOoWDqK4Bopi8P9HzGmnHAd5', 'type': 'tool_call'}], usage_metadata={'input_tokens': 787, 'output_tokens': 21, 'total_tokens': 808})]}}\n",
"----\n",
"{'agent': {'messages': [AIMessage(content='Common ways of task decomposition include:\\n1. Using LLM with simple prompting like \"Steps for XYZ\" or \"What are the subgoals for achieving XYZ?\"\\n2. Using task-specific instructions, for example, \"Write a story outline\" for writing a novel.\\n3. Involving human inputs in the task decomposition process.', response_metadata={'token_usage': {'completion_tokens': 67, 'prompt_tokens': 1339, 'total_tokens': 1406}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-9ad14cde-ca75-4238-a868-f865e0fc50dd-0')]}}\n",
"{'tools': {'messages': [ToolMessage(content='Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\\nTask decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.\\n\\nFig. 1. Overview of a LLM-powered autonomous agent system.\\nComponent One: Planning#\\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\\nTask Decomposition#\\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the models thinking process.\\n\\nResources:\\n1. Internet access for searches and information gathering.\\n2. Long Term memory management.\\n3. GPT-3.5 powered Agents for delegation of simple tasks.\\n4. File output.\\n\\nPerformance Evaluation:\\n1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.\\n2. Constructively self-criticize your big-picture behavior constantly.\\n3. Reflect on past decisions and strategies to refine your approach.\\n4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.\\n\\n(3) Task execution: Expert models execute on the specific tasks and log results.\\nInstruction:\\n\\nWith the input and the inference results, the AI assistant needs to describe the process and results. The previous stages can be formed as - User Input: {{ User Input }}, Task Planning: {{ Tasks }}, Model Selection: {{ Model Assignment }}, Task Execution: {{ Predictions }}. You must first answer the user\\'s request in a straightforward manner. Then describe the task process and show your analysis and model inference results to the user in the first person. If inference results contain a file path, must tell the user the complete file path.', name='blog_post_retriever', tool_call_id='call_QOoWDqK4Bopi8P9HzGmnHAd5')]}}\n",
"----\n",
"{'agent': {'messages': [AIMessage(content='Common ways of task decomposition include:\\n\\n1. Using Language Models (LLM) with simple prompting: Language models can be prompted with instructions like \"Steps for XYZ\" or \"What are the subgoals for achieving XYZ?\" to break down tasks into smaller steps.\\n\\n2. Task-specific instructions: Providing specific instructions tailored to the task at hand, such as \"Write a story outline\" for writing a novel, can help in decomposing tasks effectively.\\n\\n3. Human inputs: Involving human inputs in the task decomposition process can also be a common approach to breaking down complex tasks into manageable subtasks.\\n\\nThese methods of task decomposition play a crucial role in enabling autonomous agents to effectively plan and execute complex tasks by breaking them down into smaller, more manageable components.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 152, 'prompt_tokens': 1332, 'total_tokens': 1484, 'completion_tokens_details': {'reasoning_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-41868dd4-a1d9-4323-b7b0-ac52c228a2ac-0', usage_metadata={'input_tokens': 1332, 'output_tokens': 152, 'total_tokens': 1484})]}}\n",
"----\n"
]
}
@@ -879,18 +899,27 @@
},
{
"cell_type": "code",
"execution_count": 23,
"execution_count": 1,
"id": "b1d2b4d4-e604-497d-873d-d345b808578e",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"USER_AGENT environment variable not set, consider setting it to identify your requests.\n"
]
}
],
"source": [
"import bs4\n",
"from langchain.tools.retriever import create_retriever_tool\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.document_loaders import WebBaseLoader\n",
"from langchain_core.vectorstores import InMemoryVectorStore\n",
"from langchain_openai import ChatOpenAI, OpenAIEmbeddings\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
"from langgraph.checkpoint.memory import MemorySaver\n",
"from langgraph.prebuilt import create_react_agent\n",
"\n",
"memory = MemorySaver()\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo\", temperature=0)\n",
@@ -909,7 +938,8 @@
"\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)\n",
"splits = text_splitter.split_documents(docs)\n",
"vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())\n",
"vectorstore = InMemoryVectorStore(embedding=OpenAIEmbeddings())\n",
"vectorstore.add_documents(documents=splits)\n",
"retriever = vectorstore.as_retriever()\n",
"\n",
"\n",
@@ -961,7 +991,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.11.4"
}
},
"nbformat": 4,