Zep Retriever - Vector Search Over Chat History (#4533)

# Zep Retriever - Vector Search Over Chat History with the Zep Long-term Memory Service

More on Zep: https://github.com/getzep/zep

Note: This PR is related to and relies on https://github.com/hwchase17/langchain/pull/4834. I did not want to modify the `pyproject.toml` file to add the `zep-python` dependency a second time.

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
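
For reviewers who want to see how the retriever turns search hits into documents without running a Zep server, here is a minimal sketch. `StubSearchResult` and `StubDocument` are hypothetical stand-ins for `zep_python.SearchResult` and `langchain.schema.Document`, not the real classes, and the sample data is made up. The sketch also copies each message before removing `content`, whereas the code in this PR pops `content` from the result's message directly.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class StubSearchResult:
    """Hypothetical stand-in for zep_python.SearchResult (not the real class)."""

    message: Optional[Dict[str, Any]]
    dist: float


@dataclass
class StubDocument:
    """Hypothetical stand-in for langchain.schema.Document."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def results_to_docs(results: List[StubSearchResult]) -> List[StubDocument]:
    """Map search results to documents: content becomes page_content,
    the similarity value becomes metadata["score"], and the remaining
    message fields are carried over as metadata."""
    docs: List[StubDocument] = []
    for r in results:
        if not r.message:
            continue  # skip results that carry no message
        message = dict(r.message)  # copy so the caller's result is not mutated
        content = message.pop("content")
        docs.append(
            StubDocument(page_content=content, metadata={"score": r.dist, **message})
        )
    return docs


# Made-up sample data for illustration only.
results = [
    StubSearchResult(
        message={"uuid": "a1", "role": "human", "content": "Who was Octavia Butler?"},
        dist=0.78,
    ),
    StubSearchResult(message=None, dist=0.5),
]
docs = results_to_docs(results)
print(docs[0].page_content)
print(docs[0].metadata)
```

Working on a copy means the same results list can be converted more than once, which is why the sketch does not need the `copy.deepcopy` that the unit tests in this PR apply to their mocked return values.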
2025-09-26 13:59:49 +00:00 · 2023-05-18 16:27:18 -07:00
parent 5525b704cc
commit c8c2276ccb
4 changed files with 478 additions and 0 deletions
--- a/docs/modules/indexes/retrievers/examples/zep_memorystore.ipynb
+++ b/docs/modules/indexes/retrievers/examples/zep_memorystore.ipynb
@@ -0,0 +1,291 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "source": [
+    "# Zep Memory\n",
+    "\n",
+    "## Retriever Example\n",
+    "\n",
+    "This notebook demonstrates how to search historical chat messages using the [Zep Long-term Memory Store](https://getzep.github.io/).\n",
+    "\n",
+    "We'll demonstrate:\n",
+    "\n",
+    "1. Adding conversation history to the Zep memory store.\n",
+    "2. Vector search over the conversation history.\n",
+    "\n",
+    "More on Zep:\n",
+    "\n",
+    "Zep stores, summarizes, embeds, indexes, and enriches conversational AI chat histories, and exposes them via simple, low-latency APIs.\n",
+    "\n",
+    "Key Features:\n",
+    "\n",
+    "- Long-term memory persistence, with access to historical messages irrespective of your summarization strategy.\n",
+    "- Auto-summarization of memory messages based on a configurable message window. A series of summaries is stored, providing flexibility for future summarization strategies.\n",
+    "- Vector search over memories, with messages automatically embedded on creation.\n",
+    "- Auto-token counting of memories and summaries, allowing finer-grained control over prompt assembly.\n",
+    "- Python and JavaScript SDKs.\n",
+    "\n",
+    "Zep's Go Extractor model is easily extensible, with a simple, clean interface available to build new enrichment functionality, such as summarizers, entity extractors, embedders, and more.\n",
+    "\n",
+    "Zep project: [https://github.com/getzep/zep](https://github.com/getzep/zep)\n"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "outputs": [],
+   "source": [
+    "from langchain.memory.chat_message_histories import ZepChatMessageHistory\n",
+    "from langchain.schema import HumanMessage, AIMessage\n",
+    "from uuid import uuid4\n",
+    "\n",
+    "# Set this to your Zep server URL\n",
+    "ZEP_API_URL = \"http://localhost:8000\"\n",
+    "\n",
+    "# Zep is async-first. Our sync APIs use an asyncio wrapper to run outside an app's event loop.\n",
+    "# This interferes with Jupyter's event loop, so we need to install nest_asyncio to run the\n",
+    "# Zep client in a notebook.\n",
+    "\n",
+    "# !pip install nest_asyncio  # Uncomment to install nest_asyncio\n",
+    "import nest_asyncio\n",
+    "\n",
+    "nest_asyncio.apply()"
+   ],
+   "metadata": {
+    "collapsed": false,
+    "ExecuteTime": {
+     "end_time": "2023-05-18T20:09:20.355017Z",
+     "start_time": "2023-05-18T20:09:19.526069Z"
+    }
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "### Initialize the Zep Chat Message History Class and add a chat message history to the memory store\n",
+    "\n",
+    "**NOTE:** Unlike other Retrievers, the content returned by the Zep Retriever is session/user specific. A `session_id` is required when instantiating the Retriever."
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "outputs": [],
+   "source": [
+    "session_id = str(uuid4())  # This is a unique identifier for the user/session\n",
+    "\n",
+    "# Set up Zep Chat History. We'll use this to add chat histories to the memory store\n",
+    "zep_chat_history = ZepChatMessageHistory(\n",
+    "    session_id=session_id,\n",
+    "    url=ZEP_API_URL,\n",
+    ")"
+   ],
+   "metadata": {
+    "collapsed": false,
+    "ExecuteTime": {
+     "end_time": "2023-05-18T20:09:20.424764Z",
+     "start_time": "2023-05-18T20:09:20.355626Z"
+    }
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "outputs": [],
+   "source": [
+    "# Preload some messages into the memory. The default message window is 12 messages. We want to push beyond this to demonstrate auto-summarization.\n",
+    "test_history = [\n",
+    "    {\"role\": \"human\", \"content\": \"Who was Octavia Butler?\"},\n",
+    "    {\n",
+    "        \"role\": \"ai\",\n",
+    "        \"content\": (\n",
+    "            \"Octavia Estelle Butler (June 22, 1947 – February 24, 2006) was an American\"\n",
+    "            \" science fiction author.\"\n",
+    "        ),\n",
+    "    },\n",
+    "    {\"role\": \"human\", \"content\": \"Which books of hers were made into movies?\"},\n",
+    "    {\n",
+    "        \"role\": \"ai\",\n",
+    "        \"content\": (\n",
+    "            \"The most well-known adaptation of Octavia Butler's work is the FX series\"\n",
+    "            \" Kindred, based on her novel of the same name.\"\n",
+    "        ),\n",
+    "    },\n",
+    "    {\"role\": \"human\", \"content\": \"Who were her contemporaries?\"},\n",
+    "    {\n",
+    "        \"role\": \"ai\",\n",
+    "        \"content\": (\n",
+    "            \"Octavia Butler's contemporaries included Ursula K. Le Guin, Samuel R.\"\n",
+    "            \" Delany, and Joanna Russ.\"\n",
+    "        ),\n",
+    "    },\n",
+    "    {\"role\": \"human\", \"content\": \"What awards did she win?\"},\n",
+    "    {\n",
+    "        \"role\": \"ai\",\n",
+    "        \"content\": (\n",
+    "            \"Octavia Butler won the Hugo Award, the Nebula Award, and the MacArthur\"\n",
+    "            \" Fellowship.\"\n",
+    "        ),\n",
+    "    },\n",
+    "    {\n",
+    "        \"role\": \"human\",\n",
+    "        \"content\": \"Which other women sci-fi writers might I want to read?\",\n",
+    "    },\n",
+    "    {\n",
+    "        \"role\": \"ai\",\n",
+    "        \"content\": \"You might want to read Ursula K. Le Guin or Joanna Russ.\",\n",
+    "    },\n",
+    "    {\n",
+    "        \"role\": \"human\",\n",
+    "        \"content\": (\n",
+    "            \"Write a short synopsis of Butler's book, Parable of the Sower. What is it\"\n",
+    "            \" about?\"\n",
+    "        ),\n",
+    "    },\n",
+    "    {\n",
+    "        \"role\": \"ai\",\n",
+    "        \"content\": (\n",
+    "            \"Parable of the Sower is a science fiction novel by Octavia Butler,\"\n",
+    "            \" published in 1993. It follows the story of Lauren Olamina, a young woman\"\n",
+    "            \" living in a dystopian future where society has collapsed due to\"\n",
+    "            \" environmental disasters, poverty, and violence.\"\n",
+    "        ),\n",
+    "    },\n",
+    "]\n",
+    "\n",
+    "for msg in test_history:\n",
+    "    zep_chat_history.append(\n",
+    "        HumanMessage(content=msg[\"content\"])\n",
+    "        if msg[\"role\"] == \"human\"\n",
+    "        else AIMessage(content=msg[\"content\"])\n",
+    "    )\n"
+   ],
+   "metadata": {
+    "collapsed": false,
+    "ExecuteTime": {
+     "end_time": "2023-05-18T20:09:20.603865Z",
+     "start_time": "2023-05-18T20:09:20.427041Z"
+    }
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "### Use the Zep Retriever to vector search over the Zep memory\n",
+    "\n",
+    "Zep provides native vector search over historical conversation memory. Embedding happens automatically.\n",
+    "\n",
+    "NOTE: Embedding of messages occurs asynchronously, so the first query may not return results. Subsequent queries will return results as the embeddings are generated."
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "outputs": [
+    {
+     "data": {
+      "text/plain": "[Document(page_content='Who was Octavia Butler?', metadata={'score': 0.7759001673780126, 'uuid': '3bedb2bf-aeaf-4849-924b-40a6d91e54b9', 'created_at': '2023-05-18T20:09:20.47556Z', 'role': 'human', 'token_count': 8})]"
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from langchain.retrievers import ZepRetriever\n",
+    "\n",
+    "zep_retriever = ZepRetriever(\n",
+    "    session_id=session_id,  # Ensure that you provide the session_id when instantiating the Retriever\n",
+    "    url=ZEP_API_URL,\n",
+    "    top_k=5,\n",
+    ")\n",
+    "\n",
+    "await zep_retriever.aget_relevant_documents(\"Who wrote Parable of the Sower?\")"
+   ],
+   "metadata": {
+    "collapsed": false,
+    "ExecuteTime": {
+     "end_time": "2023-05-18T20:09:20.979411Z",
+     "start_time": "2023-05-18T20:09:20.604147Z"
+    }
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "We can also use the Zep sync API to retrieve results:"
+   ],
+   "metadata": {
+    "collapsed": false
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "outputs": [
+    {
+     "data": {
+      "text/plain": "[Document(page_content='Who was Octavia Butler?', metadata={'score': 0.7759001673780126, 'uuid': '3bedb2bf-aeaf-4849-924b-40a6d91e54b9', 'created_at': '2023-05-18T20:09:20.47556Z', 'role': 'human', 'token_count': 8}),\n Document(page_content='Octavia Estelle Butler (June 22, 1947 – February 24, 2006) was an American science fiction author.', metadata={'score': 0.7545887969667749, 'uuid': 'b32c0644-2dcb-4c1d-a445-6622e7ba82e5', 'created_at': '2023-05-18T20:09:20.512044Z', 'role': 'ai', 'token_count': 31})]"
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "zep_retriever.get_relevant_documents(\"Who wrote Parable of the Sower?\")"
+   ],
+   "metadata": {
+    "collapsed": false,
+    "ExecuteTime": {
+     "end_time": "2023-05-18T20:09:21.296699Z",
+     "start_time": "2023-05-18T20:09:20.983624Z"
+    }
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "outputs": [],
+   "source": [],
+   "metadata": {
+    "collapsed": false,
+    "ExecuteTime": {
+     "end_time": "2023-05-18T20:09:21.298710Z",
+     "start_time": "2023-05-18T20:09:21.297169Z"
+    }
+   }
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 2
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython2",
+   "version": "2.7.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
--- a/langchain/retrievers/__init__.py
+++ b/langchain/retrievers/__init__.py
@@ -17,6 +17,7 @@ from langchain.retrievers.time_weighted_retriever import (
 from langchain.retrievers.vespa_retriever import VespaRetriever
 from langchain.retrievers.weaviate_hybrid_search import WeaviateHybridSearchRetriever
 from langchain.retrievers.wikipedia import WikipediaRetriever
+from langchain.retrievers.zep import ZepRetriever

 __all__ = [
    "ArxivRetriever",
@@ -36,4 +37,5 @@ __all__ = [
    "VespaRetriever",
    "WeaviateHybridSearchRetriever",
    "WikipediaRetriever",
+    "ZepRetriever",
 ]
--- a/langchain/retrievers/zep.py
+++ b/langchain/retrievers/zep.py
@@ -0,0 +1,74 @@
+from __future__ import annotations
+
+from typing import TYPE_CHECKING, List, Optional
+
+from langchain.schema import BaseRetriever, Document
+
+if TYPE_CHECKING:
+    from zep_python import SearchResult
+
+
+class ZepRetriever(BaseRetriever):
+    """A Retriever implementation for the Zep long-term memory store. Search your
+    user's long-term chat history with Zep.
+
+    Note: You will need to provide the user's `session_id` to use this retriever.
+
+    More on Zep:
+    Zep provides long-term conversation storage for LLM apps. The server stores,
+    summarizes, embeds, indexes, and enriches conversational AI chat
+    histories, and exposes them via simple, low-latency APIs.
+
+    For server installation instructions, see:
+    https://getzep.github.io/deployment/quickstart/
+    """
+
+    def __init__(
+        self,
+        session_id: str,
+        url: str,
+        top_k: Optional[int] = None,
+    ):
+        try:
+            from zep_python import ZepClient
+        except ImportError:
+            raise ValueError(
+                "Could not import zep-python package. "
+                "Please install it with `pip install zep-python`."
+            )
+
+        self.zep_client = ZepClient(base_url=url)
+        self.session_id = session_id
+        self.top_k = top_k
+
+    def _search_result_to_doc(self, results: List[SearchResult]) -> List[Document]:
+        return [
+            Document(
+                page_content=r.message.pop("content"),
+                metadata={"score": r.dist, **r.message},
+            )
+            for r in results
+            if r.message
+        ]
+
+    def get_relevant_documents(self, query: str) -> List[Document]:
+        from zep_python import SearchPayload
+
+        payload: SearchPayload = SearchPayload(text=query)
+
+        results: List[SearchResult] = self.zep_client.search_memory(
+            self.session_id, payload, limit=self.top_k
+        )
+
+        return self._search_result_to_doc(results)
+
+    async def aget_relevant_documents(self, query: str) -> List[Document]:
+        from zep_python import SearchPayload
+
+        payload: SearchPayload = SearchPayload(text=query)
+
+        results: List[SearchResult] = await self.zep_client.asearch_memory(
+            self.session_id, payload, limit=self.top_k
+        )
+
+        return self._search_result_to_doc(results)
--- a/tests/unit_tests/retrievers/test_zep.py
+++ b/tests/unit_tests/retrievers/test_zep.py
@@ -0,0 +1,111 @@
+from __future__ import annotations
+
+import copy
+from typing import TYPE_CHECKING, List
+
+import pytest
+from pytest_mock import MockerFixture
+
+from langchain.retrievers import ZepRetriever
+from langchain.schema import Document
+
+if TYPE_CHECKING:
+    from zep_python import SearchResult, ZepClient
+
+
+@pytest.fixture
+def search_results() -> List[SearchResult]:
+    from zep_python import Message, SearchResult
+
+    search_result = [
+        {
+            "message": {
+                "uuid": "66830914-19f5-490b-8677-1ba06bcd556b",
+                "created_at": "2023-05-18T20:40:42.743773Z",
+                "role": "user",
+                "content": "I'm looking to plan a trip to Iceland. Can you help me?",
+                "token_count": 17,
+            },
+            "summary": None,
+            "dist": 0.8734284910450115,
+        },
+        {
+            "message": {
+                "uuid": "015e618c-ba9d-45b6-95c3-77a8e611570b",
+                "created_at": "2023-05-18T20:40:42.743773Z",
+                "role": "user",
+                "content": "How much does a trip to Iceland typically cost?",
+                "token_count": 12,
+            },
+            "summary": None,
+            "dist": 0.8554048017463456,
+        },
+    ]
+
+    return [
+        SearchResult(
+            message=Message.parse_obj(result["message"]),
+            summary=result["summary"],
+            dist=result["dist"],
+        )
+        for result in search_result
+    ]
+
+
+@pytest.fixture
+@pytest.mark.requires("zep_python")
+def zep_retriever(
+    mocker: MockerFixture, search_results: List[SearchResult]
+) -> ZepRetriever:
+    mock_zep_client: ZepClient = mocker.patch("zep_python.ZepClient", autospec=True)
+    mock_zep_client.search_memory.return_value = copy.deepcopy(  # type: ignore
+        search_results
+    )
+    mock_zep_client.asearch_memory.return_value = copy.deepcopy(  # type: ignore
+        search_results
+    )
+    zep = ZepRetriever(session_id="123", url="http://localhost:8000")
+    zep.zep_client = mock_zep_client
+    return zep
+
+
+@pytest.mark.requires("zep_python")
+def test_zep_retriever_get_relevant_documents(
+    zep_retriever: ZepRetriever, search_results: List[SearchResult]
+) -> None:
+    documents: List[Document] = zep_retriever.get_relevant_documents(
+        query="My trip to Iceland"
+    )
+    _test_documents(documents, search_results)
+
+
+@pytest.mark.requires("zep_python")
+@pytest.mark.asyncio
+async def test_zep_retriever_aget_relevant_documents(
+    zep_retriever: ZepRetriever, search_results: List[SearchResult]
+) -> None:
+    documents: List[Document] = await zep_retriever.aget_relevant_documents(
+        query="My trip to Iceland"
+    )
+    _test_documents(documents, search_results)
+
+
+def _test_documents(
+    documents: List[Document], search_results: List[SearchResult]
+) -> None:
+    assert len(documents) == 2
+    for i, document in enumerate(documents):
+        assert document.page_content == search_results[i].message.get(  # type: ignore
+            "content"
+        )
+        assert document.metadata.get("uuid") == search_results[
+            i
+        ].message.get(  # type: ignore
+            "uuid"
+        )
+        assert document.metadata.get("role") == search_results[
+            i
+        ].message.get(  # type: ignore
+            "role"
+        )
+        assert document.metadata.get("score") == search_results[i].dist