mirror of https://github.com/hwchase17/langchain.git
synced 2025-07-19 03:01:29 +00:00

zep/rag conversation zep template (#12762)

LangServe template for a RAG Conversation App using Zep. @baskaryan, @eyurtsev

Co-authored-by: Erick Friis <erick@langchain.dev>

parent ea1ab391d4
commit f41f4c5e37

21  templates/rag-conversation-zep/LICENSE  (new file)
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2023 LangChain, Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
93  templates/rag-conversation-zep/README.md  (new file)
@@ -0,0 +1,93 @@
# rag-conversation-zep

This template demonstrates building a RAG conversation app using Zep.

Included in this template:
- Populating a [Zep Document Collection](https://docs.getzep.com/sdk/documents/) with a set of documents (a Collection is analogous to an index in other vector databases).
- Using Zep's [integrated embedding](https://docs.getzep.com/deployment/embeddings/) functionality to embed the documents as vectors.
- Configuring a LangChain [ZepVectorStore Retriever](https://docs.getzep.com/sdk/documents/) to retrieve documents using Zep's built-in, hardware-accelerated [Maximal Marginal Relevance](https://docs.getzep.com/sdk/search_query/) (MMR) re-ranking.
- Prompts, a simple chat history data structure, and other components required to build a RAG conversation app.
- The RAG conversation chain.

## About [Zep - Fast, scalable building blocks for LLM Apps](https://www.getzep.com/)

Zep is an open source platform for productionizing LLM apps. Go from a prototype built in LangChain or LlamaIndex, or a custom app, to production in minutes without rewriting code.

Key Features:

- Fast! Zep's async extractors operate independently of your chat loop, ensuring a snappy user experience.
- Long-term memory persistence, with access to historical messages irrespective of your summarization strategy.
- Auto-summarization of memory messages based on a configurable message window. A series of summaries is stored, providing flexibility for future summarization strategies.
- Hybrid search over memories and metadata, with messages automatically embedded on creation.
- Entity Extractor that automatically extracts named entities from messages and stores them in the message metadata.
- Auto-token counting of memories and summaries, allowing finer-grained control over prompt assembly.
- Python and JavaScript SDKs.

Zep project: https://github.com/getzep/zep | Docs: https://docs.getzep.com/

## Environment Setup

Set up a Zep service by following the [Quick Start Guide](https://docs.getzep.com/deployment/quickstart/).

## Ingesting Documents into a Zep Collection

Run `python ingest.py` to ingest the test documents into a Zep Collection. Review the file to modify the Collection name and document source.

## Usage

To use this package, you should first have the LangChain CLI installed:

```shell
pip install -U "langchain-cli[serve]"
```

To create a new LangChain project and install this as the only package, you can do:

```shell
langchain app new my-app --package rag-conversation-zep
```

If you want to add this to an existing project, you can just run:

```shell
langchain app add rag-conversation-zep
```

And add the following code to your `server.py` file:

```python
from rag_conversation_zep import chain as rag_conversation_zep_chain

add_routes(app, rag_conversation_zep_chain, path="/rag-conversation-zep")
```

(Optional) Let's now configure LangSmith.
LangSmith will help us trace, monitor and debug LangChain applications.
LangSmith is currently in private beta; you can sign up [here](https://smith.langchain.com/).
If you don't have access, you can skip this section.

```shell
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=<your-api-key>
export LANGCHAIN_PROJECT=<your-project>  # if not specified, defaults to "default"
```

If you are inside this directory, then you can spin up a LangServe instance directly by:

```shell
langchain serve
```

This will start the FastAPI app with a server running locally at
[http://localhost:8000](http://localhost:8000)

We can see all templates at [http://127.0.0.1:8000/docs](http://127.0.0.1:8000/docs)
We can access the playground at [http://127.0.0.1:8000/rag-conversation-zep/playground](http://127.0.0.1:8000/rag-conversation-zep/playground)

We can access the template from code with:

```python
from langserve.client import RemoteRunnable

runnable = RemoteRunnable("http://localhost:8000/rag-conversation-zep")
```
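The same endpoint can also be called over plain HTTP: LangServe exposes the chain at `/rag-conversation-zep/invoke`, and the request body must match the `ChatHistory` input type defined in `chain.py` (a `question` plus a list of (human, ai) turns). A standard-library sketch, assuming the `langchain serve` instance above is running; the HTTP call is kept inside a function so nothing is sent until you call it:

```python
import json
import urllib.request

# Request body matching the ChatHistory schema: a question plus (human, ai) turns
payload = {
    "input": {
        "question": "What is task decomposition?",
        "chat_history": [["Hi, can you help me study?", "Of course!"]],
    }
}

def invoke(url: str = "http://localhost:8000/rag-conversation-zep/invoke") -> str:
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # LangServe wraps the chain's result in an "output" field
        return json.loads(resp.read())["output"]
```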
37  templates/rag-conversation-zep/ingest.py  (new file)
@@ -0,0 +1,37 @@
# Ingest Documents into a Zep Collection
import os

from langchain.document_loaders import WebBaseLoader
from langchain.embeddings import FakeEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores.zep import CollectionConfig, ZepVectorStore

ZEP_API_URL = os.environ.get("ZEP_API_URL", "http://localhost:8000")
ZEP_API_KEY = os.environ.get("ZEP_API_KEY", None)
ZEP_COLLECTION_NAME = os.environ.get("ZEP_COLLECTION", "langchaintest")

collection_config = CollectionConfig(
    name=ZEP_COLLECTION_NAME,
    description="Zep collection for LangChain",
    metadata={},
    embedding_dimensions=1536,
    is_auto_embedded=True,
)

# Load
loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()

# Split
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)

# Add to vectorDB
vectorstore = ZepVectorStore.from_documents(
    documents=all_splits,
    collection_name=ZEP_COLLECTION_NAME,
    config=collection_config,
    api_url=ZEP_API_URL,
    api_key=ZEP_API_KEY,
    # Zep embeds documents server-side (is_auto_embedded=True), so a placeholder
    # embedding function is sufficient here
    embedding=FakeEmbeddings(size=1),
)
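The splitter above cuts the page into roughly 500-character chunks with no overlap. A pure-Python sketch of the fixed-window idea behind `chunk_size`/`chunk_overlap` (a hypothetical `split_text` helper, not the LangChain implementation, which additionally prefers to break on separators such as paragraph boundaries):

```python
def split_text(text: str, chunk_size: int = 500, chunk_overlap: int = 0) -> list:
    # Slide a fixed-size window over the text; it advances by size minus overlap,
    # so consecutive chunks share chunk_overlap characters.
    step = chunk_size - chunk_overlap
    return [text[i : i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("a" * 1200)
print([len(c) for c in chunks])  # [500, 500, 200]
```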
1673  templates/rag-conversation-zep/poetry.lock  (generated, new file)
File diff suppressed because it is too large
28  templates/rag-conversation-zep/pyproject.toml  (new file)
@@ -0,0 +1,28 @@
[tool.poetry]
name = "rag_conversation_zep"
version = "0.0.1"
description = "A RAG application built with Zep. Zep provides a VectorStore implementation to the chain."
authors = ["Daniel Chalef <daniel@getzep.com>"]
readme = "README.md"

[tool.poetry.dependencies]
python = ">=3.8.1,<4.0"
langchain = ">=0.0.313, <0.1"
openai = "^0.28.1"
zep-python = "^1.4.0"
tiktoken = "^0.5.1"
beautifulsoup4 = "^4.12.2"
bs4 = "^0.0.1"

[tool.poetry.group.dev.dependencies]
langchain-cli = ">=0.0.15"
fastapi = "^0.104.0"
sse-starlette = "^1.6.5"

[tool.langserve]
export_module = "rag_conversation_zep"
export_attr = "chain"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
3  templates/rag-conversation-zep/rag_conversation_zep/__init__.py  (new file)
@@ -0,0 +1,3 @@
from rag_conversation_zep.chain import chain

__all__ = ["chain"]
144  templates/rag-conversation-zep/rag_conversation_zep/chain.py  (new file)
@@ -0,0 +1,144 @@
import os
from operator import itemgetter
from typing import List, Tuple

from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.prompts.prompt import PromptTemplate
from langchain.pydantic_v1 import BaseModel, Field
from langchain.schema import AIMessage, HumanMessage, format_document
from langchain.schema.document import Document
from langchain.schema.messages import BaseMessage
from langchain.schema.output_parser import StrOutputParser
from langchain.schema.runnable import (
    ConfigurableField,
    RunnableBranch,
    RunnableLambda,
    RunnableMap,
    RunnablePassthrough,
)
from langchain.schema.runnable.utils import ConfigurableFieldSingleOption
from langchain.vectorstores.zep import CollectionConfig, ZepVectorStore

ZEP_API_URL = os.environ.get("ZEP_API_URL", "http://localhost:8000")
ZEP_API_KEY = os.environ.get("ZEP_API_KEY", None)
ZEP_COLLECTION_NAME = os.environ.get("ZEP_COLLECTION", "langchaintest")

collection_config = CollectionConfig(
    name=ZEP_COLLECTION_NAME,
    description="Zep collection for LangChain",
    metadata={},
    embedding_dimensions=1536,
    is_auto_embedded=True,
)

vectorstore = ZepVectorStore(
    collection_name=ZEP_COLLECTION_NAME,
    config=collection_config,
    api_url=ZEP_API_URL,
    api_key=ZEP_API_KEY,
    embedding=None,
)

# Zep offers native, hardware-accelerated MMR. Enabling this will improve
# the diversity of results, but may also reduce relevance. You can tune
# the lambda parameter to control the tradeoff between relevance and diversity.
# Enabling is a good default.
retriever = vectorstore.as_retriever().configurable_fields(
    search_type=ConfigurableFieldSingleOption(
        id="search_type",
        options={"Similarity": "similarity", "Similarity with MMR Reranking": "mmr"},
        default="mmr",
        name="Search Type",
        description="Type of search to perform: 'similarity' or 'mmr'",
    ),
    search_kwargs=ConfigurableField(
        id="search_kwargs",
        name="Search kwargs",
        description=(
            "Specify 'k' for number of results to return and 'lambda_mult' for tuning"
            " MMR relevance vs diversity."
        ),
    ),
)

# Condense a chat history and follow-up question into a standalone question
_template = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:"""  # noqa: E501
CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template(_template)

# RAG answer synthesis prompt
template = """Answer the question based only on the following context:
<context>
{context}
</context>"""
ANSWER_PROMPT = ChatPromptTemplate.from_messages(
    [
        ("system", template),
        MessagesPlaceholder(variable_name="chat_history"),
        ("user", "{question}"),
    ]
)

# Conversational Retrieval Chain
DEFAULT_DOCUMENT_PROMPT = PromptTemplate.from_template(template="{page_content}")


def _combine_documents(
    docs: List[Document],
    document_prompt: PromptTemplate = DEFAULT_DOCUMENT_PROMPT,
    document_separator: str = "\n\n",
):
    doc_strings = [format_document(doc, document_prompt) for doc in docs]
    return document_separator.join(doc_strings)


def _format_chat_history(chat_history: List[Tuple[str, str]]) -> List[BaseMessage]:
    buffer: List[BaseMessage] = []
    for human, ai in chat_history:
        buffer.append(HumanMessage(content=human))
        buffer.append(AIMessage(content=ai))
    return buffer


_condense_chain = (
    RunnablePassthrough.assign(
        chat_history=lambda x: _format_chat_history(x["chat_history"])
    )
    | CONDENSE_QUESTION_PROMPT
    | ChatOpenAI(temperature=0)
    | StrOutputParser()
)

_search_query = RunnableBranch(
    # If the input includes chat_history, we condense it with the follow-up question
    (
        RunnableLambda(lambda x: bool(x.get("chat_history"))).with_config(
            run_name="HasChatHistoryCheck"
        ),
        # Condense follow-up question and chat history into a standalone question
        _condense_chain,
    ),
    # Else, we have no chat history, so just pass through the question
    RunnableLambda(itemgetter("question")),
)


# User input
class ChatHistory(BaseModel):
    chat_history: List[Tuple[str, str]] = Field(..., extra={"widget": {"type": "chat"}})
    question: str


_inputs = RunnableMap(
    {
        "question": lambda x: x["question"],
        "chat_history": lambda x: _format_chat_history(x["chat_history"]),
        "context": _search_query | retriever | _combine_documents,
    }
).with_types(input_type=ChatHistory)

chain = _inputs | ANSWER_PROMPT | ChatOpenAI() | StrOutputParser()
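The `RunnableBranch` above implements the routing at the heart of the conversational chain: when `chat_history` is non-empty, the condense chain rewrites the follow-up into a standalone question before retrieval; otherwise the raw question passes through. A plain-Python sketch of that routing, with a stand-in for the LLM condense step (the function names here are hypothetical, not the template's API):

```python
from typing import Callable

def search_query(inputs: dict, condense: Callable[[dict], str]) -> str:
    # Mirrors _search_query: condense only when chat_history is non-empty
    if inputs.get("chat_history"):
        return condense(inputs)
    return inputs["question"]

# Stand-in for CONDENSE_QUESTION_PROMPT | ChatOpenAI(temperature=0) | StrOutputParser()
def fake_condense(inputs: dict) -> str:
    return "standalone: " + inputs["question"]

print(search_query({"question": "And step 2?", "chat_history": [("hi", "hello")]}, fake_condense))
# standalone: And step 2?
print(search_query({"question": "What is task decomposition?", "chat_history": []}, fake_condense))
# What is task decomposition?
```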
0  templates/rag-conversation-zep/tests/__init__.py  (new file)