feat: add Momento as a standard cache and chat message history provider (#5221)

# Add Momento as a standard cache and chat message history provider

This PR adds Momento as a standard caching provider. It implements the cache interface and adds integration tests and documentation. It also adds Momento as a chat message history provider, along with integration tests and documentation.

[Momento](https://www.gomomento.com/) is a fully serverless cache. Similar to S3 or DynamoDB, it requires zero configuration or infrastructure management and is instantly available. Users sign up for free and get 50GB of data in/out for free every month.

## Before submitting

✅ We have added documentation, notebooks, and integration tests demonstrating usage.

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
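
For reviewers unfamiliar with the cache interface mentioned above, the following is a minimal sketch of the lookup/update/clear contract that an LLM cache provider fulfils, with entries expiring after a TTL. It is not the Momento-backed implementation from this PR: the class name `InMemoryTTLCache`, the injectable `clock` parameter, and the use of plain strings as cached values are assumptions made purely for illustration, and no Momento SDK calls are involved.

```python
import time
from datetime import timedelta
from typing import Callable, Dict, List, Optional, Tuple


class InMemoryTTLCache:
    """Hypothetical in-memory stand-in showing the shape of an LLM cache."""

    def __init__(
        self,
        ttl: timedelta,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if ttl <= timedelta(0):
            raise ValueError("ttl must be positive")
        self._ttl_seconds = ttl.total_seconds()
        self._clock = clock  # injectable so expiry can be tested without sleeping
        # Maps (prompt, llm_string) -> (expiry timestamp, cached responses)
        self._store: Dict[Tuple[str, str], Tuple[float, List[str]]] = {}

    def lookup(self, prompt: str, llm_string: str) -> Optional[List[str]]:
        """Return cached responses, or None on a miss or an expired entry."""
        key = (prompt, llm_string)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # drop the stale entry
            return None
        return value

    def update(self, prompt: str, llm_string: str, return_val: List[str]) -> None:
        """Store responses for this prompt and model configuration."""
        expires_at = self._clock() + self._ttl_seconds
        self._store[(prompt, llm_string)] = (expires_at, list(return_val))

    def clear(self) -> None:
        """Remove every cached entry."""
        self._store.clear()


# Usage: a miss before the update, a hit afterwards.
cache = InMemoryTTLCache(ttl=timedelta(days=1))
print(cache.lookup("Tell me a joke", "fake-llm"))
cache.update("Tell me a joke", "fake-llm", ["Why did the chicken cross the road?"])
print(cache.lookup("Tell me a joke", "fake-llm"))
```

Keying on both the prompt and a string describing the model configuration keeps responses from different models or settings from colliding; a remote cache would typically serialize such a key rather than use a tuple.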
2025-09-05 21:12:48 +00:00 · 2023-05-25 19:13:21 -07:00
parent 56ad56c812
commit 7047a2c1af
11 changed files with 889 additions and 9 deletions
--- a/docs/integrations/momento.md
+++ b/docs/integrations/momento.md
@@ -0,0 +1,53 @@
+# Momento
+
+This page covers how to use the [Momento](https://gomomento.com) ecosystem within LangChain.
+It is broken into two parts: installation and setup, and then references to specific Momento wrappers.
+
+## Installation and Setup
+
+- Sign up for a free account [here](https://docs.momentohq.com/getting-started) and get an auth token
+- Install the Momento Python SDK with `pip install momento`
+
+## Wrappers
+
+### Cache
+
+The Cache wrapper allows for [Momento](https://gomomento.com) to be used as a serverless, distributed, low-latency cache for LLM prompts and responses.
+
+#### Standard Cache
+
+The standard cache is the go-to use case for [Momento](https://gomomento.com) users in any environment.
+
+Import the cache as follows:
+
+```python
+from langchain.cache import MomentoCache
+```
+
+Then set it up as follows:
+
+```python
+from datetime import timedelta
+from momento import CacheClient, Configurations, CredentialProvider
+import langchain
+
+# Instantiate the Momento client
+cache_client = CacheClient(
+    Configurations.Laptop.v1(),
+    CredentialProvider.from_environment_variable("MOMENTO_AUTH_TOKEN"),
+    default_ttl=timedelta(days=1))
+
+# Pick a name for the Momento cache
+cache_name = "langchain"
+
+# Instantiate the LLM cache
+langchain.llm_cache = MomentoCache(cache_client, cache_name)
+```
+
+### Memory
+
+Momento can be used as a distributed memory store for LLMs.
+
+#### Chat Message History Memory
+
+See [this notebook](../modules/memory/examples/momento_chat_message_history.ipynb) for a walkthrough of how to use Momento as a memory store for chat message history.
--- a/docs/modules/memory/examples/momento_chat_message_history.ipynb
+++ b/docs/modules/memory/examples/momento_chat_message_history.ipynb
@@ -0,0 +1,86 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "91c6a7ef",
+   "metadata": {},
+   "source": [
+    "# Momento\n",
+    "\n",
+    "This notebook goes over how to use [Momento Cache](https://gomomento.com) to store chat message history using the `MomentoChatMessageHistory` class. See the Momento [docs](https://docs.momentohq.com/getting-started) for more detail on how to get set up with Momento.\n",
+    "\n",
+    "Note that, by default, a cache is created if one with the given name doesn't already exist.\n",
+    "\n",
+    "You'll need a Momento auth token to use this class. It can be passed to a `momento.CacheClient` if you'd like to instantiate that directly, passed as the named parameter `auth_token` to `MomentoChatMessageHistory.from_client_params`, or set as the environment variable `MOMENTO_AUTH_TOKEN`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "d15e3302",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from datetime import timedelta\n",
+    "\n",
+    "from langchain.memory import MomentoChatMessageHistory\n",
+    "\n",
+    "session_id = \"foo\"\n",
+    "cache_name = \"langchain\"\n",
+    "ttl = timedelta(days=1)\n",
+    "history = MomentoChatMessageHistory.from_client_params(\n",
+    "    session_id, \n",
+    "    cache_name,\n",
+    "    ttl,\n",
+    ")\n",
+    "\n",
+    "history.add_user_message(\"hi!\")\n",
+    "\n",
+    "history.add_ai_message(\"whats up?\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "64fc465e",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[HumanMessage(content='hi!', additional_kwargs={}, example=False),\n",
+       " AIMessage(content='whats up?', additional_kwargs={}, example=False)]"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "history.messages"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
--- a/docs/modules/models/llms/examples/llm_caching.ipynb
+++ b/docs/modules/models/llms/examples/llm_caching.ipynb
@@ -41,7 +41,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 6,
   "id": "f69f6283",
   "metadata": {},
   "outputs": [],
@@ -612,6 +612,115 @@
    "llm(\"Tell me joke\")"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "id": "726fe754",
+   "metadata": {},
+   "source": [
+    "## Momento Cache\n",
+    "Use [Momento](../../../../integrations/momento.md) to cache prompts and responses.\n",
+    "\n",
+    "Requires the `momento` package; uncomment the line below to install it:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e8949f29",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# !pip install momento"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "56ea6a08",
+   "metadata": {},
+   "source": [
+    "You'll need a Momento auth token to use this class. It can be passed to a `momento.CacheClient` if you'd like to instantiate that directly, passed as the named parameter `auth_token` to `MomentoCache.from_client_params`, or set as the environment variable `MOMENTO_AUTH_TOKEN`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "id": "2005f03a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from datetime import timedelta\n",
+    "\n",
+    "from langchain.cache import MomentoCache\n",
+    "\n",
+    "\n",
+    "cache_name = \"langchain\"\n",
+    "ttl = timedelta(days=1)\n",
+    "langchain.llm_cache = MomentoCache.from_client_params(cache_name, ttl)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "id": "c6a6c238",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CPU times: user 40.7 ms, sys: 16.5 ms, total: 57.2 ms\n",
+      "Wall time: 1.73 s\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "'\\n\\nWhy did the chicken cross the road?\\n\\nTo get to the other side!'"
+      ]
+     },
+     "execution_count": 10,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "%%time\n",
+    "# The first time, it is not yet in cache, so it should take longer\n",
+    "llm(\"Tell me a joke\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "id": "b8f78f9d",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CPU times: user 3.16 ms, sys: 2.98 ms, total: 6.14 ms\n",
+      "Wall time: 57.9 ms\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "'\\n\\nWhy did the chicken cross the road?\\n\\nTo get to the other side!'"
+      ]
+     },
+     "execution_count": 11,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "%%time\n",
+    "# The second time it is, so it goes faster\n",
+    "# When run in the same region as the cache, latencies are single-digit ms\n",
+    "llm(\"Tell me a joke\")"
+   ]
+  },
  {
   "cell_type": "markdown",
   "id": "934943dc",
@@ -909,9 +1018,9 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "venv",
   "language": "python",
-   "name": "python3"
+   "name": "venv"
  },
  "language_info": {
   "codemirror_mode": {
@@ -923,7 +1032,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.8.8"
+   "version": "3.11.3"
  }
 },
 "nbformat": 4,