Harrison/redis cache (#3766)
Co-authored-by: Tyler Hutcherson <tyler.hutcherson@redis.com>

docs/ecosystem/redis.md (new file, 79 lines)

# Redis

This page covers how to use the [Redis](https://redis.com) ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Redis wrappers.

## Installation and Setup

- Install the Redis Python SDK with `pip install redis`

## Wrappers

### Cache

The Cache wrapper allows [Redis](https://redis.io) to be used as a remote, low-latency, in-memory cache for LLM prompts and responses.

#### Standard Cache

The standard cache is the bread-and-butter Redis use case in production, for both [open source](https://redis.io) and [enterprise](https://redis.com) users globally.

To import this cache:

```python
from langchain.cache import RedisCache
```

To use this cache with your LLMs:

```python
import langchain
import redis

# connect to your Redis instance (e.g. redis://localhost:6379)
redis_client = redis.Redis.from_url(...)
langchain.llm_cache = RedisCache(redis_client)
```

#### Semantic Cache

Semantic caching allows users to retrieve cached prompts based on semantic similarity between the user input and previously cached results. Under the hood, it blends Redis as both a cache and a vectorstore.

To import this cache:

```python
from langchain.cache import RedisSemanticCache
```

To use this cache with your LLMs:

```python
import langchain
import redis

# use any embedding provider...
# (FakeEmbeddings is a test helper; swap in a real provider such as OpenAIEmbeddings)
from tests.integration_tests.vectorstores.fake_embeddings import FakeEmbeddings

redis_url = "redis://localhost:6379"

langchain.llm_cache = RedisSemanticCache(
    embedding=FakeEmbeddings(),
    redis_url=redis_url
)
```
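
Once configured, a prompt that is semantically similar to one already cached can be served from Redis. A minimal sketch of the behavior, mirroring the caching notebook changed in this commit (the model name here is an illustrative assumption):

```python
from langchain.llms import OpenAI

llm = OpenAI(model_name="text-davinci-002")  # any LLM works; the name is illustrative
llm("Tell me a joke")      # first call: computed by the LLM and cached
llm("Tell me one joke")    # semantically similar prompt: answered from the cache
```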

### VectorStore

The vectorstore wrapper turns Redis into a low-latency [vector database](https://redis.com/solutions/use-cases/vector-database/) for semantic search or LLM content retrieval.

To import this vectorstore:

```python
from langchain.vectorstores import Redis
```

For a more detailed walkthrough of the Redis vectorstore wrapper, see [this notebook](../modules/indexes/vectorstores/examples/redis.ipynb).
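
As a minimal sketch of typical usage (the texts, index name, and embedding model below are illustrative assumptions, not part of this page):

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Redis

# Index a few documents in Redis and run a similarity search against them.
rds = Redis.from_texts(
    ["foo", "bar", "baz"],
    OpenAIEmbeddings(),
    redis_url="redis://localhost:6379",
    index_name="example",
)
docs = rds.similarity_search("foo")
```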

### Retriever

The Redis vectorstore retriever wrapper generalizes the vectorstore class to perform low-latency document retrieval. To create the retriever, simply call `.as_retriever()` on the base vectorstore class.
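
For example, continuing the vectorstore sketch above (hedged; `rds` is the Redis vectorstore instance created there):

```python
# Expose the Redis vectorstore as a retriever and fetch relevant documents.
retriever = rds.as_retriever()
docs = retriever.get_relevant_documents("foo")
```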

### Memory

Redis can be used to persist LLM conversations.

#### Vector Store Retriever Memory

For a more detailed walkthrough of the `VectorStoreRetrieverMemory` wrapper, see [this notebook](../modules/memory/types/vectorstore_retriever_memory.ipynb).
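
A minimal sketch of backing conversation memory with the Redis retriever (the retriever setup and example inputs are illustrative assumptions):

```python
from langchain.memory import VectorStoreRetrieverMemory

# Store and recall conversation snippets via the Redis vectorstore retriever.
memory = VectorStoreRetrieverMemory(retriever=rds.as_retriever())
memory.save_context({"input": "My favorite food is pizza"}, {"output": "Good to know"})
memory.load_memory_variables({"prompt": "What should I order for dinner?"})
```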

#### Chat Message History Memory

For a detailed example of using Redis to cache conversation message history, see [this notebook](../modules/memory/examples/redis_chat_message_history.ipynb).
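
A minimal sketch, assuming a local Redis instance and an arbitrary session id:

```python
from langchain.memory import RedisChatMessageHistory

# Persist a conversation's messages in Redis, keyed by session id.
history = RedisChatMessageHistory("example_session", url="redis://localhost:6379/0")
history.add_user_message("hi!")
history.add_ai_message("whats up?")
print(history.messages)
```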

@@ -41,7 +41,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 5,
    "id": "f69f6283",
    "metadata": {},
    "outputs": [],
@@ -52,7 +52,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 6,
    "id": "64005d1f",
    "metadata": {},
    "outputs": [
@@ -60,8 +60,8 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "CPU times: user 14.2 ms, sys: 4.9 ms, total: 19.1 ms\n",
-      "Wall time: 1.1 s\n"
+      "CPU times: user 26.1 ms, sys: 21.5 ms, total: 47.6 ms\n",
+      "Wall time: 1.68 s\n"
      ]
     },
     {
@@ -70,7 +70,7 @@
       "'\\n\\nWhy did the chicken cross the road?\\n\\nTo get to the other side.'"
      ]
     },
-    "execution_count": 4,
+    "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -83,7 +83,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 7,
    "id": "c8a1cb2b",
    "metadata": {},
    "outputs": [
@@ -91,8 +91,8 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "CPU times: user 162 µs, sys: 7 µs, total: 169 µs\n",
-      "Wall time: 175 µs\n"
+      "CPU times: user 238 µs, sys: 143 µs, total: 381 µs\n",
+      "Wall time: 1.76 ms\n"
      ]
     },
     {
@@ -101,7 +101,7 @@
       "'\\n\\nWhy did the chicken cross the road?\\n\\nTo get to the other side.'"
      ]
     },
-    "execution_count": 5,
+    "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -214,9 +214,18 @@
     "## Redis Cache"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "c5c9a4d5",
+   "metadata": {},
+   "source": [
+    "### Standard Cache\n",
+    "Use [Redis](../../../../ecosystem/redis.md) to cache prompts and responses."
+   ]
+  },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 8,
    "id": "39f6eb0b",
    "metadata": {},
    "outputs": [],
@@ -225,15 +234,35 @@
    "# (make sure your local Redis instance is running first before running this example)\n",
    "from redis import Redis\n",
    "from langchain.cache import RedisCache\n",
    "\n",
    "langchain.llm_cache = RedisCache(redis_=Redis())"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 9,
   "id": "28920749",
   "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CPU times: user 6.88 ms, sys: 8.75 ms, total: 15.6 ms\n",
+      "Wall time: 1.04 s\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "'\\n\\nWhy did the chicken cross the road?\\n\\nTo get to the other side!'"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
   "source": [
    "%%time\n",
    "# The first time, it is not yet in cache, so it should take longer\n",
@@ -242,16 +271,124 @@
  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 14,
   "id": "94bf9415",
   "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CPU times: user 1.59 ms, sys: 610 µs, total: 2.2 ms\n",
+      "Wall time: 5.58 ms\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "'\\n\\nWhy did the chicken cross the road?\\n\\nTo get to the other side!'"
+      ]
+     },
+     "execution_count": 14,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
   "source": [
    "%%time\n",
    "# The second time it is, so it goes faster\n",
    "llm(\"Tell me a joke\")"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "id": "82be23f6",
+   "metadata": {},
+   "source": [
+    "### Semantic Cache\n",
+    "Use [Redis](../../../../ecosystem/redis.md) to cache prompts and responses and evaluate hits based on semantic similarity."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "id": "64df3099",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain.embeddings import OpenAIEmbeddings\n",
+    "from langchain.cache import RedisSemanticCache\n",
+    "\n",
+    "\n",
+    "langchain.llm_cache = RedisSemanticCache(\n",
+    "    redis_url=\"redis://localhost:6379\",\n",
+    "    embedding=OpenAIEmbeddings()\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "id": "8e91d3ac",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CPU times: user 351 ms, sys: 156 ms, total: 507 ms\n",
+      "Wall time: 3.37 s\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "\"\\n\\nWhy don't scientists trust atoms?\\nBecause they make up everything.\""
+      ]
+     },
+     "execution_count": 16,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "%%time\n",
+    "# The first time, it is not yet in cache, so it should take longer\n",
+    "llm(\"Tell me a joke\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 27,
+   "id": "df856948",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CPU times: user 6.25 ms, sys: 2.72 ms, total: 8.97 ms\n",
+      "Wall time: 262 ms\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "\"\\n\\nWhy don't scientists trust atoms?\\nBecause they make up everything.\""
+      ]
+     },
+     "execution_count": 27,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "%%time\n",
+    "# The second time, while not a direct hit, the question is semantically similar to the original question,\n",
+    "# so it uses the cached result!\n",
+    "llm(\"Tell me one joke\")"
+   ]
+  },
  {
   "cell_type": "markdown",
   "id": "684eab55",