docs: self-query consistency (#10502)

The [`self-querying`
navbar](https://python.langchain.com/docs/modules/data_connection/retrievers/self_query/)
repeats `self-querying` in each menu item. I've simplified
it to be more readable:
- removed `self-querying` from the title of each page;
- added descriptions to the vector stores;
- added a description and a link to the Integration Card
(`integrations/providers`) of each vector store where they were missing.
Leonid Ganeline 2023-09-13 14:43:04 -07:00 committed by GitHub
parent 415d38ae62
commit f4e6eac3b6
18 changed files with 336 additions and 214 deletions

View File

@ -1,15 +1,20 @@
# Milvus # Milvus
This page covers how to use the Milvus ecosystem within LangChain. >[Milvus](https://milvus.io/docs/overview.md) is a database that stores, indexes, and manages
It is broken into two parts: installation and setup, and then references to specific Milvus wrappers. > massive embedding vectors generated by deep neural networks and other machine learning (ML) models.
## Installation and Setup ## Installation and Setup
- Install the Python SDK with `pip install pymilvus`
## Wrappers
### VectorStore Install the Python SDK:
There exists a wrapper around Milvus indexes, allowing you to use it as a vectorstore, ```bash
pip install pymilvus
```
## Vector Store
There exists a wrapper around `Milvus` indexes, allowing you to use it as a vectorstore,
whether for semantic search or example selection. whether for semantic search or example selection.
To import this vectorstore: To import this vectorstore:
@ -17,4 +22,4 @@ To import this vectorstore:
from langchain.vectorstores import Milvus from langchain.vectorstores import Milvus
``` ```
For a more detailed walkthrough of the Miluvs wrapper, see [this notebook](/docs/integrations/vectorstores/milvus.html) For a more detailed walkthrough of the `Miluvs` wrapper, see [this notebook](/docs/integrations/vectorstores/milvus.html)

View File

@ -1,10 +1,12 @@
# Pinecone # Pinecone
This page covers how to use the Pinecone ecosystem within LangChain. >[Pinecone](https://docs.pinecone.io/docs/overview) is a vector database with broad functionality.
It is broken into two parts: installation and setup, and then references to specific Pinecone wrappers.
## Installation and Setup ## Installation and Setup
Install the Python SDK: Install the Python SDK:
```bash ```bash
pip install pinecone-client pip install pinecone-client
``` ```

View File

@ -1,15 +1,22 @@
# Qdrant # Qdrant
This page covers how to use the Qdrant ecosystem within LangChain. >[Qdrant](https://qdrant.tech/documentation/) (read: quadrant) is a vector similarity search engine.
It is broken into two parts: installation and setup, and then references to specific Qdrant wrappers. > It provides a production-ready service with a convenient API to store, search, and manage
> points - vectors with an additional payload. `Qdrant` is tailored to extended filtering support.
## Installation and Setup ## Installation and Setup
- Install the Python SDK with `pip install qdrant-client`
## Wrappers
### VectorStore Install the Python SDK:
There exists a wrapper around Qdrant indexes, allowing you to use it as a vectorstore, ```bash
pip install qdrant-client
```
## Vector Store
There exists a wrapper around `Qdrant` indexes, allowing you to use it as a vectorstore,
whether for semantic search or example selection. whether for semantic search or example selection.
To import this vectorstore: To import this vectorstore:

View File

@ -1,18 +1,26 @@
# Redis # Redis
>[Redis](https://redis.com) is an open-source key-value store that can be used as a cache,
> message broker, database, vector database and more.
This page covers how to use the [Redis](https://redis.com) ecosystem within LangChain. This page covers how to use the [Redis](https://redis.com) ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Redis wrappers. It is broken into two parts: installation and setup, and then references to specific Redis wrappers.
## Installation and Setup ## Installation and Setup
- Install the Redis Python SDK with `pip install redis`
Install the Python SDK:
```bash
pip install redis
```
## Wrappers ## Wrappers
All wrappers needing a redis url connection string to connect to the database support either a stand alone Redis server All wrappers need a redis url connection string to connect to the database support either a stand alone Redis server
or a High-Availability setup with Replication and Redis Sentinels. or a High-Availability setup with Replication and Redis Sentinels.
### Redis Standalone connection url ### Redis Standalone connection url
For standalone Redis server the official redis connection url formats can be used as describe in the python redis modules For standalone `Redis` server, the official redis connection url formats can be used as describe in the python redis modules
"from_url()" method [Redis.from_url](https://redis-py.readthedocs.io/en/stable/connections.html#redis.Redis.from_url) "from_url()" method [Redis.from_url](https://redis-py.readthedocs.io/en/stable/connections.html#redis.Redis.from_url)
Example: `redis_url = "redis://:secret-pass@localhost:6379/0"` Example: `redis_url = "redis://:secret-pass@localhost:6379/0"`
@ -20,7 +28,7 @@ Example: `redis_url = "redis://:secret-pass@localhost:6379/0"`
### Redis Sentinel connection url ### Redis Sentinel connection url
For [Redis sentinel setups](https://redis.io/docs/management/sentinel/) the connection scheme is "redis+sentinel". For [Redis sentinel setups](https://redis.io/docs/management/sentinel/) the connection scheme is "redis+sentinel".
This is an un-offical extensions to the official IANA registered protocol schemes as long as there is no connection url This is an unofficial extensions to the official IANA registered protocol schemes as long as there is no connection url
for Sentinels available. for Sentinels available.
Example: `redis_url = "redis+sentinel://:secret-pass@sentinel-host:26379/mymaster/0"` Example: `redis_url = "redis+sentinel://:secret-pass@sentinel-host:26379/mymaster/0"`
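
The two connection URL layouts shown in the Redis page can be illustrated with a small standard-library sketch. This is not how LangChain or redis-py parse the URL; `describe_redis_url` is a hypothetical helper, and the assumption that the sentinel path is `/<service name>/<db>` is read off the example URL above.

```python
from urllib.parse import urlsplit


def describe_redis_url(redis_url: str) -> dict:
    """Split a Redis connection URL into its parts (hypothetical helper)."""
    parts = urlsplit(redis_url)
    # Non-empty path segments, e.g. "/mymaster/0" -> ["mymaster", "0"]
    segments = [segment for segment in parts.path.split("/") if segment]

    info = {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "password": parts.password,
    }
    if parts.scheme == "redis+sentinel":
        # Assumed layout: /<sentinel service name>/<db number>
        info["service_name"] = segments[0] if segments else None
        info["db"] = int(segments[1]) if len(segments) > 1 else 0
    else:
        # Standalone layout: /<db number>
        info["db"] = int(segments[0]) if segments else 0
    return info


standalone = describe_redis_url("redis://:secret-pass@localhost:6379/0")
sentinel = describe_redis_url(
    "redis+sentinel://:secret-pass@sentinel-host:26379/mymaster/0"
)
```

Here `standalone` describes a single server and database, while `sentinel` additionally carries the service name that the Sentinels monitor.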

View File

@ -1,17 +1,18 @@
# Vectara # Vectara
>[Vectara](https://docs.vectara.com/docs/) is a GenAI platform for developers. It provides a simple API to build Grounded Generation
What is Vectara? >(aka Retrieval-augmented-generation or RAG) applications.
**Vectara Overview:** **Vectara Overview:**
- Vectara is developer-first API platform for building GenAI applications - `Vectara` is developer-first API platform for building GenAI applications
- To use Vectara - first [sign up](https://console.vectara.com/signup) and create an account. Then create a corpus and an API key for indexing and searching. - To use Vectara - first [sign up](https://console.vectara.com/signup) and create an account. Then create a corpus and an API key for indexing and searching.
- You can use Vectara's [indexing API](https://docs.vectara.com/docs/indexing-apis/indexing) to add documents into Vectara's index - You can use Vectara's [indexing API](https://docs.vectara.com/docs/indexing-apis/indexing) to add documents into Vectara's index
- You can use Vectara's [Search API](https://docs.vectara.com/docs/search-apis/search) to query Vectara's index (which also supports Hybrid search implicitly). - You can use Vectara's [Search API](https://docs.vectara.com/docs/search-apis/search) to query Vectara's index (which also supports Hybrid search implicitly).
- You can use Vectara's integration with LangChain as a Vector store or using the Retriever abstraction. - You can use Vectara's integration with LangChain as a Vector store or using the Retriever abstraction.
## Installation and Setup ## Installation and Setup
To use Vectara with LangChain no special installation steps are required.
To use `Vectara` with LangChain no special installation steps are required.
To get started, follow our [quickstart](https://docs.vectara.com/docs/quickstart) guide to create an account, a corpus and an API key. To get started, follow our [quickstart](https://docs.vectara.com/docs/quickstart) guide to create an account, a corpus and an API key.
Once you have these, you can provide them as arguments to the Vectara vectorstore, or you can set them as environment variables. Once you have these, you can provide them as arguments to the Vectara vectorstore, or you can set them as environment variables.
@ -19,9 +20,8 @@ Once you have these, you can provide them as arguments to the Vectara vectorstor
- export `VECTARA_CORPUS_ID`="your_corpus_id" - export `VECTARA_CORPUS_ID`="your_corpus_id"
- export `VECTARA_API_KEY`="your-vectara-api-key" - export `VECTARA_API_KEY`="your-vectara-api-key"
## Usage
### VectorStore ## Vector Store
There exists a wrapper around the Vectara platform, allowing you to use it as a vectorstore, whether for semantic search or example selection. There exists a wrapper around the Vectara platform, allowing you to use it as a vectorstore, whether for semantic search or example selection.

View File

@ -1,10 +1,10 @@
# Weaviate # Weaviate
This page covers how to use the Weaviate ecosystem within LangChain. >[Weaviate](https://weaviate.io/) is an open-source vector database. It allows you to store data objects and vector embeddings from
>your favorite ML models, and scale seamlessly into billions of data objects.
What is Weaviate?
**Weaviate in a nutshell:** What is `Weaviate`?
- Weaviate is an open-source database of the type vector search engine. - Weaviate is an open-source database of the type vector search engine.
- Weaviate allows you to store JSON documents in a class property-like fashion while attaching machine learning vectors to these documents to represent them in vector space. - Weaviate allows you to store JSON documents in a class property-like fashion while attaching machine learning vectors to these documents to represent them in vector space.
- Weaviate can be used stand-alone (aka bring your vectors) or with a variety of modules that can do the vectorization for you and extend the core capabilities. - Weaviate can be used stand-alone (aka bring your vectors) or with a variety of modules that can do the vectorization for you and extend the core capabilities.
@ -14,15 +14,20 @@ What is Weaviate?
**Weaviate in detail:** **Weaviate in detail:**
Weaviate is a low-latency vector search engine with out-of-the-box support for different media types (text, images, etc.). It offers Semantic Search, Question-Answer Extraction, Classification, Customizable Models (PyTorch/TensorFlow/Keras), etc. Built from scratch in Go, Weaviate stores both objects and vectors, allowing for combining vector search with structured filtering and the fault tolerance of a cloud-native database. It is all accessible through GraphQL, REST, and various client-side programming languages. `Weaviate` is a low-latency vector search engine with out-of-the-box support for different media types (text, images, etc.). It offers Semantic Search, Question-Answer Extraction, Classification, Customizable Models (PyTorch/TensorFlow/Keras), etc. Built from scratch in Go, Weaviate stores both objects and vectors, allowing for combining vector search with structured filtering and the fault tolerance of a cloud-native database. It is all accessible through GraphQL, REST, and various client-side programming languages.
## Installation and Setup ## Installation and Setup
- Install the Python SDK with `pip install weaviate-client`
## Wrappers
### VectorStore Install the Python SDK:
There exists a wrapper around Weaviate indexes, allowing you to use it as a vectorstore, ```bash
pip install weaviate-client
```
## Vector Store
There exists a wrapper around `Weaviate` indexes, allowing you to use it as a vectorstore,
whether for semantic search or example selection. whether for semantic search or example selection.
To import this vectorstore: To import this vectorstore:

View File

@ -6,11 +6,14 @@
"id": "13afcae7", "id": "13afcae7",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Deep Lake self-querying \n", "# Deep Lake\n",
"\n", "\n",
">[Deep Lake](https://www.activeloop.ai) is a multimodal database for building AI applications.\n", ">[Deep Lake](https://www.activeloop.ai) is a multimodal database for building AI applications\n",
">[Deep Lake](https://github.com/activeloopai/deeplake) is a database for AI.\n",
">Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version,\n",
"> & visualize any AI data. Stream data in real time to PyTorch/TensorFlow.\n",
"\n", "\n",
"In the notebook we'll demo the `SelfQueryRetriever` wrapped around a Deep Lake vector store. " "In the notebook, we'll demo the `SelfQueryRetriever` wrapped around a `Deep Lake` vector store. "
] ]
}, },
{ {

View File

@ -5,11 +5,11 @@
"id": "13afcae7", "id": "13afcae7",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Chroma self-querying \n", "# Chroma\n",
"\n", "\n",
">[Chroma](https://docs.trychroma.com/getting-started) is a database for building AI applications with embeddings.\n", ">[Chroma](https://docs.trychroma.com/getting-started) is a database for building AI applications with embeddings.\n",
"\n", "\n",
"In the notebook we'll demo the `SelfQueryRetriever` wrapped around a Chroma vector store. " "In the notebook, we'll demo the `SelfQueryRetriever` wrapped around a `Chroma` vector store. "
] ]
}, },
{ {
@ -447,7 +447,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.10.6" "version": "3.10.12"
} }
}, },
"nbformat": 4, "nbformat": 4,

View File

@ -2,20 +2,36 @@
"cells": [ "cells": [
{ {
"cell_type": "markdown", "cell_type": "markdown",
"source": [ "id": "59895c73d1a0f3ca",
"# DashVector self-querying\n",
"\n",
"> [DashVector](https://help.aliyun.com/document_detail/2510225.html) is a fully-managed vectorDB service that supports high-dimension dense and sparse vectors, real-time insertion and filtered search. It is built to scale automatically and can adapt to different application requirements.\n",
"\n",
"In this notebook we'll demo the `SelfQueryRetriever` with a `DashVector` vector store."
],
"metadata": { "metadata": {
"collapsed": false "collapsed": false,
"jupyter": {
"outputs_hidden": false
}
}, },
"id": "59895c73d1a0f3ca" "source": [
"# DashVector\n",
"\n",
"> [DashVector](https://help.aliyun.com/document_detail/2510225.html) is a fully managed vector DB service that supports high-dimension dense and sparse vectors, real-time insertion and filtered search. It is built to scale automatically and can adapt to different application requirements.\n",
"> The vector retrieval service `DashVector` is based on the `Proxima` core of the efficient vector engine independently developed by `DAMO Academy`,\n",
"> and provides a cloud-native, fully managed vector retrieval service with horizontal expansion capabilities.\n",
"> `DashVector` exposes its powerful vector management, vector query and other diversified capabilities through a simple and\n",
"> easy-to-use SDK/API interface, which can be quickly integrated by upper-layer AI applications, thereby providing services\n",
"> including large model ecology, multi-modal AI search, molecular structure A variety of application scenarios, including analysis,\n",
"> provide the required efficient vector retrieval capabilities.\n",
"\n",
"In this notebook, we'll demo the `SelfQueryRetriever` with a `DashVector` vector store."
]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"id": "539ae9367e45a178",
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [ "source": [
"## Create DashVector vectorstore\n", "## Create DashVector vectorstore\n",
"\n", "\n",
@ -24,46 +40,55 @@
"To use DashVector, you have to have `dashvector` package installed, and you must have an API key and an Environment. Here are the [installation instructions](https://help.aliyun.com/document_detail/2510223.html).\n", "To use DashVector, you have to have `dashvector` package installed, and you must have an API key and an Environment. Here are the [installation instructions](https://help.aliyun.com/document_detail/2510223.html).\n",
"\n", "\n",
"NOTE: The self-query retriever requires you to have `lark` package installed." "NOTE: The self-query retriever requires you to have `lark` package installed."
], ]
"metadata": {
"collapsed": false
},
"id": "539ae9367e45a178"
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 1, "execution_count": 1,
"id": "67df7e1f8dc8cdd0",
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [], "outputs": [],
"source": [ "source": [
"# !pip install lark dashvector" "# !pip install lark dashvector"
], ]
"metadata": {
"collapsed": false
},
"id": "67df7e1f8dc8cdd0"
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 1, "execution_count": 1,
"id": "ff61eaf13973b5fe",
"metadata": {
"ExecuteTime": {
"end_time": "2023-08-24T02:58:46.905337Z",
"start_time": "2023-08-24T02:58:46.252566Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [], "outputs": [],
"source": [ "source": [
"import os\n", "import os\n",
"import dashvector\n", "import dashvector\n",
"\n", "\n",
"client = dashvector.Client(api_key=os.environ[\"DASHVECTOR_API_KEY\"])" "client = dashvector.Client(api_key=os.environ[\"DASHVECTOR_API_KEY\"])"
], ]
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-08-24T02:58:46.905337Z",
"start_time": "2023-08-24T02:58:46.252566Z"
}
},
"id": "ff61eaf13973b5fe"
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"id": "de5c77957ee42d14",
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [], "outputs": [],
"source": [ "source": [
"from langchain.schema import Document\n", "from langchain.schema import Document\n",
@ -74,15 +99,22 @@
"\n", "\n",
"# create DashVector collection\n", "# create DashVector collection\n",
"client.create(\"langchain-self-retriever-demo\", dimension=1536)" "client.create(\"langchain-self-retriever-demo\", dimension=1536)"
], ]
"metadata": {
"collapsed": false
},
"id": "de5c77957ee42d14"
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 3, "execution_count": 3,
"id": "8f40605548a4550",
"metadata": {
"ExecuteTime": {
"end_time": "2023-08-24T02:59:08.090031Z",
"start_time": "2023-08-24T02:59:05.660295Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [], "outputs": [],
"source": [ "source": [
"docs = [\n", "docs = [\n",
@ -119,31 +151,37 @@
"vectorstore = DashVector.from_documents(\n", "vectorstore = DashVector.from_documents(\n",
" docs, embeddings, collection_name=\"langchain-self-retriever-demo\"\n", " docs, embeddings, collection_name=\"langchain-self-retriever-demo\"\n",
")" ")"
], ]
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-08-24T02:59:08.090031Z",
"start_time": "2023-08-24T02:59:05.660295Z"
}
},
"id": "8f40605548a4550"
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"id": "eb1340adafac8993",
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [ "source": [
"## Create your self-querying retriever\n", "## Create your self-querying retriever\n",
"\n", "\n",
"Now we can instantiate our retriever. To do this we'll need to provide some information upfront about the metadata fields that our documents support and a short description of the document contents." "Now we can instantiate our retriever. To do this we'll need to provide some information upfront about the metadata fields that our documents support and a short description of the document contents."
], ]
"metadata": {
"collapsed": false
},
"id": "eb1340adafac8993"
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 4, "execution_count": 4,
"id": "d65233dc044f95a7",
"metadata": {
"ExecuteTime": {
"end_time": "2023-08-24T02:59:11.003940Z",
"start_time": "2023-08-24T02:59:10.476722Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [], "outputs": [],
"source": [ "source": [
"from langchain.llms import Tongyi\n", "from langchain.llms import Tongyi\n",
@ -175,31 +213,37 @@
"retriever = SelfQueryRetriever.from_llm(\n", "retriever = SelfQueryRetriever.from_llm(\n",
" llm, vectorstore, document_content_description, metadata_field_info, verbose=True\n", " llm, vectorstore, document_content_description, metadata_field_info, verbose=True\n",
")" ")"
], ]
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-08-24T02:59:11.003940Z",
"start_time": "2023-08-24T02:59:10.476722Z"
}
},
"id": "d65233dc044f95a7"
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"id": "a54af0d67b473db6",
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [ "source": [
"## Testing it out\n", "## Testing it out\n",
"\n", "\n",
"And now we can try actually using our retriever!" "And now we can try actually using our retriever!"
], ]
"metadata": {
"collapsed": false
},
"id": "a54af0d67b473db6"
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 6, "execution_count": 6,
"id": "dad9da670a267fe7",
"metadata": {
"ExecuteTime": {
"end_time": "2023-08-24T02:59:28.577901Z",
"start_time": "2023-08-24T02:59:26.780184Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [ "outputs": [
{ {
"name": "stdout", "name": "stdout",
@ -210,7 +254,12 @@
}, },
{ {
"data": { "data": {
"text/plain": "[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'year': 1993, 'rating': 7.699999809265137, 'genre': 'action'}),\n Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'}),\n Document(page_content='Leo DiCaprio gets lost in a dream within a dream within a dream within a ...', metadata={'year': 2010, 'director': 'Christopher Nolan', 'rating': 8.199999809265137}),\n Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'year': 2006, 'director': 'Satoshi Kon', 'rating': 8.600000381469727})]" "text/plain": [
"[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'year': 1993, 'rating': 7.699999809265137, 'genre': 'action'}),\n",
" Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'}),\n",
" Document(page_content='Leo DiCaprio gets lost in a dream within a dream within a dream within a ...', metadata={'year': 2010, 'director': 'Christopher Nolan', 'rating': 8.199999809265137}),\n",
" Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'year': 2006, 'director': 'Satoshi Kon', 'rating': 8.600000381469727})]"
]
}, },
"execution_count": 6, "execution_count": 6,
"metadata": {}, "metadata": {},
@ -220,19 +269,22 @@
"source": [ "source": [
"# This example only specifies a relevant query\n", "# This example only specifies a relevant query\n",
"retriever.get_relevant_documents(\"What are some movies about dinosaurs\")" "retriever.get_relevant_documents(\"What are some movies about dinosaurs\")"
], ]
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-08-24T02:59:28.577901Z",
"start_time": "2023-08-24T02:59:26.780184Z"
}
},
"id": "dad9da670a267fe7"
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 7, "execution_count": 7,
"id": "d486a64316153d52",
"metadata": {
"ExecuteTime": {
"end_time": "2023-08-24T02:59:32.370774Z",
"start_time": "2023-08-24T02:59:30.614252Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [ "outputs": [
{ {
"name": "stdout", "name": "stdout",
@ -243,7 +295,10 @@
}, },
{ {
"data": { "data": {
"text/plain": "[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'director': 'Andrei Tarkovsky', 'rating': 9.899999618530273, 'genre': 'science fiction'}),\n Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'year': 2006, 'director': 'Satoshi Kon', 'rating': 8.600000381469727})]" "text/plain": [
"[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'director': 'Andrei Tarkovsky', 'rating': 9.899999618530273, 'genre': 'science fiction'}),\n",
" Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'year': 2006, 'director': 'Satoshi Kon', 'rating': 8.600000381469727})]"
]
}, },
"execution_count": 7, "execution_count": 7,
"metadata": {}, "metadata": {},
@ -253,19 +308,22 @@
"source": [ "source": [
"# This example only specifies a filter\n", "# This example only specifies a filter\n",
"retriever.get_relevant_documents(\"I want to watch a movie rated higher than 8.5\")" "retriever.get_relevant_documents(\"I want to watch a movie rated higher than 8.5\")"
], ]
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-08-24T02:59:32.370774Z",
"start_time": "2023-08-24T02:59:30.614252Z"
}
},
"id": "d486a64316153d52"
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 8, "execution_count": 8,
"id": "e05919cdead7bd4a",
"metadata": {
"ExecuteTime": {
"end_time": "2023-08-24T02:59:35.353439Z",
"start_time": "2023-08-24T02:59:33.278255Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [ "outputs": [
{ {
"name": "stdout", "name": "stdout",
@ -276,7 +334,9 @@
}, },
{ {
"data": { "data": {
"text/plain": "[Document(page_content='A bunch of normal-sized women are supremely wholesome and some men pine after them', metadata={'year': 2019, 'director': 'Greta Gerwig', 'rating': 8.300000190734863})]" "text/plain": [
"[Document(page_content='A bunch of normal-sized women are supremely wholesome and some men pine after them', metadata={'year': 2019, 'director': 'Greta Gerwig', 'rating': 8.300000190734863})]"
]
}, },
"execution_count": 8, "execution_count": 8,
"metadata": {}, "metadata": {},
@ -286,19 +346,22 @@
"source": [ "source": [
"# This example specifies a query and a filter\n", "# This example specifies a query and a filter\n",
"retriever.get_relevant_documents(\"Has Greta Gerwig directed any movies about women\")" "retriever.get_relevant_documents(\"Has Greta Gerwig directed any movies about women\")"
], ]
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-08-24T02:59:35.353439Z",
"start_time": "2023-08-24T02:59:33.278255Z"
}
},
"id": "e05919cdead7bd4a"
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 9, "execution_count": 9,
"id": "ac2c7012379e918e",
"metadata": {
"ExecuteTime": {
"end_time": "2023-08-24T02:59:38.913707Z",
"start_time": "2023-08-24T02:59:36.659271Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [ "outputs": [
{ {
"name": "stdout", "name": "stdout",
@ -309,7 +372,9 @@
}, },
{ {
"data": { "data": {
"text/plain": "[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'director': 'Andrei Tarkovsky', 'rating': 9.899999618530273, 'genre': 'science fiction'})]" "text/plain": [
"[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'director': 'Andrei Tarkovsky', 'rating': 9.899999618530273, 'genre': 'science fiction'})]"
]
}, },
"execution_count": 9, "execution_count": 9,
"metadata": {}, "metadata": {},
@ -319,33 +384,39 @@
"source": [ "source": [
"# This example specifies a composite filter\n", "# This example specifies a composite filter\n",
"retriever.get_relevant_documents(\"What's a highly rated (above 8.5) science fiction film?\")" "retriever.get_relevant_documents(\"What's a highly rated (above 8.5) science fiction film?\")"
], ]
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-08-24T02:59:38.913707Z",
"start_time": "2023-08-24T02:59:36.659271Z"
}
},
"id": "ac2c7012379e918e"
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"id": "af6aa93ae44af414",
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [ "source": [
"## Filter k\n", "## Filter k\n",
"\n", "\n",
"We can also use the self query retriever to specify `k`: the number of documents to fetch.\n", "We can also use the self query retriever to specify `k`: the number of documents to fetch.\n",
"\n", "\n",
"We can do this by passing `enable_limit=True` to the constructor." "We can do this by passing `enable_limit=True` to the constructor."
], ]
"metadata": {
"collapsed": false
},
"id": "af6aa93ae44af414"
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 10, "execution_count": 10,
"id": "a8c8f09bf5702767",
"metadata": {
"ExecuteTime": {
"end_time": "2023-08-24T02:59:41.594073Z",
"start_time": "2023-08-24T02:59:41.563323Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [], "outputs": [],
"source": [ "source": [
"retriever = SelfQueryRetriever.from_llm(\n", "retriever = SelfQueryRetriever.from_llm(\n",
@ -356,19 +427,22 @@
" enable_limit=True,\n", " enable_limit=True,\n",
" verbose=True,\n", " verbose=True,\n",
")" ")"
], ]
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-08-24T02:59:41.594073Z",
"start_time": "2023-08-24T02:59:41.563323Z"
}
},
"id": "a8c8f09bf5702767"
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 11, "execution_count": 11,
"id": "b1089a6043980b84",
"metadata": {
"ExecuteTime": {
"end_time": "2023-08-24T02:59:48.450506Z",
"start_time": "2023-08-24T02:59:46.252944Z"
},
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [ "outputs": [
{ {
"name": "stdout", "name": "stdout",
@ -379,7 +453,10 @@
}, },
{ {
"data": { "data": {
"text/plain": "[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'year': 1993, 'rating': 7.699999809265137, 'genre': 'action'}),\n Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'})]" "text/plain": [
"[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'year': 1993, 'rating': 7.699999809265137, 'genre': 'action'}),\n",
" Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'})]"
]
}, },
"execution_count": 11, "execution_count": 11,
"metadata": {}, "metadata": {},
@ -389,44 +466,39 @@
"source": [ "source": [
"# This example only specifies a relevant query\n", "# This example only specifies a relevant query\n",
"retriever.get_relevant_documents(\"what are two movies about dinosaurs\")" "retriever.get_relevant_documents(\"what are two movies about dinosaurs\")"
], ]
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-08-24T02:59:48.450506Z",
"start_time": "2023-08-24T02:59:46.252944Z"
}
},
"id": "b1089a6043980b84"
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"outputs": [], "id": "6d2d64e2ebb17d30",
"source": [],
"metadata": { "metadata": {
"collapsed": false "collapsed": false,
"jupyter": {
"outputs_hidden": false
}
}, },
"id": "6d2d64e2ebb17d30" "outputs": [],
"source": []
} }
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
"language_info": { "language_info": {
"codemirror_mode": { "codemirror_mode": {
"name": "ipython", "name": "ipython",
"version": 2 "version": 3
}, },
"file_extension": ".py", "file_extension": ".py",
"mimetype": "text/x-python", "mimetype": "text/x-python",
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython2", "pygments_lexer": "ipython3",
"version": "2.7.6" "version": "3.10.12"
} }
}, },
"nbformat": 4, "nbformat": 4,

@ -5,7 +5,13 @@
"id": "13afcae7", "id": "13afcae7",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Elasticsearch self-querying " "# Elasticsearch\n",
"\n",
"> [Elasticsearch](https://www.elastic.co/elasticsearch/) is a distributed, RESTful search and analytics engine.\n",
"> It provides a distributed, multi-tenant-capable full-text search engine with an HTTP web interface and schema-free\n",
"> JSON documents.\n",
"\n",
"In this notebook, we'll demo the `SelfQueryRetriever` with an `Elasticsearch` vector store."
] ]
}, },
{ {
@ -13,8 +19,9 @@
"id": "68e75fb9", "id": "68e75fb9",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Creating a Elasticsearch vector store\n", "## Creating an Elasticsearch vector store\n",
"First we'll want to create a Elasticsearch vector store and seed it with some data. We've created a small demo set of documents that contain summaries of movies.\n", "\n",
"First, we'll want to create an `Elasticsearch` vector store and seed it with some data. We've created a small demo set of documents that contain summaries of movies.\n",
"\n", "\n",
"**Note:** The self-query retriever requires you to have `lark` installed (`pip install lark`). We also need the `elasticsearch` package." "**Note:** The self-query retriever requires you to have `lark` installed (`pip install lark`). We also need the `elasticsearch` package."
] ]
@ -354,7 +361,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.10.3" "version": "3.10.12"
} }
}, },
"nbformat": 4, "nbformat": 4,

@ -4,9 +4,11 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Self-querying with Milvus\n", "# Milvus\n",
"\n", "\n",
"In the walkthrough we'll demo the `SelfQueryRetriever` with a `Milvus` vector store." ">[Milvus](https://milvus.io/docs/overview.md) is a database that stores, indexes, and manages massive embedding vectors generated by deep neural networks and other machine learning (ML) models.\n",
"\n",
"In the walkthrough, we'll demo the `SelfQueryRetriever` with a `Milvus` vector store."
] ]
}, },
{ {
@ -352,7 +354,7 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "Python 3", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "python3" "name": "python3"
}, },
@ -366,10 +368,9 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.11.4" "version": "3.10.12"
}, }
"orig_nbformat": 4
}, },
"nbformat": 4, "nbformat": 4,
"nbformat_minor": 2 "nbformat_minor": 4
} }

@ -5,12 +5,15 @@
"id": "13afcae7", "id": "13afcae7",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Self-querying with MyScale\n", "# MyScale\n",
"\n", "\n",
">[MyScale](https://docs.myscale.com/en/) is an integrated vector database. You can access your database in SQL and also from here, LangChain. MyScale can make a use of [various data types and functions for filters](https://blog.myscale.com/2023/06/06/why-integrated-database-solution-can-boost-your-llm-apps/#filter-on-anything-without-constraints). It will boost up your LLM app no matter if you are scaling up your data or expand your system to broader application.\n", ">[MyScale](https://docs.myscale.com/en/) is an integrated vector database. You can access your database in SQL and also from here, LangChain.\n",
">`MyScale` can make use of [various data types and functions for filters](https://blog.myscale.com/2023/06/06/why-integrated-database-solution-can-boost-your-llm-apps/#filter-on-anything-without-constraints). It will boost up your LLM app no matter if you are scaling up your data or expand your system to broader application.\n",
"\n", "\n",
"In the notebook we'll demo the `SelfQueryRetriever` wrapped around a MyScale vector store with some extra pieces we contributed to LangChain. In short, it can be condensed into 4 points:\n", "In the notebook, we'll demo the `SelfQueryRetriever` wrapped around a `MyScale` vector store with some extra pieces we contributed to LangChain. \n",
"1. Add `contain` comparator to match list of any if there is more than one element matched\n", "\n",
"In short, it can be condensed into 4 points:\n",
"1. Add `contain` comparator to match the list of any if there is more than one element matched\n",
"2. Add `timestamp` data type for datetime match (ISO-format, or YYYY-MM-DD)\n", "2. Add `timestamp` data type for datetime match (ISO-format, or YYYY-MM-DD)\n",
"3. Add `like` comparator for string pattern search\n", "3. Add `like` comparator for string pattern search\n",
"4. Add arbitrary function capability" "4. Add arbitrary function capability"
@ -221,9 +224,7 @@
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"id": "fc3f1e6e", "id": "fc3f1e6e",
"metadata": { "metadata": {},
"scrolled": false
},
"outputs": [], "outputs": [],
"source": [ "source": [
"# This example only specifies a filter\n", "# This example only specifies a filter\n",
@ -384,7 +385,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.11.3" "version": "3.10.12"
} }
}, },
"nbformat": 4, "nbformat": 4,

@ -5,9 +5,11 @@
"id": "13afcae7", "id": "13afcae7",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Self-querying with Pinecone\n", "# Pinecone\n",
"\n", "\n",
"In the walkthrough we'll demo the `SelfQueryRetriever` with a `Pinecone` vector store." ">[Pinecone](https://docs.pinecone.io/docs/overview) is a vector database with broad functionality.\n",
"\n",
"In the walkthrough, we'll demo the `SelfQueryRetriever` with a `Pinecone` vector store."
] ]
}, },
{ {
@ -395,7 +397,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.11.3" "version": "3.10.12"
} }
}, },
"nbformat": 4, "nbformat": 4,

@ -6,11 +6,11 @@
"id": "13afcae7", "id": "13afcae7",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Qdrant self-querying \n", "# Qdrant\n",
"\n", "\n",
">[Qdrant](https://qdrant.tech/documentation/) (read: quadrant) is a vector similarity search engine. It provides a production-ready service with a convenient API to store, search, and manage points - vectors with an additional payload. `Qdrant` is tailored to extended filtering support.\n", ">[Qdrant](https://qdrant.tech/documentation/) (read: quadrant) is a vector similarity search engine. It provides a production-ready service with a convenient API to store, search, and manage points - vectors with an additional payload. `Qdrant` is tailored to extended filtering support.\n",
"\n", "\n",
"In the notebook we'll demo the `SelfQueryRetriever` wrapped around a Qdrant vector store. " "In the notebook, we'll demo the `SelfQueryRetriever` wrapped around a `Qdrant` vector store. "
] ]
}, },
{ {
@ -419,7 +419,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.10.6" "version": "3.10.12"
} }
}, },
"nbformat": 4, "nbformat": 4,

@ -5,11 +5,11 @@
"id": "13afcae7", "id": "13afcae7",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Redis self-querying \n", "# Redis\n",
"\n", "\n",
">[Redis](https://redis.com) is an open-source key-value store that can be used as a cache, message broker, database, vector database and more.\n", ">[Redis](https://redis.com) is an open-source key-value store that can be used as a cache, message broker, database, vector database and more.\n",
"\n", "\n",
"In the notebook we'll demo the `SelfQueryRetriever` wrapped around a Redis vector store. " "In the notebook, we'll demo the `SelfQueryRetriever` wrapped around a `Redis` vector store. "
] ]
}, },
{ {
@ -450,9 +450,9 @@
], ],
"metadata": { "metadata": {
"kernelspec": { "kernelspec": {
"display_name": "poetry-venv", "display_name": "Python 3 (ipykernel)",
"language": "python", "language": "python",
"name": "poetry-venv" "name": "python3"
}, },
"language_info": { "language_info": {
"codemirror_mode": { "codemirror_mode": {
@ -464,7 +464,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.9.1" "version": "3.10.12"
} }
}, },
"nbformat": 4, "nbformat": 4,

@ -5,19 +5,22 @@
"id": "13afcae7", "id": "13afcae7",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Supabase Vector self-querying \n", "# Supabase\n",
"\n", "\n",
">[Supabase](https://supabase.com/docs) is an open source `Firebase` alternative. \n", ">[Supabase](https://supabase.com/docs) is an open-source `Firebase` alternative. \n",
"> `Supabase` is built on top of `PostgreSQL`, which offers strong `SQL` \n", "> `Supabase` is built on top of `PostgreSQL`, which offers strong `SQL` \n",
"> querying capabilities and enables a simple interface with already-existing tools and frameworks.\n", "> querying capabilities and enables a simple interface with already-existing tools and frameworks.\n",
"\n", "\n",
">[PostgreSQL](https://en.wikipedia.org/wiki/PostgreSQL) also known as `Postgres`,\n", ">[PostgreSQL](https://en.wikipedia.org/wiki/PostgreSQL) also known as `Postgres`,\n",
"> is a free and open-source relational database management system (RDBMS) \n", "> is a free and open-source relational database management system (RDBMS) \n",
"> emphasizing extensibility and `SQL` compliance.\n", "> emphasizing extensibility and `SQL` compliance.\n",
">\n",
">[Supabase](https://supabase.com/docs/guides/ai) provides an open-source toolkit for developing AI applications\n",
">using Postgres and pgvector. Use the Supabase client libraries to store, index, and query your vector embeddings at scale.\n",
"\n", "\n",
"In the notebook we'll demo the `SelfQueryRetriever` wrapped around a Supabase vector store.\n", "In the notebook, we'll demo the `SelfQueryRetriever` wrapped around a `Supabase` vector store.\n",
"\n", "\n",
"Specifically we will:\n", "Specifically, we will:\n",
"1. Create a Supabase database\n", "1. Create a Supabase database\n",
"2. Enable the `pgvector` extension\n", "2. Enable the `pgvector` extension\n",
"3. Create a `documents` table and `match_documents` function that will be used by `SupabaseVectorStore`\n", "3. Create a `documents` table and `match_documents` function that will be used by `SupabaseVectorStore`\n",
@ -569,7 +572,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.9.1" "version": "3.10.12"
} }
}, },
"nbformat": 4, "nbformat": 4,

@ -5,11 +5,12 @@
"id": "13afcae7", "id": "13afcae7",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Vectara self-querying \n", "# Vectara\n",
"\n", "\n",
">[Vectara](https://docs.vectara.com/docs/) is a GenAI platform for developers. It provides a simple API to build Grounded Generation (aka Retrieval-augmented-generation) applications.\n", ">[Vectara](https://docs.vectara.com/docs/) is a GenAI platform for developers. It provides a simple API to build Grounded Generation\n",
">(aka Retrieval-augmented-generation or RAG) applications.\n",
"\n", "\n",
"In the notebook we'll demo the `SelfQueryRetriever` wrapped around a Vectara vector store. " "In the notebook, we'll demo the `SelfQueryRetriever` wrapped around a Vectara vector store. "
] ]
}, },
{ {
@ -432,7 +433,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.10.9" "version": "3.10.12"
} }
}, },
"nbformat": 4, "nbformat": 4,

@ -5,7 +5,12 @@
"id": "13afcae7", "id": "13afcae7",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Weaviate self-querying " "# Weaviate\n",
"\n",
">[Weaviate](https://weaviate.io/) is an open-source vector database. It allows you to store data objects and vector embeddings from\n",
">your favorite ML models, and scale seamlessly into billions of data objects.\n",
"\n",
"In the notebook, we'll demo the `SelfQueryRetriever` wrapped around a `Weaviate` vector store. "
] ]
}, },
{ {
@ -293,7 +298,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.10.6" "version": "3.10.12"
} }
}, },
"nbformat": 4, "nbformat": 4,