mirror of
https://github.com/hwchase17/langchain.git
synced 2025-09-27 14:26:48 +00:00
community[minor]: Adds a vector store for Azure Cosmos DB for NoSQL (#21676)
This PR adds support for the Azure Cosmos DB for NoSQL vector store.

Summary:
Description: added vector store integration for Azure Cosmos DB for NoSQL
Dependencies: azure-cosmos
Tag maintainer: @hwchase17, @baskaryan, @efriis, @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
@@ -60,7 +60,7 @@
 " * document addition by id (`add_documents` method with `ids` argument)\n",
 " * delete by id (`delete` method with `ids` argument)\n",
 "\n",
-"Compatible Vectorstores: `Aerospike`, `AnalyticDB`, `AstraDB`, `AwaDB`, `Bagel`, `Cassandra`, `Chroma`, `CouchbaseVectorStore`, `DashVector`, `DatabricksVectorSearch`, `DeepLake`, `Dingo`, `ElasticVectorSearch`, `ElasticsearchStore`, `FAISS`, `HanaDB`, `Milvus`, `MyScale`, `OpenSearchVectorSearch`, `PGVector`, `Pinecone`, `Qdrant`, `Redis`, `Rockset`, `ScaNN`, `SupabaseVectorStore`, `SurrealDBStore`, `TimescaleVector`, `Vald`, `VDMS`, `Vearch`, `VespaStore`, `Weaviate`, `Yellowbrick`, `ZepVectorStore`, `TencentVectorDB`, `OpenSearchVectorSearch`.\n",
+"Compatible Vectorstores: `Aerospike`, `AnalyticDB`, `AstraDB`, `AwaDB`, `AzureCosmosDBNoSqlVectorSearch`, `AzureCosmosDBVectorSearch`, `Bagel`, `Cassandra`, `Chroma`, `CouchbaseVectorStore`, `DashVector`, `DatabricksVectorSearch`, `DeepLake`, `Dingo`, `ElasticVectorSearch`, `ElasticsearchStore`, `FAISS`, `HanaDB`, `Milvus`, `MyScale`, `OpenSearchVectorSearch`, `PGVector`, `Pinecone`, `Qdrant`, `Redis`, `Rockset`, `ScaNN`, `SupabaseVectorStore`, `SurrealDBStore`, `TimescaleVector`, `Vald`, `VDMS`, `Vearch`, `VespaStore`, `Weaviate`, `Yellowbrick`, `ZepVectorStore`, `TencentVectorDB`, `OpenSearchVectorSearch`.\n",
 " \n",
 "## Caution\n",
 "\n",
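The two requirements listed in the hunk above (adding documents under caller-supplied ids and deleting by id) can be pictured with a toy in-memory store. This is only a sketch: `ToyStore` and its `docs` attribute are hypothetical names, not LangChain classes, and the real vector stores take `Document` objects rather than plain strings.

```python
from typing import Dict, List, Optional


class ToyStore:
    """Hypothetical in-memory store; not a LangChain class."""

    def __init__(self) -> None:
        self.docs: Dict[str, str] = {}

    def add_documents(self, texts: List[str], ids: List[str]) -> List[str]:
        # Writing under a caller-supplied id means that re-adding the same id
        # overwrites the old entry instead of creating a duplicate.
        if len(texts) != len(ids):
            raise ValueError("texts and ids must have the same length")
        for doc_id, text in zip(ids, texts):
            self.docs[doc_id] = text
        return ids

    def delete(self, ids: Optional[List[str]] = None) -> bool:
        # Unknown ids are ignored so that a cleanup pass can be retried.
        for doc_id in ids or []:
            self.docs.pop(doc_id, None)
        return True


store = ToyStore()
store.add_documents(["first", "second"], ids=["a", "b"])
store.add_documents(["first, revised"], ids=["a"])  # same id, new content
store.delete(ids=["b"])
```

A store that supports both operations lets an indexing pass update changed documents in place and remove stale ones without rebuilding the whole collection.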
@@ -225,7 +225,7 @@ from langchain_community.document_loaders.onenote import OneNoteLoader
 
 ## Vector stores
 
-### Azure Cosmos DB
+### Azure Cosmos DB MongoDB vCore
 
 >[Azure Cosmos DB for MongoDB vCore](https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/) makes it easy to create a database with full native MongoDB support.
 > You can apply your MongoDB experience and continue to use your favorite MongoDB drivers, SDKs, and tools by pointing your application to the API for MongoDB vCore account's connection string.
@@ -255,6 +255,38 @@ See a [usage example](/docs/integrations/vectorstores/azure_cosmos_db).
 from langchain_community.vectorstores import AzureCosmosDBVectorSearch
 ```
 
+### Azure Cosmos DB NoSQL
+
+>[Azure Cosmos DB for NoSQL](https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/vector-search) now offers vector indexing and search in preview.
+This feature is designed to handle high-dimensional vectors, enabling efficient and accurate vector search at any scale. You can now store vectors
+directly in the documents alongside your data. This means that each document in your database can contain not only traditional schema-free data,
+but also high-dimensional vectors as other properties of the documents. This colocation of data and vectors allows for efficient indexing and searching,
+as the vectors are stored in the same logical unit as the data they represent. This simplifies data management and AI application architectures, and
+improves the efficiency of vector-based operations.
+
+#### Installation and Setup
+
+See [detailed configuration instructions](/docs/integrations/vectorstores/azure_cosmos_db_no_sql).
+
+We need to install the `azure-cosmos` Python package.
+
+```bash
+pip install azure-cosmos
+```
+
+#### Deploy Azure Cosmos DB on Microsoft Azure
+
+Azure Cosmos DB offers a solution for modern apps and intelligent workloads, with responsive, dynamic, and elastic autoscale. It is available
+in every Azure region and can automatically replicate data closer to users. It offers SLA-backed low latency and high availability.
+
+[Sign Up](https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/quickstart-python?pivots=devcontainer-codespace) for free to get started today.
+
+See a [usage example](/docs/integrations/vectorstores/azure_cosmos_db_no_sql).
+
+```python
+from langchain_community.vectorstores import AzureCosmosDBNoSqlVectorSearch
+```
+
 ## Retrievers
 ### Azure AI Search
 
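The added section says that vectors live in the same document as the data they describe. A minimal sketch of what that can look like follows, using plain dictionaries. The policy keys follow the Azure Cosmos DB vector search documentation as I understand it. The helper names (`build_vector_policies`, `make_item`), the `/embedding` path, the `text` and `metadata` field names, and the 1536 dimension are assumptions made for illustration, not values taken from this commit.

```python
from typing import Any, Dict, List


def build_vector_policies(
    path: str = "/embedding", dimensions: int = 1536
) -> Dict[str, Dict[str, Any]]:
    """Build container policies for a single vector path (illustrative helper)."""
    # The indexing policy declares a vector index on the embedding path.
    indexing_policy = {
        "indexingMode": "consistent",
        "includedPaths": [{"path": "/*"}],
        "excludedPaths": [{"path": '/"_etag"/?'}],
        "vectorIndexes": [{"path": path, "type": "quantizedFlat"}],
    }
    # The embedding policy describes the vectors stored at that path.
    vector_embedding_policy = {
        "vectorEmbeddings": [
            {
                "path": path,
                "dataType": "float32",
                "distanceFunction": "cosine",
                "dimensions": dimensions,
            }
        ]
    }
    return {
        "indexing_policy": indexing_policy,
        "vector_embedding_policy": vector_embedding_policy,
    }


def make_item(
    item_id: str, text: str, embedding: List[float], metadata: Dict[str, Any]
) -> Dict[str, Any]:
    """Colocate the text, its metadata, and its vector in one document."""
    return {"id": item_id, "text": text, "embedding": embedding, "metadata": metadata}


policies = build_vector_policies(dimensions=3)
item = make_item("doc-1", "hello world", [0.1, 0.2, 0.3], {"source": "example"})
```

The vector path in both policies has to match the property that holds the vector in each item (here `embedding`), and the declared dimension has to match the embedding model's output size.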
@@ -3,11 +3,9 @@
 {
 "cell_type": "markdown",
 "id": "245c0aa70db77606",
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
 "source": [
-"# Azure Cosmos DB\n",
+"# Azure Cosmos DB Mongo vCore\n",
 "\n",
 "This notebook shows you how to leverage this integrated [vector database](https://learn.microsoft.com/en-us/azure/cosmos-db/vector-database) to store documents in collections, create indices, and perform vector search queries using approximate nearest neighbor search with metrics such as COS (cosine distance), L2 (Euclidean distance), and IP (inner product) to locate documents close to the query vectors. \n",
 " \n",
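The three metrics named in the notebook introduction above (COS, L2, IP) can be written out directly. The sketch below computes them exactly on small lists, purely to show what each one measures. The function names are my own, and a vector index would approximate the nearest neighbors rather than scan every vector like this.

```python
import math
from typing import Sequence


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """IP: larger values mean more similar vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L2: straight-line (Euclidean) distance; smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """COS: one minus the cosine of the angle; ignores vector length."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


query = [1.0, 0.0]
candidates = {"same_direction": [2.0, 0.0], "orthogonal": [0.0, 1.0]}
# Rank candidates by cosine distance to the query, closest first.
ranked = sorted(candidates, key=lambda name: cosine_distance(query, candidates[name]))
```

Cosine distance treats `[1, 0]` and `[2, 0]` as identical in direction, while L2 distance separates them by length, which is one reason the choice of metric should match how the embeddings were trained.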
@@ -22,9 +20,7 @@
 {
 "cell_type": "markdown",
 "id": "8c493e205ce1dda5",
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
 "source": []
 },
 {
@@ -35,8 +31,7 @@
 "ExecuteTime": {
 "end_time": "2024-02-08T18:25:05.278480Z",
 "start_time": "2024-02-08T18:24:51.560677Z"
-},
-"collapsed": false
+}
 },
 "outputs": [
 {
@@ -62,8 +57,7 @@
 "ExecuteTime": {
 "end_time": "2024-02-08T18:25:56.926147Z",
 "start_time": "2024-02-08T18:25:56.900087Z"
-},
-"collapsed": false
+}
 },
 "outputs": [],
 "source": [
@@ -78,9 +72,7 @@
 {
 "cell_type": "markdown",
 "id": "f2e66b097c6ce2e3",
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
 "source": [
 "We want to use `OpenAIEmbeddings` so we need to set up our Azure OpenAI API Key alongside other environment variables. "
 ]
@@ -93,8 +85,7 @@
 "ExecuteTime": {
 "end_time": "2024-02-08T18:26:06.558294Z",
 "start_time": "2024-02-08T18:26:06.550008Z"
-},
-"collapsed": false
+}
 },
 "outputs": [],
 "source": [
@@ -114,9 +105,7 @@
 {
 "cell_type": "markdown",
 "id": "ebaa28c6e2b35063",
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
 "source": [
 "Now, we need to load the documents into the collection, create the index and then run our queries against the index to retrieve matches.\n",
 "\n",
@@ -131,8 +120,7 @@
 "ExecuteTime": {
 "end_time": "2024-02-08T18:27:00.782280Z",
 "start_time": "2024-02-08T18:26:47.339151Z"
-},
-"collapsed": false
+}
 },
 "outputs": [],
 "source": [
@@ -172,8 +160,7 @@
 "ExecuteTime": {
 "end_time": "2024-02-08T18:31:13.486173Z",
 "start_time": "2024-02-08T18:30:54.175890Z"
-},
-"collapsed": false
+}
 },
 "outputs": [
 {
@@ -236,8 +223,7 @@
 "ExecuteTime": {
 "end_time": "2024-02-08T18:31:47.468902Z",
 "start_time": "2024-02-08T18:31:46.053602Z"
-},
-"collapsed": false
+}
 },
 "outputs": [],
 "source": [
@@ -254,8 +240,7 @@
 "ExecuteTime": {
 "end_time": "2024-02-08T18:31:50.982598Z",
 "start_time": "2024-02-08T18:31:50.977605Z"
-},
-"collapsed": false
+}
 },
 "outputs": [
 {
@@ -279,9 +264,7 @@
 {
 "cell_type": "markdown",
 "id": "37e4df8c7d7db851",
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
 "source": [
 "Once the documents have been loaded and the index has been created, you can now instantiate the vector store directly and run queries against the index"
 ]
@@ -294,8 +277,7 @@
 "ExecuteTime": {
 "end_time": "2024-02-08T18:32:14.299599Z",
 "start_time": "2024-02-08T18:32:12.923464Z"
-},
-"collapsed": false
+}
 },
 "outputs": [
 {
@@ -332,8 +314,7 @@
 "ExecuteTime": {
 "end_time": "2024-02-08T18:32:24.021434Z",
 "start_time": "2024-02-08T18:32:22.867658Z"
-},
-"collapsed": false
+}
 },
 "outputs": [
 {
@@ -366,30 +347,28 @@
 "cell_type": "code",
 "execution_count": null,
 "id": "b63c73c7e905001c",
-"metadata": {
-"collapsed": false
-},
+"metadata": {},
 "outputs": [],
 "source": []
 }
 ],
 "metadata": {
 "kernelspec": {
-"display_name": "Python 3",
+"display_name": "Python 3 (ipykernel)",
 "language": "python",
 "name": "python3"
 },
 "language_info": {
 "codemirror_mode": {
 "name": "ipython",
-"version": 2
+"version": 3
 },
 "file_extension": ".py",
 "mimetype": "text/x-python",
 "name": "python",
 "nbconvert_exporter": "python",
-"pygments_lexer": "ipython2",
-"version": "2.7.6"
+"pygments_lexer": "ipython3",
+"version": "3.11.4"
 }
 },
 "nbformat": 4,
336
docs/docs/integrations/vectorstores/azure_cosmos_db_no_sql.ipynb
Normal file
File diff suppressed because one or more lines are too long