diff --git a/docs/docs/integrations/vectorstores/mongodb_atlas.ipynb b/docs/docs/integrations/vectorstores/mongodb_atlas.ipynb index 004da71dac2..dd29cf56dc1 100644 --- a/docs/docs/integrations/vectorstores/mongodb_atlas.ipynb +++ b/docs/docs/integrations/vectorstores/mongodb_atlas.ipynb @@ -9,7 +9,7 @@ "\n", "This notebook covers how to MongoDB Atlas vector search in LangChain, using the `langchain-mongodb` package.\n", "\n", - ">[MongoDB Atlas](https://www.mongodb.com/docs/atlas/) is a fully-managed cloud database available in AWS, Azure, and GCP. It supports native Vector Search and full text search (BM25) on your MongoDB document data.\n", + ">[MongoDB Atlas](https://www.mongodb.com/docs/atlas/) is a fully-managed cloud database available in AWS, Azure, and GCP. It supports native Vector Search, full text search (BM25), and hybrid search on your MongoDB document data.\n", "\n", ">[MongoDB Atlas Vector Search](https://www.mongodb.com/products/platform/atlas-vector-search) allows to store your embeddings in MongoDB documents, create a vector search index, and perform KNN search with an approximate nearest neighbor algorithm (`Hierarchical Navigable Small Worlds`). It uses the [$vectorSearch MQL Stage](https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/). " ] @@ -23,7 +23,7 @@ "\n", ">*An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later (including RCs).\n", "\n", - "To use MongoDB Atlas, you must first deploy a cluster. We have a Forever-Free tier of clusters available. To get started head over to Atlas here: [quick start](https://www.mongodb.com/docs/atlas/getting-started/).\n", + "To use MongoDB Atlas, you must first deploy a cluster. We have a Forever-Free tier of cluster on a cloud of your choice available. To get started head over to Atlas here: [quick start](https://www.mongodb.com/docs/atlas/getting-started/).\n", "\n", "You'll need to install `langchain-mongodb` and `pymongo` to use this integration." 
] @@ -104,7 +104,7 @@ "# | echo: false\n", "from langchain_openai import OpenAIEmbeddings\n", "\n", - "embeddings = OpenAIEmbeddings(model=\"text-embedding-3-small\")" + "embeddings = OpenAIEmbeddings()" ] }, { @@ -114,7 +114,7 @@ "metadata": {}, "outputs": [], "source": [ - "from langchain_mongodb.vectorstores import MongoDBAtlasVectorSearch\n", + "from langchain_mongodb import MongoDBAtlasVectorSearch\n", "from pymongo import MongoClient\n", "\n", "# initialize MongoDB python client\n", @@ -131,7 +131,31 @@ " embedding=embeddings,\n", " index_name=ATLAS_VECTOR_SEARCH_INDEX_NAME,\n", " relevance_score_fn=\"cosine\",\n", - ")" + ")\n", + "\n", + "# Create vector search index on the collection\n", + "# Since we are using the default OpenAI embedding model (ada-v2) we need to specify the dimensions as 1536\n", + "vector_store.create_vector_search_index(dimensions=1536)" + ] + }, + { + "cell_type": "markdown", + "id": "aec0bb58", + "metadata": {}, + "source": [ + "[OPTIONAL] Alternative to the `vector_store.create_vector_search_index` command above, you can also create the vector search index using the Atlas UI with the following index definition:\n", + "```json\n", + "{\n", + " \"fields\":[\n", + " {\n", + " \"type\": \"vector\",\n", + " \"path\": \"embedding\",\n", + " \"numDimensions\": 1536,\n", + " \"similarity\": \"cosine\"\n", + " }\n", + " ]\n", + "}\n", + "```" ] }, { @@ -367,6 +391,21 @@ "id": "dacac7b8", "metadata": {}, "source": [ + "\n", + "\n", + "To enable pre-filtering you need to update the index definition to include a filter field. 
In this example, we will use the `source` field as the filter field.\n", + "\n", + "This can be done programmatically using the `MongoDBAtlasVectorSearch.create_vector_search_index` method.\n", + "\n", + "```python\n", + "vector_store.create_vector_search_index(\n", + " dimensions=1536,\n", + " filters=[{\"type\":\"filter\", \"path\":\"source\"}],\n", + " update=True\n", + ")\n", + "```\n", + "\n", + "Alternatively, you can also update the index using the Atlas UI with the following index definition:\n", "```json\n", "{\n", " \"fields\":[\n", @@ -384,20 +423,10 @@ "}\n", "```\n", "\n", - "You can also update the index programmatically using the `MongoDBAtlasVectorSearch.create_index` method.\n", - "\n", - "```python\n", - "vectorstore.create_index(\n", - " dimensions=1536,\n", - " filters=[{\"type\":\"filter\", \"path\":\"source\"}],\n", - " update=True\n", - ")\n", - "```\n", - "\n", "And then you can run a query with filter as follows:\n", "\n", "```python\n", - "results = vector_store.similarity_search(query=\"foo\",k=1,pre_filter={\"source\": {\"$eq\": \"https://example.com\"}})\n", + "results = vector_store.similarity_search(query=\"foo\", k=1, pre_filter={\"source\": {\"$eq\": \"https://example.com\"}})\n", "for doc in results:\n", " print(f\"* {doc.page_content} [{doc.metadata}]\")\n", "```" @@ -410,7 +439,7 @@ "source": [ "#### Other search methods\n", "\n", - "There are a variety of other search methods that are not covered in this notebook, such as MMR search or searching by vector. For a full list of the search abilities available for `AstraDBVectorStore` check out the [API reference](https://python.langchain.com/api_reference/astradb/vectorstores/langchain_astradb.vectorstores.AstraDBVectorStore.html)." + "There are a variety of other search methods that are not covered in this notebook, such as MMR search or searching by vector. 
For a full list of the search abilities available for `MongoDBAtlasVectorSearch` check out the [API reference](https://api.python.langchain.com/en/latest/vectorstores/langchain_mongodb.vectorstores.MongoDBAtlasVectorSearch.html)." ] }, { @@ -470,7 +499,7 @@ "metadata": {}, "source": [ "# Other Notes\n", - ">* More documentation can be found at [LangChain-MongoDB](https://www.mongodb.com/docs/atlas/atlas-vector-search/ai-integrations/langchain/) site\n", + ">* More documentation can be found at [MongoDB's LangChain Docs](https://www.mongodb.com/docs/atlas/atlas-vector-search/ai-integrations/langchain/) site\n", ">* This feature is Generally Available and ready for production deployments.\n", ">* The langchain version 0.0.305 ([release notes](https://github.com/langchain-ai/langchain/releases/tag/v0.0.305)) introduces the support for $vectorSearch MQL stage, which is available with MongoDB Atlas 6.0.11 and 7.0.2. Users utilizing earlier versions of MongoDB Atlas need to pin their LangChain version to <=0.0.304\n", "> " @@ -509,3 +538,4 @@ "nbformat": 4, "nbformat_minor": 5 } +