mirror of https://github.com/hwchase17/langchain.git, synced 2025-06-24 07:35:18 +00:00
qdrant: Documentation for the new QdrantVectorStore class (#24166)
## Description
Follow up on #24165. Adds a page to document the latest usage of the new `QdrantVectorStore` class.
parent 1244e66bd4
commit d93ae756e6
@ -9,11 +9,13 @@
|
||||
"source": [
|
||||
"# Hybrid Search\n",
|
||||
"\n",
|
||||
"The standard search in LangChain is done by vector similarity. However, a number of vectorstores implementations (Astra DB, ElasticSearch, Neo4J, AzureSearch, ...) also support more advanced search combining vector similarity search and other search techniques (full-text, BM25, and so on). This is generally referred to as \"Hybrid\" search.\n",
|
||||
"The standard search in LangChain is done by vector similarity. However, a number of vectorstores implementations (Astra DB, ElasticSearch, Neo4J, AzureSearch, Qdrant...) also support more advanced search combining vector similarity search and other search techniques (full-text, BM25, and so on). This is generally referred to as \"Hybrid\" search.\n",
|
||||
"\n",
|
||||
"**Step 1: Make sure the vectorstore you are using supports hybrid search**\n",
|
||||
"\n",
|
||||
"At the moment, there is no unified way to perform hybrid search in LangChain. Each vectorstore may have their own way to do it. This is generally exposed as a keyword argument that is passed in during `similarity_search`. By reading the documentation or source code, figure out whether the vectorstore you are using supports hybrid search, and, if so, how to use it.\n",
|
||||
"At the moment, there is no unified way to perform hybrid search in LangChain. Each vectorstore may have their own way to do it. This is generally exposed as a keyword argument that is passed in during `similarity_search`.\n",
|
||||
"\n",
|
||||
"By reading the documentation or source code, figure out whether the vectorstore you are using supports hybrid search, and, if so, how to use it.\n",
|
||||
"\n",
|
||||
"**Step 2: Add that parameter as a configurable field for the chain**\n",
|
||||
"\n",
|
||||
|
@ -21,7 +21,7 @@ whether for semantic search or example selection.
|
||||
|
||||
To import this vectorstore:
|
||||
```python
|
||||
from langchain_qdrant import Qdrant
|
||||
from langchain_qdrant import QdrantVectorStore
|
||||
```
|
||||
|
||||
For a more detailed walkthrough of the Qdrant wrapper, see [this notebook](/docs/integrations/vectorstores/qdrant).
|
||||
|
@ -8,14 +8,15 @@
|
||||
"source": [
|
||||
"# Qdrant\n",
|
||||
"\n",
|
||||
">[Qdrant](https://qdrant.tech/documentation/) (read: quadrant ) is a vector similarity search engine. It provides a production-ready service with a convenient API to store, search, and manage points - vectors with an additional payload. `Qdrant` is tailored to extended filtering support. It makes it useful for all sorts of neural network or semantic-based matching, faceted search, and other applications.\n",
|
||||
[Qdrant](https://qdrant.tech/documentation/)">
">[Qdrant](https://qdrant.tech/documentation/) (read: quadrant) is a vector similarity search engine. It provides a production-ready service with a convenient API to store, search, and manage vectors with additional payload and extended filtering support. This makes it useful for all sorts of neural network or semantic-based matching, faceted search, and other applications.\n",
|
||||
"\n",
|
||||
"This documentation demonstrates how to use Qdrant with Langchain for dense/sparse and hybrid retrieval.\n",
|
||||
"\n",
|
||||
"This notebook shows how to use functionality related to the `Qdrant` vector database. \n",
|
||||
"> This page documents the `QdrantVectorStore` class that supports multiple retrieval modes via Qdrant's new [Query API](https://qdrant.tech/blog/qdrant-1.10.x/). It requires you to run Qdrant v1.10.0 or above.\n",
|
||||
"\n",
|
||||
"There are various modes of how to run `Qdrant`, and depending on the chosen one, there will be some subtle differences. The options include:\n",
|
||||
"- Local mode, no server required\n",
|
||||
"- On-premise server deployment\n",
|
||||
"- Docker deployments\n",
|
||||
"- Qdrant Cloud\n",
|
||||
"\n",
|
||||
"See the [installation instructions](https://qdrant.tech/documentation/install/)."
|
||||
@ -30,7 +31,7 @@
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%pip install --upgrade --quiet langchain-qdrant langchain-openai langchain langchain-community"
|
||||
"%pip install langchain-qdrant langchain-openai langchain"
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -39,30 +40,7 @@
|
||||
"id": "7b2f111b-357a-4f42-9730-ef0603bdc1b5",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "082e7e8b-ac52-430c-98d6-8f0924457642",
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"OpenAI API Key: ········\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import getpass\n",
|
||||
"import os\n",
|
||||
"\n",
|
||||
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")"
|
||||
"We will use `OpenAIEmbeddings` for demonstration."
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -80,7 +58,7 @@
|
||||
"source": [
|
||||
"from langchain_community.document_loaders import TextLoader\n",
|
||||
"from langchain_openai import OpenAIEmbeddings\n",
|
||||
"from langchain_qdrant import Qdrant\n",
|
||||
"from langchain_qdrant import QdrantVectorStore\n",
|
||||
"from langchain_text_splitters import CharacterTextSplitter"
|
||||
]
|
||||
},
|
||||
@ -97,7 +75,7 @@
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"loader = TextLoader(\"../../how_to/state_of_the_union.txt\")\n",
|
||||
"loader = TextLoader(\"some-file.txt\")\n",
|
||||
"documents = loader.load()\n",
|
||||
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
|
||||
"docs = text_splitter.split_documents(documents)\n",
|
||||
@ -115,7 +93,7 @@
|
||||
"\n",
|
||||
"### Local mode\n",
|
||||
"\n",
|
||||
"Python client allows you to run the same code in local mode without running the Qdrant server. That's great for testing things out and debugging or if you plan to store just a small amount of vectors. The embeddings might be fully kepy in memory or persisted on disk.\n",
|
||||
"Python client allows you to run the same code in local mode without running the Qdrant server. That's great for testing things out and debugging or storing just a small amount of vectors. The embeddings might be fully kept in memory or persisted on disk.\n",
|
||||
"\n",
|
||||
"#### In-memory\n",
|
||||
"\n",
|
||||
@ -135,7 +113,7 @@
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"qdrant = Qdrant.from_documents(\n",
|
||||
"qdrant = QdrantVectorStore.from_documents(\n",
|
||||
" docs,\n",
|
||||
" embeddings,\n",
|
||||
" location=\":memory:\", # Local mode with in-memory storage only\n",
|
||||
@ -151,7 +129,7 @@
|
||||
"source": [
|
||||
"#### On-disk storage\n",
|
||||
"\n",
|
||||
"Local mode, without using the Qdrant server, may also store your vectors on disk so they're persisted between runs."
|
||||
"Local mode, without using the Qdrant server, may also store your vectors on disk so they persist between runs."
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -167,7 +145,7 @@
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"qdrant = Qdrant.from_documents(\n",
|
||||
"qdrant = QdrantVectorStore.from_documents(\n",
|
||||
" docs,\n",
|
||||
" embeddings,\n",
|
||||
" path=\"/tmp/local_qdrant\",\n",
|
||||
@ -199,7 +177,7 @@
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"url = \"<---qdrant url here --->\"\n",
|
||||
"qdrant = Qdrant.from_documents(\n",
|
||||
"qdrant = QdrantVectorStore.from_documents(\n",
|
||||
" docs,\n",
|
||||
" embeddings,\n",
|
||||
" url=url,\n",
|
||||
@ -233,7 +211,7 @@
|
||||
"source": [
|
||||
"url = \"<---qdrant cloud cluster url here --->\"\n",
|
||||
"api_key = \"<---api key here--->\"\n",
|
||||
"qdrant = Qdrant.from_documents(\n",
|
||||
"qdrant = QdrantVectorStore.from_documents(\n",
|
||||
" docs,\n",
|
||||
" embeddings,\n",
|
||||
" url=url,\n",
|
||||
@ -266,7 +244,7 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"qdrant = Qdrant.from_existing_collection(\n",
|
||||
"qdrant = QdrantVectorStore.from_existing_collection(\n",
|
||||
" embeddings=embeddings,\n",
|
||||
" collection_name=\"my_documents\",\n",
|
||||
" url=\"http://localhost:6333\",\n",
|
||||
@ -297,7 +275,7 @@
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"url = \"<---qdrant url here --->\"\n",
|
||||
"qdrant = Qdrant.from_documents(\n",
|
||||
"qdrant = QdrantVectorStore.from_documents(\n",
|
||||
" docs,\n",
|
||||
" embeddings,\n",
|
||||
" url=url,\n",
|
||||
@ -320,12 +298,31 @@
|
||||
"source": [
|
||||
"## Similarity search\n",
|
||||
"\n",
|
||||
"The simplest scenario for using Qdrant vector store is to perform a similarity search. Under the hood, our query will be encoded with the `embedding_function` and used to find similar documents in Qdrant collection."
|
||||
"The simplest scenario for using Qdrant vector store is to perform a similarity search. Under the hood, our query will be encoded into vector embeddings and used to find similar documents in Qdrant collection.\n",
|
||||
"\n",
|
||||
"`QdrantVectorStore` supports 3 modes for similarity searches. They can be configured using the `retrieval_mode` parameter when setting up the class.\n",
|
||||
"\n",
|
||||
"- Dense Vector Search(Default)\n",
|
||||
"- Sparse Vector Search\n",
|
||||
"- Hybrid Search"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b3a78d46",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Dense Vector Search\n",
|
||||
"\n",
|
||||
"To search with only dense vectors,\n",
|
||||
"\n",
|
||||
"- The `retrieval_mode` parameter should be set to `RetrievalMode.DENSE`(default).\n",
|
||||
"- A [dense embeddings provider](https://python.langchain.com/v0.2/docs/integrations/text_embedding/) value should be a provided for the `embedding` parameter."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"execution_count": null,
|
||||
"id": "a8c513ab",
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
@ -336,38 +333,108 @@
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_qdrant import RetrievalMode\n",
|
||||
"\n",
|
||||
"qdrant = QdrantVectorStore.from_documents(\n",
|
||||
" docs,\n",
|
||||
" embedding=embeddings,\n",
|
||||
" location=\":memory:\",\n",
|
||||
" collection_name=\"my_documents\",\n",
|
||||
" retrieval_mode=RetrievalMode.DENSE,\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
|
||||
"found_docs = qdrant.similarity_search(query)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"id": "fc516993",
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2023-04-04T10:51:25.220984Z",
|
||||
"start_time": "2023-04-04T10:51:25.213943Z"
|
||||
},
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
|
||||
"\n",
|
||||
"Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
|
||||
"\n",
|
||||
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
|
||||
"\n",
|
||||
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"cell_type": "markdown",
|
||||
"id": "dbd93d85",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"print(found_docs[0].page_content)"
|
||||
"### Sparse Vector Search\n",
|
||||
"\n",
|
||||
"To search with only sparse vectors,\n",
|
||||
"\n",
|
||||
"- The `retrieval_mode` parameter should be set to `RetrievalMode.SPARSE`.\n",
|
||||
"- An implementation of the [`SparseEmbeddings`](https://github.com/langchain-ai/langchain/blob/master/libs/partners/qdrant/langchain_qdrant/sparse_embeddings.py) interface using any sparse embeddings provider has to be provided as value to the `sparse_embedding` parameter.\n",
|
||||
"\n",
|
||||
"The `langchain-qdrant` package provides a [FastEmbed](https://github.com/qdrant/fastembed) based implementation out of the box.\n",
|
||||
"\n",
|
||||
"To use it, install the FastEmbed package."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "ceb493a3",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%pip install fastembed"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "052e3412",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_qdrant import FastEmbedSparse, RetrievalMode\n",
|
||||
"\n",
|
||||
"sparse_embeddings = FastEmbedSparse(model_name=\"Qdrant/BM25\")\n",
|
||||
"\n",
|
||||
"qdrant = QdrantVectorStore.from_documents(\n",
|
||||
" docs,\n",
|
||||
" sparse_embedding=sparse_embeddings,\n",
|
||||
" location=\":memory:\",\n",
|
||||
" collection_name=\"my_documents\",\n",
|
||||
" retrieval_mode=RetrievalMode.SPARSE,\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
|
||||
"found_docs = qdrant.similarity_search(query)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "f4b6c456",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Hybrid Vector Search\n",
|
||||
"\n",
|
||||
"To perform a hybrid search using dense and sparse vectors with score fusion,\n",
|
||||
"\n",
|
||||
"- The `retrieval_mode` parameter should be set to `RetrievalMode.HYBRID`.\n",
|
||||
"- A [dense embeddings provider](https://python.langchain.com/v0.2/docs/integrations/text_embedding/) value should be provider for the `embedding` parameter.\n",
|
||||
"- An implementation of the [`SparseEmbeddings`](https://github.com/langchain-ai/langchain/blob/master/libs/partners/qdrant/langchain_qdrant/sparse_embeddings.py) interface using any sparse embeddings provider has to be provided as value to the `sparse_embedding` parameter.\n",
|
||||
"\n",
|
||||
"Note that if you've added documents with the `HYBRID` mode, you can switch to any retrieval mode when searching. Since both the dense and sparse vectors are available in the collection."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "ce56f6e9",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_qdrant import FastEmbedSparse, RetrievalMode\n",
|
||||
"\n",
|
||||
"sparse_embeddings = FastEmbedSparse(model_name=\"Qdrant/BM25\")\n",
|
||||
"\n",
|
||||
"qdrant = QdrantVectorStore.from_documents(\n",
|
||||
" docs,\n",
|
||||
" embedding=embeddings,\n",
|
||||
" sparse_embedding=sparse_embeddings,\n",
|
||||
" location=\":memory:\",\n",
|
||||
" collection_name=\"my_documents\",\n",
|
||||
" retrieval_mode=RetrievalMode.HYBRID,\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
|
||||
"found_docs = qdrant.similarity_search(query)"
|
||||
]
|
||||
},
|
||||
{
|
||||
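Review note on the hybrid mode documented in the hunk above: the notebook text says dense and sparse results are combined "with score fusion" but does not name the method. The following is a minimal sketch of reciprocal rank fusion (RRF), one common fusion technique. That the integration uses exactly this method is an assumption, and the function name, the constant `k=60`, and the document IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a dense and a sparse search.
dense_ranking = ["doc_a", "doc_b", "doc_c"]
sparse_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused)
```

Documents that rank well in both lists (`doc_a`, `doc_b`) end up ahead of documents that appear in only one, which is the behaviour a hybrid search is after.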
@ -400,7 +467,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 12,
|
||||
"execution_count": null,
|
||||
"id": "756a6887",
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
@ -408,23 +475,7 @@
|
||||
"start_time": "2023-04-04T10:51:25.635947Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
|
||||
"\n",
|
||||
"Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
|
||||
"\n",
|
||||
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
|
||||
"\n",
|
||||
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n",
|
||||
"\n",
|
||||
"Score: 0.8153784913324512\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"document, score = found_docs[0]\n",
|
||||
"print(document.page_content)\n",
|
||||
@ -449,10 +500,10 @@
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"```python\n",
|
||||
"from qdrant_client.http import models as rest\n",
|
||||
"from qdrant_client.http import models\n",
|
||||
"\n",
|
||||
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
|
||||
"found_docs = qdrant.similarity_search_with_score(query, filter=rest.Filter(...))\n",
|
||||
"found_docs = qdrant.similarity_search_with_score(query, filter=models.Filter(...))\n",
|
||||
"```"
|
||||
]
|
||||
},
|
||||
@ -469,7 +520,9 @@
|
||||
"source": [
|
||||
"## Maximum marginal relevance search (MMR)\n",
|
||||
"\n",
|
||||
"If you'd like to look up for some similar documents, but you'd also like to receive diverse results, MMR is method you should consider. Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents."
|
||||
"If you'd like to look up some similar documents, but you'd also like to receive diverse results, MMR is the method you should consider. Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.\n",
|
||||
"\n",
|
||||
"Note that MMR search is only available if you've added documents with `DENSE` or `HYBRID` modes. Since it requires dense vectors."
|
||||
]
|
||||
},
|
||||
{
|
||||
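Review note on the MMR section in the hunk above: the trade-off between similarity to the query and diversity among results can be sketched in plain Python. This is an illustrative implementation under stated assumptions, not the code used by `QdrantVectorStore`; the helper names, the toy 2-D vectors, and the `lambda_mult` weighting are hypothetical choices for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr(query_vec, doc_vecs, k=2, lambda_mult=0.5):
    """Return the indices of k documents picked by maximal marginal relevance.

    lambda_mult=1.0 ranks purely by relevance; lower values penalize
    candidates that resemble documents that were already selected.
    """
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:

        def mmr_score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy dense vectors: documents 0 and 1 are near-duplicates, 2 is different.
query_vec = [1.0, 0.3]
doc_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]

print(mmr(query_vec, doc_vecs, k=2, lambda_mult=0.5))
print(mmr(query_vec, doc_vecs, k=2, lambda_mult=1.0))
```

With the balanced weighting, the second pick skips the near-duplicate in favour of the dissimilar document; with pure relevance, the near-duplicate is returned instead. The sketch only works on dense vectors, which matches the note that MMR needs the `DENSE` or `HYBRID` mode.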
@ -490,7 +543,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 14,
|
||||
"execution_count": null,
|
||||
"id": "80c6db11",
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
@ -498,40 +551,7 @@
|
||||
"start_time": "2023-04-04T10:51:26.013329Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"1. Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
|
||||
"\n",
|
||||
"Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
|
||||
"\n",
|
||||
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
|
||||
"\n",
|
||||
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence. \n",
|
||||
"\n",
|
||||
"2. We can’t change how divided we’ve been. But we can change how we move forward—on COVID-19 and other issues we must face together. \n",
|
||||
"\n",
|
||||
"I recently visited the New York City Police Department days after the funerals of Officer Wilbert Mora and his partner, Officer Jason Rivera. \n",
|
||||
"\n",
|
||||
"They were responding to a 9-1-1 call when a man shot and killed them with a stolen gun. \n",
|
||||
"\n",
|
||||
"Officer Mora was 27 years old. \n",
|
||||
"\n",
|
||||
"Officer Rivera was 22. \n",
|
||||
"\n",
|
||||
"Both Dominican Americans who’d grown up on the same streets they later chose to patrol as police officers. \n",
|
||||
"\n",
|
||||
"I spoke with their families and told them that we are forever in debt for their sacrifice, and we will carry on their mission to restore the trust and safety every community deserves. \n",
|
||||
"\n",
|
||||
"I’ve worked on these issues a long time. \n",
|
||||
"\n",
|
||||
"I know what works: Investing in crime prevention and community police officers who’ll walk the beat, who’ll know the neighborhood, and who can restore trust and safety. \n",
|
||||
"\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"for i, doc in enumerate(found_docs):\n",
|
||||
" print(f\"{i + 1}.\", doc.page_content, \"\\n\")"
|
||||
@ -545,7 +565,7 @@
|
||||
"source": [
|
||||
"## Qdrant as a Retriever\n",
|
||||
"\n",
|
||||
"Qdrant, as all the other vector stores, is a LangChain Retriever, by using cosine similarity. "
|
||||
"Qdrant, as all the other vector stores, is a LangChain Retriever. "
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -589,7 +609,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 17,
|
||||
"execution_count": null,
|
||||
"id": "f3c70c31",
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
@ -597,18 +617,7 @@
|
||||
"start_time": "2023-04-04T10:51:26.046407Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \\n\\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \\n\\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'})"
|
||||
]
|
||||
},
|
||||
"execution_count": 17,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
|
||||
"retriever.invoke(query)[0]"
|
||||
@ -622,11 +631,11 @@
|
||||
"source": [
|
||||
"## Customizing Qdrant\n",
|
||||
"\n",
|
||||
"There are some options to use an existing Qdrant collection within your Langchain application. In such cases you may need to define how to map Qdrant point into the Langchain `Document`.\n",
|
||||
"There are options to use an existing Qdrant collection within your Langchain application. In such cases, you may need to define how to map Qdrant point into the Langchain `Document`.\n",
|
||||
"\n",
|
||||
"### Named vectors\n",
|
||||
"\n",
|
||||
"Qdrant supports [multiple vectors per point](https://qdrant.tech/documentation/concepts/collections/#collection-with-multiple-vectors) by named vectors. Langchain requires just a single embedding per document and, by default, uses a single vector. However, if you work with a collection created externally or want to have the named vector used, you can configure it by providing its name.\n"
|
||||
"Qdrant supports [multiple vectors per point](https://qdrant.tech/documentation/concepts/collections/#collection-with-multiple-vectors) by named vectors. If you work with a collection created externally or want to have the differently named vector used, you can configure it by providing its name.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -638,25 +647,18 @@
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"Qdrant.from_documents(\n",
|
||||
"QdrantVectorStore.from_documents(\n",
|
||||
" docs,\n",
|
||||
" embeddings,\n",
|
||||
" embedding=embeddings,\n",
|
||||
" sparse_embedding=sparse_embeddings,\n",
|
||||
" location=\":memory:\",\n",
|
||||
" collection_name=\"my_documents_2\",\n",
|
||||
" retrieval_mode=RetrievalMode.HYBRID,\n",
|
||||
" vector_name=\"custom_vector\",\n",
|
||||
" sparse_vector_name=\"custom_sparse_vector\",\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b34f5230",
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"source": [
|
||||
"As a Langchain user, you won't see any difference whether you use named vectors or not. Qdrant integration will handle the conversion under the hood."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b2350093",
|
||||
@ -694,7 +696,7 @@
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"Qdrant.from_documents(\n",
|
||||
"QdrantVectorStore.from_documents(\n",
|
||||
" docs,\n",
|
||||
" embeddings,\n",
|
||||
" location=\":memory:\",\n",
|
||||
@ -729,7 +731,7 @@
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.3"
|
||||
"version": "3.11.8"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
|