mirror of
https://github.com/hwchase17/langchain.git
synced 2025-07-06 05:08:20 +00:00
docs: Fix Milvus vector store initialization (#29511)
- [x] **PR title**:
- [x] **PR message**:
  - A change in the Milvus API has caused an issue with the local vector store initialization. Having used an Ollama embedding model, the vector store initialization results in the following error:

    <img width="978" alt="image" src="https://github.com/user-attachments/assets/d57e495c-1764-4fbe-ab8c-21ee44f1e686" />
  - This is fixed by setting the index type explicitly:

    `vector_store = Milvus(embedding_function=embeddings, connection_args={"uri": URI}, index_params={"index_type": "FLAT", "metric_type": "L2"},)`

Other small documentation edits were also made.

- [x] **Add tests and docs**: N/A
- [x] **Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in langchain.

If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
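As a rough illustration of the fix described in the PR message, the sketch below passes `index_params` explicitly when constructing the vector store. It assumes the `langchain-milvus` package is installed. `flat_l2_index_params` and `build_vector_store` are hypothetical helper names used only for this example, and `embeddings` and `uri` stand for whatever embedding model and Milvus address the caller already has.

```python
from typing import Any, Dict


def flat_l2_index_params() -> Dict[str, str]:
    """Return the index settings quoted in the PR message (FLAT index, L2 metric)."""
    return {"index_type": "FLAT", "metric_type": "L2"}


def build_vector_store(embeddings: Any, uri: str) -> Any:
    """Create a Milvus vector store with an explicit index type.

    Hypothetical helper for illustration; not part of langchain-milvus.
    """
    # Imported inside the function so this module still loads when the
    # optional dependency is missing (assumes langchain-milvus is installed
    # by the time the function is called).
    from langchain_milvus import Milvus

    return Milvus(
        embedding_function=embeddings,
        connection_args={"uri": uri},
        index_params=flat_l2_index_params(),
    )
```

Keeping the index settings in one small function makes it easy to swap in a different index type later without touching the call site.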
parent 0c405245c4
commit b8e218b09f
@@ -3,7 +3,9 @@
 {
 "cell_type": "markdown",
 "id": "683953b3",
-"metadata": {},
+"metadata": {
+"id": "683953b3"
+},
 "source": [
 "# Milvus\n",
 "\n",
@@ -21,6 +23,7 @@
 "execution_count": null,
 "id": "a62cff8a-bcf7-4e33-bbbc-76999c2e3e20",
 "metadata": {
+"id": "a62cff8a-bcf7-4e33-bbbc-76999c2e3e20",
 "tags": []
 },
 "outputs": [],
@@ -31,9 +34,11 @@
 {
 "cell_type": "markdown",
 "id": "633addc3",
-"metadata": {},
+"metadata": {
+"id": "633addc3"
+},
 "source": [
-"The latest version of pymilvus comes with a local vector database Milvus Lite, good for prototyping. If you have large scale of data such as more than a million docs, we recommend setting up a more performant Milvus server on [docker or kubernetes](https://milvus.io/docs/install_standalone-docker.md#Start-Milvus).\n",
+"The latest version of `pymilvus` comes with a local vector database called Milvus Lite, which is good for prototyping. If you have a large amount of data (e.g., more than a million vectors), we recommend setting up a more performant Milvus server on [Docker](https://milvus.io/docs/install_standalone-docker.md#Start-Milvus) or [Kubernetes](https://milvus.io/docs/install_cluster-milvusoperator.md).\n",
 "\n",
 "### Credentials\n",
 "\n",
@@ -48,9 +53,11 @@
 },
 {
 "cell_type": "code",
-"execution_count": 25,
+"execution_count": null,
 "id": "a7dd253f",
-"metadata": {},
+"metadata": {
+"id": "a7dd253f"
+},
 "outputs": [],
 "source": [
 "# | output: false\n",
@@ -62,9 +69,10 @@
 },
 {
 "cell_type": "code",
-"execution_count": 28,
+"execution_count": null,
 "id": "dcf88bdf",
 "metadata": {
+"id": "dcf88bdf",
 "tags": []
 },
 "outputs": [],
@@ -78,32 +86,40 @@
 "vector_store = Milvus(\n",
 " embedding_function=embeddings,\n",
 " connection_args={\"uri\": URI},\n",
+" # Set index_params if needed\n",
+" index_params={\"index_type\": \"FLAT\", \"metric_type\": \"L2\"},\n",
 ")"
 ]
 },
 {
 "cell_type": "markdown",
 "id": "cae1a7d5",
-"metadata": {},
+"metadata": {
+"id": "cae1a7d5"
+},
 "source": [
 "### Compartmentalize the data with Milvus Collections\n",
 "\n",
-"You can store different unrelated documents in different collections within same Milvus instance to maintain the context"
+"You can store unrelated documents in different collections within the same Milvus instance."
 ]
 },
 {
 "cell_type": "markdown",
 "id": "c07cd24b",
-"metadata": {},
+"metadata": {
+"id": "c07cd24b"
+},
 "source": [
-"Here's how you can create a new collection"
+"Here's how you can create a new collection:"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 29,
+"execution_count": null,
 "id": "c6f4973d",
-"metadata": {},
+"metadata": {
+"id": "c6f4973d"
+},
 "outputs": [],
 "source": [
 "from langchain_core.documents import Document\n",
@@ -119,16 +135,20 @@
 {
 "cell_type": "markdown",
 "id": "3b12df8c",
-"metadata": {},
+"metadata": {
+"id": "3b12df8c"
+},
 "source": [
-"And here is how you retrieve that stored collection"
+"And here is how you retrieve that stored collection:"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 30,
+"execution_count": null,
 "id": "12817d16",
-"metadata": {},
+"metadata": {
+"id": "12817d16"
+},
 "outputs": [],
 "source": [
 "vector_store_loaded = Milvus(\n",
@@ -141,7 +161,9 @@
 {
 "cell_type": "markdown",
 "id": "f1fc3818",
-"metadata": {},
+"metadata": {
+"id": "f1fc3818"
+},
 "source": [
 "## Manage vector store\n",
 "\n",
@@ -154,9 +176,12 @@
 },
 {
 "cell_type": "code",
-"execution_count": 31,
+"execution_count": null,
 "id": "3ced24f6",
-"metadata": {},
+"metadata": {
+"id": "3ced24f6",
+"outputId": "9c57a6bb-86eb-456c-f007-6cabd6865299"
+},
 "outputs": [
 {
 "data": {
@@ -253,16 +278,21 @@
 {
 "cell_type": "markdown",
 "id": "e23c22d8",
-"metadata": {},
+"metadata": {
+"id": "e23c22d8"
+},
 "source": [
 "### Delete items from vector store"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 32,
+"execution_count": null,
 "id": "1f387fa8",
-"metadata": {},
+"metadata": {
+"id": "1f387fa8",
+"outputId": "62fee30d-92c9-4efd-df8a-453545ff61d0"
+},
 "outputs": [
 {
 "data": {
@@ -282,11 +312,13 @@
 {
 "cell_type": "markdown",
 "id": "fb12fa75",
-"metadata": {},
+"metadata": {
+"id": "fb12fa75"
+},
 "source": [
 "## Query vector store\n",
 "\n",
-"Once your vector store has been created and the relevant documents have been added you will most likely wish to query it during the running of your chain or agent. \n",
+"Once your vector store has been created and the relevant documents have been added, you will most likely wish to query it during the running of your chain or agent.\n",
 "\n",
 "### Query directly\n",
 "\n",
@@ -297,9 +329,12 @@
 },
 {
 "cell_type": "code",
-"execution_count": 33,
+"execution_count": null,
 "id": "35801a55",
-"metadata": {},
+"metadata": {
+"id": "35801a55",
+"outputId": "13865abb-11a2-41ae-9ad7-44e8586fd099"
+},
 "outputs": [
 {
 "name": "stdout",
@@ -323,7 +358,9 @@
 {
 "cell_type": "markdown",
 "id": "35574409",
-"metadata": {},
+"metadata": {
+"id": "35574409"
+},
 "source": [
 "#### Similarity search with score\n",
 "\n",
@@ -332,9 +369,12 @@
 },
 {
 "cell_type": "code",
-"execution_count": 22,
+"execution_count": null,
 "id": "c360af3d",
-"metadata": {},
+"metadata": {
+"id": "c360af3d",
+"outputId": "16cb1961-9f4a-494a-9500-27b98a1158d8"
+},
 "outputs": [
 {
 "name": "stdout",
@@ -355,20 +395,25 @@
 {
 "cell_type": "markdown",
 "id": "14db337f",
-"metadata": {},
+"metadata": {
+"id": "14db337f"
+},
 "source": [
 "For a full list of all the search options available when using the `Milvus` vector store, you can visit the [API reference](https://python.langchain.com/api_reference/milvus/vectorstores/langchain_milvus.vectorstores.milvus.Milvus.html).\n",
 "\n",
 "### Query by turning into retriever\n",
 "\n",
-"You can also transform the vector store into a retriever for easier usage in your chains. "
+"You can also transform the vector store into a retriever for easier usage in your chains."
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 34,
+"execution_count": null,
 "id": "f6d9357c",
-"metadata": {},
+"metadata": {
+"id": "f6d9357c",
+"outputId": "bcaa7620-a1c0-418f-9f54-684a472b0b55"
+},
 "outputs": [
 {
 "data": {
@@ -389,7 +434,9 @@
 {
 "cell_type": "markdown",
 "id": "8ac953f1",
-"metadata": {},
+"metadata": {
+"id": "8ac953f1"
+},
 "source": [
 "## Usage for retrieval-augmented generation\n",
 "\n",
@@ -404,6 +451,7 @@
 "cell_type": "markdown",
 "id": "7fb27b941602401d91542211134fc71a",
 "metadata": {
+"id": "7fb27b941602401d91542211134fc71a",
 "pycharm": {
 "name": "#%% md\n"
 }
@@ -411,17 +459,18 @@
 "source": [
 "### Per-User Retrieval\n",
 "\n",
-"When building a retrieval app, you often have to build it with multiple users in mind. This means that you may be storing data not just for one user, but for many different users, and they should not be able to see eachother’s data.\n",
+"When building a retrieval app, you often have to build it with multiple users in mind. This means that you may be storing data not just for one user, but for many different users, and they should not be able to see each other’s data.\n",
 "\n",
-"Milvus recommends using [partition_key](https://milvus.io/docs/multi_tenancy.md#Partition-key-based-multi-tenancy) to implement multi-tenancy, here is an example.\n",
-"> The feature of Partition key is now not available in Milvus Lite, if you want to use it, you need to start Milvus server from [docker or kubernetes](https://milvus.io/docs/install_standalone-docker.md#Start-Milvus)."
+"Milvus recommends using [partition_key](https://milvus.io/docs/multi_tenancy.md#Partition-key-based-multi-tenancy) to implement multi-tenancy. Here is an example:\n",
+"> The feature of Partition key is now not available in Milvus Lite, if you want to use it, you need to start Milvus server, as mentioned above."
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 2,
+"execution_count": null,
 "id": "acae54e37e7d407bbb7b55eff062a284",
 "metadata": {
+"id": "acae54e37e7d407bbb7b55eff062a284",
 "pycharm": {
 "name": "#%%\n"
 }
@@ -447,6 +496,7 @@
 "cell_type": "markdown",
 "id": "9a63283cbaf04dbcab1f6479b197f3a8",
 "metadata": {
+"id": "9a63283cbaf04dbcab1f6479b197f3a8",
 "pycharm": {
 "name": "#%% md\n"
 }
@@ -465,9 +515,11 @@
 },
 {
 "cell_type": "code",
-"execution_count": 3,
+"execution_count": null,
 "id": "8dd0d8092fe74a7c96281538738b07e2",
 "metadata": {
+"id": "8dd0d8092fe74a7c96281538738b07e2",
+"outputId": "e38ff0ea-1425-4f12-cfb5-7767d040397b",
 "pycharm": {
 "name": "#%%\n"
 }
@@ -493,9 +545,11 @@
 },
 {
 "cell_type": "code",
-"execution_count": 4,
+"execution_count": null,
 "id": "72eea5119410473aa328ad9291626812",
 "metadata": {
+"id": "72eea5119410473aa328ad9291626812",
+"outputId": "9d3ad63e-fcb9-4f9a-bdf1-1bc263ce832b",
 "pycharm": {
 "name": "#%%\n"
 }
@@ -522,7 +576,9 @@
 {
 "cell_type": "markdown",
 "id": "f1a873c5",
-"metadata": {},
+"metadata": {
+"id": "f1a873c5"
+},
 "source": [
 "## API reference\n",
 "\n",
@@ -531,6 +587,9 @@
 }
 ],
 "metadata": {
+"colab": {
+"provenance": []
+},
 "kernelspec": {
 "display_name": "Python 3 (ipykernel)",
 "language": "python",