mirror of
https://github.com/hwchase17/langchain.git
synced 2025-09-10 07:21:03 +00:00
community[patch]: Make some functions work with Milvus (#10695)
**Description** Make some functions work with Milvus:
1. get_ids: Get primary keys by field in the metadata
2. delete: Delete one or more entities by ids
3. upsert: Update/Insert one or more entities

**Issue** None
**Dependencies** None
**Tag maintainer:** @hwchase17
**Twitter handle:** None

---------

Co-authored-by: HoaNQ9 <hoanq.1811@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
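To make the intended semantics of the three operations concrete, here is a minimal in-memory sketch. `ToyStore` and all of its methods are hypothetical stand-ins written for illustration; they are not the Milvus or LangChain API. The sketch also assumes that an upsert amounts to deleting the given primary keys and inserting the new documents under freshly generated keys, which the commit message does not spell out.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyStore:
    """In-memory stand-in for a vector store (hypothetical, no embeddings)."""

    rows: dict = field(default_factory=dict)  # primary key -> (text, metadata)
    _next_pk: count = field(default_factory=lambda: count(1))

    def add(self, docs):
        """Insert (text, metadata) pairs and return their new primary keys."""
        pks = []
        for text, metadata in docs:
            pk = next(self._next_pk)
            self.rows[pk] = (text, metadata)
            pks.append(pk)
        return pks

    def get_pks(self, key, values):
        """Return primary keys of rows whose metadata[key] is in values."""
        return [pk for pk, (_, md) in self.rows.items() if md.get(key) in values]

    def delete(self, pks):
        """Remove rows by primary key; unknown keys are ignored."""
        for pk in pks:
            self.rows.pop(pk, None)

    def upsert(self, pks, docs):
        """Assumed semantics: delete the old rows, then insert the new ones."""
        self.delete(pks)
        return self.add(docs)


store = ToyStore()
store.add([("foo", {"id": 1}), ("bar", {"id": 2}), ("baz", {"id": 3})])

old_pks = store.get_pks("id", [1, 2])
new_pks = store.upsert(old_pks, [("new_foo", {"id": 1}), ("new_bar", {"id": 2})])
remaining = sorted(text for text, _ in store.rows.values())
```

Under these assumptions the replaced rows receive new primary keys, so callers should keep the keys returned by the upsert rather than reuse the old ones.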
@@ -204,23 +204,29 @@
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"### Per-User Retrieval\n",
"\n",
"When building a retrieval app, you often have to build it with multiple users in mind. This means that you may be storing data not just for one user, but for many different users, and they should not be able to see each other’s data.\n",
"\n",
"Milvus recommends using [partition_key](https://milvus.io/docs/multi_tenancy.md#Partition-key-based-multi-tenancy) to implement multi-tenancy. Here is an example."
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%% md\n"
}
}
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"from langchain_core.documents import Document\n",
@@ -236,16 +242,16 @@
" drop_old=True,\n",
" partition_key_field=\"namespace\", # Use the \"namespace\" field as the partition key\n",
")"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%%\n"
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"To conduct a search using the partition key, you should include either of the following in the boolean expression of the search request:\n",
"\n",
@@ -256,21 +262,23 @@
"Do replace `<partition_key>` with the name of the field that is designated as the partition key.\n",
"\n",
"Milvus changes to a partition based on the specified partition key, filters entities according to the partition key, and searches among the filtered entities.\n"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%% md\n"
}
}
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"data": {
"text/plain": "[Document(page_content='i worked at facebook', metadata={'namespace': 'ankush'})]"
"text/plain": [
"[Document(page_content='i worked at facebook', metadata={'namespace': 'ankush'})]"
]
},
"execution_count": 3,
"metadata": {},
@@ -282,21 +290,23 @@
"vectorstore.as_retriever(\n",
" search_kwargs={\"expr\": 'namespace == \"ankush\"'}\n",
").get_relevant_documents(\"where did i work?\")"
],
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%%\n"
}
}
},
{
"cell_type": "code",
"execution_count": 4,
},
"outputs": [
{
"data": {
"text/plain": "[Document(page_content='i worked at kensho', metadata={'namespace': 'harrison'})]"
"text/plain": [
"[Document(page_content='i worked at kensho', metadata={'namespace': 'harrison'})]"
]
},
"execution_count": 4,
"metadata": {},
@@ -308,13 +318,52 @@
"vectorstore.as_retriever(\n",
" search_kwargs={\"expr\": 'namespace == \"harrison\"'}\n",
").get_relevant_documents(\"where did i work?\")"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%%\n"
}
}
]
},
{
"cell_type": "markdown",
"id": "89756e9e",
"metadata": {},
"source": [
"**To delete or upsert (update/insert) one or more entities:**"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "21c4edcf",
"metadata": {},
"outputs": [],
"source": [
"from langchain.docstore.document import Document\n",
"\n",
"# Insert data sample\n",
"docs = [\n",
" Document(page_content=\"foo\", metadata={\"id\": 1}),\n",
" Document(page_content=\"bar\", metadata={\"id\": 2}),\n",
" Document(page_content=\"baz\", metadata={\"id\": 3}),\n",
"]\n",
"vector_db = Milvus.from_documents(\n",
" docs,\n",
" embeddings,\n",
" connection_args={\"host\": \"127.0.0.1\", \"port\": \"19530\"},\n",
")\n",
"\n",
"# Search pks (primary keys) using expression\n",
"expr = \"id in [1,2]\"\n",
"pks = vector_db.get_pks(expr)\n",
"\n",
"# Delete entities by pks\n",
"result = vector_db.delete(pks)\n",
"\n",
"# Upsert (Update/Insert)\n",
"new_docs = [\n",
" Document(page_content=\"new_foo\", metadata={\"id\": 1}),\n",
" Document(page_content=\"new_bar\", metadata={\"id\": 2}),\n",
" Document(page_content=\"upserted_bak\", metadata={\"id\": 3}),\n",
"]\n",
"upserted_pks = vector_db.upsert(pks, new_docs)"
]
}
],
"metadata": {
@@ -338,4 +387,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}
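
The notebook cells in this diff filter searches with a boolean expression on the partition-key field, for example `namespace == "ankush"`. As a rough sketch of how such an expression could be assembled per user, the helper below builds either an equality or an `in [...]` filter. `partition_expr` is a hypothetical function written for illustration and is not part of LangChain or pymilvus; it also assumes string-valued partition keys and simply rejects values that would need escaping.

```python
def partition_expr(field: str, values: list[str]) -> str:
    """Build a boolean filter expression on a partition-key field.

    Hypothetical helper for illustration; not a LangChain or pymilvus API.
    """
    if not values:
        raise ValueError("at least one partition-key value is required")
    for value in values:
        # Keep the sketch simple: refuse characters that would need escaping.
        if '"' in value or "\\" in value:
            raise ValueError(f"unsupported character in value: {value!r}")
    if len(values) == 1:
        # Single tenant: equality test on the partition key.
        return f'{field} == "{values[0]}"'
    # Several tenants: membership test on the partition key.
    quoted = ", ".join(f'"{value}"' for value in values)
    return f"{field} in [{quoted}]"


single = partition_expr("namespace", ["ankush"])
several = partition_expr("namespace", ["ankush", "harrison"])
```

The resulting string can be passed as the `expr` entry of `search_kwargs`, in the same way the notebook passes its hand-written expressions.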