mirror of
https://github.com/hwchase17/langchain.git
synced 2025-09-02 03:26:17 +00:00
community[minor]: Add TablestoreVectorStore (#25767)
Thank you for contributing to LangChain! - [x] **PR title**: community: add TablestoreVectorStore - [x] **PR message**: - **Description:** add TablestoreVectorStore - **Dependencies:** none - [x] **Add tests and docs**: If you're adding a new integration, please include 1. a test for the integration: yes 2. an example notebook showing its use: yes If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>
This commit is contained in:
@@ -89,3 +89,11 @@ See [installation instructions and a usage example](/docs/integrations/vectorsto
|
||||
```python
|
||||
from langchain_community.vectorstores import Hologres
|
||||
```
|
||||
|
||||
### Tablestore
|
||||
|
||||
See [installation instructions and a usage example](/docs/integrations/vectorstores/tablestore).
|
||||
|
||||
```python
|
||||
from langchain_community.vectorstores import TablestoreVectorStore
|
||||
```
|
385
docs/docs/integrations/vectorstores/tablestore.ipynb
Normal file
385
docs/docs/integrations/vectorstores/tablestore.ipynb
Normal file
@@ -0,0 +1,385 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"pycharm": {
|
||||
"name": "#%% md\n"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"# TablestoreVectorStore\n",
|
||||
"\n",
|
||||
"> [Tablestore](https://www.aliyun.com/product/ots) is a fully managed NoSQL cloud database service that enables storage of a massive amount of structured\n",
|
||||
"and semi-structured data.\n",
|
||||
"\n",
|
||||
"This notebook shows how to use functionality related to the `Tablestore` vector database.\n",
|
||||
"\n",
|
||||
"To use Tablestore, you must create an instance.\n",
|
||||
"Here are the [creating instance instructions](https://help.aliyun.com/zh/tablestore/getting-started/manage-the-wide-column-model-in-the-tablestore-console)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"pycharm": {
|
||||
"name": "#%% md\n"
|
||||
}
|
||||
},
|
||||
"source": "## Setup"
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%pip install --upgrade --quiet langchain-community tablestore"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": "## Initialization"
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2024-08-20T11:10:04.469458Z",
|
||||
"start_time": "2024-08-20T11:09:49.541150Z"
|
||||
},
|
||||
"pycharm": {
|
||||
"is_executing": true,
|
||||
"name": "#%%\n"
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import getpass\n",
|
||||
"import os\n",
|
||||
"\n",
|
||||
"os.environ[\"end_point\"] = getpass.getpass(\"Tablestore end_point:\")\n",
|
||||
"os.environ[\"instance_name\"] = getpass.getpass(\"Tablestore instance_name:\")\n",
|
||||
"os.environ[\"access_key_id\"] = getpass.getpass(\"Tablestore access_key_id:\")\n",
|
||||
"os.environ[\"access_key_secret\"] = getpass.getpass(\"Tablestore access_key_secret:\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": "Create vector store. "
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2024-08-20T11:10:07.911086Z",
|
||||
"start_time": "2024-08-20T11:10:07.351293Z"
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import tablestore\n",
|
||||
"from langchain_community.embeddings import FakeEmbeddings\n",
|
||||
"from langchain_community.vectorstores import TablestoreVectorStore\n",
|
||||
"from langchain_core.documents import Document\n",
|
||||
"\n",
|
||||
"test_embedding_dimension_size = 4\n",
|
||||
"embeddings = FakeEmbeddings(size=test_embedding_dimension_size)\n",
|
||||
"\n",
|
||||
"store = TablestoreVectorStore(\n",
|
||||
" embedding=embeddings,\n",
|
||||
" endpoint=os.getenv(\"end_point\"),\n",
|
||||
" instance_name=os.getenv(\"instance_name\"),\n",
|
||||
" access_key_id=os.getenv(\"access_key_id\"),\n",
|
||||
" access_key_secret=os.getenv(\"access_key_secret\"),\n",
|
||||
" vector_dimension=test_embedding_dimension_size,\n",
|
||||
" # metadata mapping is used to filter non-vector fields.\n",
|
||||
" metadata_mappings=[\n",
|
||||
" tablestore.FieldSchema(\n",
|
||||
" \"type\", tablestore.FieldType.KEYWORD, index=True, enable_sort_and_agg=True\n",
|
||||
" ),\n",
|
||||
" tablestore.FieldSchema(\n",
|
||||
" \"time\", tablestore.FieldType.LONG, index=True, enable_sort_and_agg=True\n",
|
||||
" ),\n",
|
||||
" ],\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": "## Manage vector store"
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": "Create table and index."
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2024-08-20T11:10:10.875422Z",
|
||||
"start_time": "2024-08-20T11:10:10.566400Z"
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"store.create_table_if_not_exist()\n",
|
||||
"store.create_search_index_if_not_exist()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": "Add documents."
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2024-08-20T11:10:14.974253Z",
|
||||
"start_time": "2024-08-20T11:10:14.894190Z"
|
||||
},
|
||||
"pycharm": {
|
||||
"is_executing": true,
|
||||
"name": "#%%\n"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"['1', '2', '3', '4', '5']"
|
||||
]
|
||||
},
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"store.add_documents(\n",
|
||||
" [\n",
|
||||
" Document(\n",
|
||||
" id=\"1\", page_content=\"1 hello world\", metadata={\"type\": \"pc\", \"time\": 2000}\n",
|
||||
" ),\n",
|
||||
" Document(\n",
|
||||
" id=\"2\", page_content=\"abc world\", metadata={\"type\": \"pc\", \"time\": 2009}\n",
|
||||
" ),\n",
|
||||
" Document(\n",
|
||||
" id=\"3\", page_content=\"3 text world\", metadata={\"type\": \"sky\", \"time\": 2010}\n",
|
||||
" ),\n",
|
||||
" Document(\n",
|
||||
" id=\"4\", page_content=\"hi world\", metadata={\"type\": \"sky\", \"time\": 2030}\n",
|
||||
" ),\n",
|
||||
" Document(\n",
|
||||
" id=\"5\", page_content=\"hi world\", metadata={\"type\": \"sky\", \"time\": 2030}\n",
|
||||
" ),\n",
|
||||
" ]\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"pycharm": {
|
||||
"name": "#%% md\n"
|
||||
}
|
||||
},
|
||||
"source": "Delete document."
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2024-08-20T11:10:17.408739Z",
|
||||
"start_time": "2024-08-20T11:10:17.269593Z"
|
||||
},
|
||||
"pycharm": {
|
||||
"name": "#%%\n"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"True"
|
||||
]
|
||||
},
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"store.delete([\"3\"])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"pycharm": {
|
||||
"name": "#%% md\n"
|
||||
}
|
||||
},
|
||||
"source": "Get documents."
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": "## Query vector store"
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2024-08-20T11:10:19.379617Z",
|
||||
"start_time": "2024-08-20T11:10:19.339970Z"
|
||||
},
|
||||
"pycharm": {
|
||||
"name": "#%%\n"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"[Document(id='1', metadata={'embedding': '[1.3296732307905934, 0.0037521341868022385, 0.9821875819319514, 2.5644103644492393]', 'time': 2000, 'type': 'pc'}, page_content='1 hello world'),\n",
|
||||
" None,\n",
|
||||
" Document(id='5', metadata={'embedding': '[1.4558082172139821, -1.6441137122167426, -0.13113098640337423, -1.889685473174525]', 'time': 2030, 'type': 'sky'}, page_content='hi world')]"
|
||||
]
|
||||
},
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"store.get_by_ids([\"1\", \"3\", \"5\"])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": "Similarity search."
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2024-08-20T11:10:21.306717Z",
|
||||
"start_time": "2024-08-20T11:10:21.284244Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"[Document(id='1', metadata={'embedding': [1.3296732307905934, 0.0037521341868022385, 0.9821875819319514, 2.5644103644492393], 'time': 2000, 'type': 'pc'}, page_content='1 hello world'),\n",
|
||||
" Document(id='4', metadata={'embedding': [-0.3310144199800685, 0.29250046478723635, -0.0646862290377582, -0.23664360156781225], 'time': 2030, 'type': 'sky'}, page_content='hi world')]"
|
||||
]
|
||||
},
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"store.similarity_search(query=\"hello world\", k=2)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": "Similarity search with filters. "
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2024-08-20T11:10:23.231425Z",
|
||||
"start_time": "2024-08-20T11:10:23.213046Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"[Document(id='5', metadata={'embedding': [1.4558082172139821, -1.6441137122167426, -0.13113098640337423, -1.889685473174525], 'time': 2030, 'type': 'sky'}, page_content='hi world'),\n",
|
||||
" Document(id='4', metadata={'embedding': [-0.3310144199800685, 0.29250046478723635, -0.0646862290377582, -0.23664360156781225], 'time': 2030, 'type': 'sky'}, page_content='hi world')]"
|
||||
]
|
||||
},
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"store.similarity_search(\n",
|
||||
" query=\"hello world\",\n",
|
||||
" k=10,\n",
|
||||
" tablestore_filter_query=tablestore.BoolQuery(\n",
|
||||
" must_queries=[tablestore.TermQuery(field_name=\"type\", column_value=\"sky\")],\n",
|
||||
" should_queries=[tablestore.RangeQuery(field_name=\"time\", range_from=2020)],\n",
|
||||
" must_not_queries=[tablestore.TermQuery(field_name=\"type\", column_value=\"pc\")],\n",
|
||||
" ),\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Usage for retrieval-augmented generation\n",
|
||||
"\n",
|
||||
"For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections:\n",
|
||||
"\n",
|
||||
"- [Tutorials](/docs/tutorials/)\n",
|
||||
"- [How-to: Question and answer with RAG](https://python.langchain.com/docs/how_to/#qa-with-rag)\n",
|
||||
"- [Retrieval conceptual docs](https://python.langchain.com/docs/concepts/retrieval)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## API reference\n",
|
||||
"\n",
|
||||
"For detailed documentation of all `TablestoreVectorStore` features and configurations head to the API reference:\n",
|
||||
" https://python.langchain.com/api_reference/community/vectorstores/langchain_community.vectorstores.tablestore.TablestoreVectorStore.html"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.6"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 1
|
||||
}
|
Reference in New Issue
Block a user