mirror of
https://github.com/hwchase17/langchain.git
synced 2025-06-26 16:43:35 +00:00
partners: langchain-oceanbase Integration (#28782)
Hi, langchain team! I'm a maintainer of [OceanBase](https://github.com/oceanbase/oceanbase). Following the integration guidance, I created a Python library named [langchain-oceanbase](https://github.com/oceanbase/langchain-oceanbase) to integrate the `Oceanbase Vector Store` with `Langchain`, so I'd like to add the required docs. I would appreciate your feedback. Thank you!

---------

Signed-off-by: shanhaikang.shk <shanhaikang.shk@oceanbase.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
This commit is contained in:
parent
986b752fc8
commit
33b1fb95b8
31
docs/docs/integrations/providers/oceanbase.mdx
Normal file
@@ -0,0 +1,31 @@
# OceanBase

[OceanBase Database](https://github.com/oceanbase/oceanbase) is a distributed relational database.
It is developed entirely by Ant Group. The OceanBase Database is built on a common server cluster.
Based on the Paxos protocol and its distributed structure, the OceanBase Database provides high availability and linear scalability.

OceanBase currently supports vector storage. Users can perform the following operations with SQL:

- Create a table containing vector type fields;
- Create a vector index table based on the HNSW algorithm;
- Perform vector approximate nearest neighbor queries;
- ...

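To make the last bullet more concrete, here is a minimal pure-Python sketch of the exact nearest-neighbor search under L2 distance, which is the result an HNSW index approximates without scanning every row. It does not use OceanBase or SQL; the function name `exact_l2_knn` and the sample vectors are assumptions made up for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def exact_l2_knn(
    query: Sequence[float],
    vectors: Dict[str, Sequence[float]],
    k: int = 1,
) -> List[Tuple[str, float]]:
    """Return the k (id, distance) pairs closest to `query` under L2 distance."""
    # Brute force: compute the distance from the query to every stored vector.
    scored = [(vec_id, math.dist(query, vec)) for vec_id, vec in vectors.items()]
    # Smaller L2 distance means a closer match.
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Hypothetical rows: id -> 2-dimensional vector.
rows = {"1": [0.0, 0.0], "2": [3.0, 4.0], "3": [1.0, 1.0]}
nearest = exact_l2_knn([0.9, 1.2], rows, k=2)
print(nearest)
```

An approximate index trades a small amount of recall for avoiding this full scan, which is what makes it practical on large tables.
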
## Installation

```bash
pip install -U langchain-oceanbase
```

We recommend using Docker to deploy OceanBase:

```shell
docker run --name=ob433 -e MODE=slim -p 2881:2881 -d oceanbase/oceanbase-ce:4.3.3.0-100000132024100711
```

[More ways to deploy an OceanBase cluster](https://github.com/oceanbase/oceanbase-doc/blob/V4.3.1/en-US/400.deploy/500.deploy-oceanbase-database-community-edition/100.deployment-overview.md)

### Usage

For a more detailed walkthrough of the OceanBase wrapper, see [this notebook](https://github.com/oceanbase/langchain-oceanbase/blob/main/docs/vectorstores.ipynb).

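As a rough sketch of what that notebook walks through, the snippet below mirrors its initialization cell. The helper names `build_connection_args` and `make_vector_store` are hypothetical; the `OceanbaseVectorStore` arguments and the default connection values are copied from the notebook added in this commit and assume the Docker deployment described above.

```python
from typing import Any, Dict


def build_connection_args(
    host: str = "127.0.0.1",
    port: str = "2881",
    user: str = "root@test",
    password: str = "",
    db_name: str = "test",
) -> Dict[str, str]:
    """Collect the connection settings in the shape the notebook passes along."""
    return {
        "host": host,
        "port": port,
        "user": user,
        "password": password,
        "db_name": db_name,
    }


def make_vector_store(embeddings: Any, table_name: str = "langchain_vector") -> Any:
    """Create the vector store; needs langchain-oceanbase and a running server."""
    # Imported lazily so this module can be loaded without the package installed.
    from langchain_oceanbase.vectorstores import OceanbaseVectorStore

    return OceanbaseVectorStore(
        embedding_function=embeddings,
        table_name=table_name,
        connection_args=build_connection_args(),
        vidx_metric_type="l2",
        drop_old=True,
    )
```

Calling `make_vector_store(embeddings)` with any LangChain embeddings object (the notebook uses `DashScopeEmbeddings`) should then give a store that supports `add_documents`, `delete`, and `similarity_search`, as shown in the notebook.
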
383
docs/docs/integrations/vectorstores/oceanbase.ipynb
Normal file
@@ -0,0 +1,383 @@
{
 "cells": [
  {
   "cell_type": "raw",
   "id": "1957f5cb",
   "metadata": {},
   "source": [
    "---\n",
    "sidebar_label: Oceanbase\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ef1f0986",
   "metadata": {},
   "source": [
    "# OceanbaseVectorStore\n",
    "\n",
    "This notebook covers how to get started with the OceanBase vector store."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "36fdc060",
   "metadata": {},
   "source": [
    "## Setup\n",
    "\n",
    "To access OceanBase vector stores, you'll need to deploy a standalone OceanBase server:"
   ]
  },
  {
   "cell_type": "raw",
   "id": "a7b92118",
   "metadata": {},
   "source": [
    "%docker run --name=ob433 -e MODE=mini -e OB_SERVER_IP=127.0.0.1 -p 2881:2881 -d quay.io/oceanbase/oceanbase-ce:4.3.3.1-101000012024102216"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5b687bdc",
   "metadata": {},
   "source": [
    "And install the `langchain-oceanbase` integration package."
   ]
  },
  {
   "cell_type": "raw",
   "id": "64e28aa6",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "%pip install -qU \"langchain-oceanbase\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ea850342",
   "metadata": {},
   "source": [
    "Check the connection to OceanBase and set the memory usage ratio for vector data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "066bbc79",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<sqlalchemy.engine.cursor.CursorResult at 0x12696f2a0>"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from pyobvector import ObVecClient\n",
    "\n",
    "tmp_client = ObVecClient()\n",
    "tmp_client.perform_raw_text_sql(\"ALTER SYSTEM ob_vector_memory_limit_percentage = 30\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "93df377e",
   "metadata": {},
   "source": [
    "## Initialization\n",
    "\n",
    "Configure the API key of the embedding model. Here we use `DashScopeEmbeddings` as an example. When deploying OceanBase with the Docker image as described above, follow the script below to set the `host`, `port`, `user`, `password`, and `db_name`. For other deployment methods, set these parameters according to your environment."
   ]
  },
  {
   "cell_type": "raw",
   "id": "ff29e3b7",
   "metadata": {},
   "source": [
    "%pip install dashscope"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "dc37144c-208d-4ab3-9f3a-0407a69fe052",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "from langchain_community.embeddings import DashScopeEmbeddings\n",
    "from langchain_oceanbase.vectorstores import OceanbaseVectorStore\n",
    "\n",
    "DASHSCOPE_API = os.environ.get(\"DASHSCOPE_API_KEY\", \"\")\n",
    "connection_args = {\n",
    "    \"host\": \"127.0.0.1\",\n",
    "    \"port\": \"2881\",\n",
    "    \"user\": \"root@test\",\n",
    "    \"password\": \"\",\n",
    "    \"db_name\": \"test\",\n",
    "}\n",
    "\n",
    "embeddings = DashScopeEmbeddings(\n",
    "    model=\"text-embedding-v1\", dashscope_api_key=DASHSCOPE_API\n",
    ")\n",
    "\n",
    "vector_store = OceanbaseVectorStore(\n",
    "    embedding_function=embeddings,\n",
    "    table_name=\"langchain_vector\",\n",
    "    connection_args=connection_args,\n",
    "    vidx_metric_type=\"l2\",\n",
    "    drop_old=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ac6071d4",
   "metadata": {},
   "source": [
    "## Manage vector store\n",
    "\n",
    "### Add items to vector store\n",
    "\n",
    "Add documents to the vector store with `add_documents`, passing an ID for each document:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "17f5efc0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['1', '2', '3']"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from langchain_core.documents import Document\n",
    "\n",
    "document_1 = Document(page_content=\"foo\", metadata={\"source\": \"https://foo.com\"})\n",
    "document_2 = Document(page_content=\"bar\", metadata={\"source\": \"https://bar.com\"})\n",
    "document_3 = Document(page_content=\"baz\", metadata={\"source\": \"https://baz.com\"})\n",
    "\n",
    "documents = [document_1, document_2, document_3]\n",
    "\n",
    "vector_store.add_documents(documents=documents, ids=[\"1\", \"2\", \"3\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c738c3e0",
   "metadata": {},
   "source": [
    "### Update items in vector store"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "f0aa8b71",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['1']"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "updated_document = Document(\n",
    "    page_content=\"qux\", metadata={\"source\": \"https://another-example.com\"}\n",
    ")\n",
    "\n",
    "vector_store.add_documents(documents=[updated_document], ids=[\"1\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dcf1b905",
   "metadata": {},
   "source": [
    "### Delete items from vector store"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "ef61e188",
   "metadata": {},
   "outputs": [],
   "source": [
    "vector_store.delete(ids=[\"3\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c3620501",
   "metadata": {},
   "source": [
    "## Query vector store\n",
    "\n",
    "Once your vector store has been created and the relevant documents have been added you will most likely wish to query it during the running of your chain or agent.\n",
    "\n",
    "### Query directly\n",
    "\n",
    "Performing a simple similarity search can be done as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "aa0a16fa",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "* bar [{'source': 'https://bar.com'}]\n"
     ]
    }
   ],
   "source": [
    "results = vector_store.similarity_search(\n",
    "    query=\"thud\", k=1, filter={\"source\": \"https://another-example.com\"}\n",
    ")\n",
    "for doc in results:\n",
    "    print(f\"* {doc.page_content} [{doc.metadata}]\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3ed9d733",
   "metadata": {},
   "source": [
    "If you want to execute a similarity search and receive the corresponding scores you can run:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "5efd2eaa",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "* [SIM=133.452299] bar [{'source': 'https://bar.com'}]\n"
     ]
    }
   ],
   "source": [
    "results = vector_store.similarity_search_with_score(\n",
    "    query=\"thud\", k=1, filter={\"source\": \"https://example.com\"}\n",
    ")\n",
    "for doc, score in results:\n",
    "    print(f\"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0c235cdc",
   "metadata": {},
   "source": [
    "### Query by turning into retriever\n",
    "\n",
    "You can also transform the vector store into a retriever for easier usage in your chains."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "f3460093",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[Document(metadata={'source': 'https://bar.com'}, page_content='bar')]"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "retriever = vector_store.as_retriever(search_kwargs={\"k\": 1})\n",
    "retriever.invoke(\"thud\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "901c75dc",
   "metadata": {},
   "source": [
    "## Usage for retrieval-augmented generation\n",
    "\n",
    "For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections:\n",
    "\n",
    "- [Tutorials](/docs/tutorials/)\n",
    "- [How-to: Question and answer with RAG](https://python.langchain.com/docs/how_to/#qa-with-rag)\n",
    "- [Retrieval conceptual docs](https://python.langchain.com/docs/concepts/#retrieval)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8a27244f",
   "metadata": {},
   "source": [
    "## API reference\n",
    "\n",
    "For detailed documentation of all `OceanbaseVectorStore` features and configurations, head to the API reference: https://python.langchain.com/docs/integrations/vectorstores/oceanbase"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
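One detail of the score-printing cell in the notebook above: the format spec `{score:3f}` sets a minimum field width of 3 and keeps the default six decimal places, which matches the recorded output `133.452299`. Whether three decimal places (`.3f`) was intended is an assumption; the plain-Python comparison below only shows how the two specs differ.

```python
# Value taken from the notebook's recorded output.
score = 133.452299

# Width 3, default precision of six decimal places (what the notebook uses).
as_written = f"{score:3f}"

# Precision of three decimal places (a possible alternative, not the notebook's code).
three_decimals = f"{score:.3f}"

print(as_written)
print(three_decimals)
```
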
@@ -158,3 +158,6 @@ packages:
  - name: langchain-linkup
    repo: LinkupPlatform/langchain-linkup
    path: .
  - name: langchain-oceanbase
    repo: oceanbase/langchain-oceanbase
    path: .