mirror of
https://github.com/hwchase17/langchain.git
synced 2025-06-26 08:33:49 +00:00
MariaDB vector store documentation addition (#30229)
### New Feature

Since version 11.7.1, MariaDB supports vector search. The implementation is very fast (see [this performance evaluation](https://smalldatum.blogspot.com/2025/01/evaluating-vector-indexes-in-mariadb.html)).

The goal is to support MariaDB in LangChain. The implementation lives in https://github.com/mariadb-corporation/langchain-mariadb and is published at https://pypi.org/project/langchain-mariadb/.

This commit adds the documentation (initial PR: https://github.com/langchain-ai/langchain/pull/29989).

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Oskar Stark <oskarstark@googlemail.com>
parent
1cdea6ab07
commit
aa37893c00
41
docs/docs/integrations/providers/mariadb.mdx
Normal file
@ -0,0 +1,41 @@
# MariaDB

This page covers how to use the [MariaDB](https://github.com/mariadb/) ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific MariaDB wrappers.

## Installation
- Install the MariaDB Connector/C

On Debian, Ubuntu:
```bash
sudo apt install libmariadb3 libmariadb-dev
```

On CentOS, RHEL, Rocky Linux:
```bash
sudo yum install MariaDB-shared MariaDB-devel
```

- Install the Python connector package with `pip install mariadb`
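
After installing, it can be useful to confirm that the connector is visible to the Python interpreter you intend to use. The following is a minimal sketch using only the standard library; the helper name `connector_installed` is hypothetical and not part of any package mentioned here.

```python
import importlib.util


def connector_installed(module_name: str = "mariadb") -> bool:
    """Return True if the named top-level module can be located.

    find_spec only looks the module up; it does not import it.
    """
    return importlib.util.find_spec(module_name) is not None


if connector_installed():
    print("MariaDB Python connector found")
else:
    print("MariaDB Python connector missing; run `pip install mariadb`")
```

If the check reports the connector as missing even after installation, the usual cause is that `pip` installed into a different environment than the one running the check.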

## Setup
1. The first step is to have MariaDB 11.7.1 or later installed.

The docker image is the easiest way to get started.
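
To check that a server meets the 11.7.1 requirement, the version string it reports can be compared numerically. This is a minimal sketch that assumes the string starts with `major.minor.patch`, as the value returned by `SELECT VERSION()` normally does (for example `11.7.2-MariaDB`). The helper names `parse_version` and `supports_vectors` are hypothetical.

```python
import re

MIN_VECTOR_VERSION = (11, 7, 1)  # minimum version stated on this page


def parse_version(version_string: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from a string such as '11.7.2-MariaDB'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def supports_vectors(version_string: str) -> bool:
    """Return True if the reported version is at least 11.7.1."""
    return parse_version(version_string) >= MIN_VECTOR_VERSION


print(supports_vectors("11.7.2-MariaDB"))
print(supports_vectors("10.11.6-MariaDB"))
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexicographically, where `"11.10.0"` would sort before `"11.7.1"`.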

## Wrappers

### VectorStore

A wrapper around MariaDB's vector storage lets you use it as a vector store,
whether for semantic search or example selection.

To import this vector store:
```python
from langchain_mariadb import MariaDBStore
```

### Usage

For a more detailed walkthrough of the MariaDB wrapper, see [this notebook](/docs/integrations/vectorstores/mariadb)
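
The notebook passes a connection URL of the form `mariadb+mariadbconnector://user:password@host/database` as the `datasource` argument. The sketch below builds such a URL while percent-encoding the credentials, which matters when a password contains characters such as `@` or `/`. The helper names `build_datasource_url` and `make_store` are hypothetical; the `MariaDBStore` arguments are the ones used in the notebook, and the default port 3306 and database name `langchain` are taken from its Docker command.

```python
from urllib.parse import quote_plus


def build_datasource_url(
    user: str,
    password: str,
    host: str = "localhost",
    port: int = 3306,
    database: str = "langchain",
) -> str:
    """Build a connection URL, percent-encoding the credentials."""
    return (
        "mariadb+mariadbconnector://"
        f"{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{database}"
    )


def make_store(url: str, collection_name: str = "my_docs"):
    """Create a MariaDBStore with the arguments shown in the notebook."""
    # Imported lazily so this module loads even without the packages installed.
    from langchain_mariadb import MariaDBStore
    from langchain_openai import OpenAIEmbeddings

    return MariaDBStore(
        embeddings=OpenAIEmbeddings(),
        embedding_length=1536,
        datasource=url,
        collection_name=collection_name,
    )


url = build_datasource_url("myuser", "p@ss/word")
print(url)
```

Calling `make_store(url)` requires a running MariaDB server, the `langchain-mariadb` and `langchain-openai` packages, and an `OPENAI_API_KEY` in the environment, so it is left out of the example run.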
298
docs/docs/integrations/vectorstores/mariadb.ipynb
Normal file
@ -0,0 +1,298 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "7679dd7b-7ed4-4755-a499-824deadba708",
   "metadata": {},
   "source": [
    "# MariaDB\n",
    "\n",
    "LangChain's MariaDB integration (langchain-mariadb) provides vector capabilities for working with MariaDB version 11.7.1 and above. It is distributed under the MIT license. Users can use the provided implementations as-is or customize them for specific needs.\n",
    "\n",
    "Key features include:\n",
    "\n",
    "* Built-in vector similarity search\n",
    "* Support for cosine and Euclidean distance metrics\n",
    "* Robust metadata filtering options\n",
    "* Performance optimization through connection pooling\n",
    "* Configurable table and column settings\n",
    "\n",
    "## Setup\n",
    "\n",
    "Launch a MariaDB Docker container with:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "92df32f0",
   "metadata": {},
   "outputs": [],
   "source": [
    "!docker run --name mariadb-container -e MARIADB_ROOT_PASSWORD=langchain -e MARIADB_DATABASE=langchain -p 3306:3306 -d mariadb:11.7"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Installing the Package\n",
    "\n",
    "The package uses SQLAlchemy but works best with the MariaDB connector, which requires C/C++ components:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2acbaf9b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Debian, Ubuntu\n",
    "!sudo apt install libmariadb3 libmariadb-dev\n",
    "\n",
    "# CentOS, RHEL, Rocky Linux\n",
    "!sudo yum install MariaDB-shared MariaDB-devel\n",
    "\n",
    "# Install Python connector\n",
    "!pip install -U mariadb"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0dd87fcc",
   "metadata": {},
   "source": [
    "Then install the `langchain-mariadb` package:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2acbaf9b",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install -U langchain-mariadb\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0dd87fca",
   "metadata": {},
   "source": [
    "The vector store works together with an embedding model; this example uses `langchain-openai`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2acbaf9b",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install langchain-openai\n",
    "%env OPENAI_API_KEY=...\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0dd87fcc",
   "metadata": {},
   "source": [
    "## Initialization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "94f5c129",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_core.documents import Document\n",
    "from langchain_mariadb import MariaDBStore\n",
    "from langchain_openai import OpenAIEmbeddings\n",
    "\n",
    "# Connection string; replace myuser and mypassword with your own credentials\n",
    "# (the Docker container above is created with user root and password langchain)\n",
    "url = \"mariadb+mariadbconnector://myuser:mypassword@localhost/langchain\"\n",
    "\n",
    "# Initialize vector store\n",
    "vectorstore = MariaDBStore(\n",
    "    embeddings=OpenAIEmbeddings(),\n",
    "    embedding_length=1536,\n",
    "    datasource=url,\n",
    "    collection_name=\"my_docs\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "61a224a1-d70b-4daf-86ba-ab6e43c08b50",
   "metadata": {},
   "source": [
    "## Manage vector store\n",
    "\n",
    "### Adding Data\n",
    "You can add data as documents with metadata:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "94f5d129",
   "metadata": {},
   "outputs": [],
   "source": [
    "docs = [\n",
    "    Document(\n",
    "        page_content=\"there are cats in the pond\",\n",
    "        metadata={\"id\": 1, \"location\": \"pond\", \"topic\": \"animals\"},\n",
    "    ),\n",
    "    Document(\n",
    "        page_content=\"ducks are also found in the pond\",\n",
    "        metadata={\"id\": 2, \"location\": \"pond\", \"topic\": \"animals\"},\n",
    "    ),\n",
    "    # More documents...\n",
    "]\n",
    "vectorstore.add_documents(docs)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0c712fa3",
   "metadata": {},
   "source": [
    "\n",
    "Or as plain text with optional metadata:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "a5b2b71f-49eb-407d-b03a-dea4c0a517d6",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "texts = [\n",
    "    \"a sculpture exhibit is also at the museum\",\n",
    "    \"a new coffee shop opened on Main Street\",\n",
    "]\n",
    "metadatas = [\n",
    "    {\"id\": 6, \"location\": \"museum\", \"topic\": \"art\"},\n",
    "    {\"id\": 7, \"location\": \"Main Street\", \"topic\": \"food\"},\n",
    "]\n",
    "\n",
    "vectorstore.add_texts(texts=texts, metadatas=metadatas)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "59f82250-7903-4279-8300-062542c83416",
   "metadata": {},
   "source": [
    "## Query vector store"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "f15a2359-6dc3-4099-8214-785f167a9ca4",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Basic similarity search\n",
    "results = vectorstore.similarity_search(\"Hello\", k=2)\n",
    "\n",
    "# Search with metadata filtering\n",
    "results = vectorstore.similarity_search(\"Hello\", filter={\"category\": \"greeting\"})"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7ecd77a0",
   "metadata": {},
   "source": [
    "\n",
    "### Filter Options\n",
    "\n",
    "The system supports various filtering operations on metadata:\n",
    "\n",
    "* Equality: `$eq`\n",
    "* Inequality: `$ne`\n",
    "* Comparisons: `$lt`, `$lte`, `$gt`, `$gte`\n",
    "* List operations: `$in`, `$nin`\n",
    "* Text matching: `$like`, `$nlike`\n",
    "* Logical operations: `$and`, `$or`, `$not`\n",
    "\n",
    "Example:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "f15a2359-6dc3-4099-8214-785f167a9cb4",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Search with simple filter\n",
    "results = vectorstore.similarity_search(\n",
    "    \"kitty\", k=10, filter={\"id\": {\"$in\": [1, 5, 2, 9]}}\n",
    ")\n",
    "\n",
    "# Search with multiple conditions (AND)\n",
    "results = vectorstore.similarity_search(\n",
    "    \"ducks\",\n",
    "    k=10,\n",
    "    filter={\"id\": {\"$in\": [1, 5, 2, 9]}, \"location\": {\"$in\": [\"pond\", \"market\"]}},\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "90a65b31",
   "metadata": {},
   "source": [
    "## Usage for retrieval-augmented generation\n",
    "\n",
    "TODO: document example"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f08d3a3c",
   "metadata": {},
   "source": [
    "## API reference\n",
    "\n",
    "See the repo [here](https://github.com/mariadb-corporation/langchain-mariadb) for more detail."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
@ -576,3 +576,6 @@ packages:
  path: .
  name_title: RunPod
  provider_page: runpod
- name: langchain-mariadb
  path: .
  repo: mariadb-corporation/langchain-mariadb