MariaDB vector store documentation addition (#30229)

### New Feature

Since version 11.7.1, MariaDB supports vectors. The implementation is
fast (see [this performance
evaluation](https://smalldatum.blogspot.com/2025/01/evaluating-vector-indexes-in-mariadb.html)).
The goal is to support MariaDB in LangChain.

The implementation lives in
https://github.com/mariadb-corporation/langchain-mariadb and is published at
https://pypi.org/project/langchain-mariadb/

This PR adds the corresponding documentation.
 

(initial PR https://github.com/langchain-ai/langchain/pull/29989)

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Oskar Stark <oskarstark@googlemail.com>
diego dupin 2025-04-04 16:56:25 +02:00 committed by GitHub
parent 1cdea6ab07
commit aa37893c00
3 changed files with 342 additions and 0 deletions


@ -0,0 +1,41 @@
# MariaDB
This page covers how to use the [MariaDB](https://github.com/mariadb/) ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific MariaDB wrappers.
## Installation
- Install the MariaDB Connector/C libraries
on Debian, Ubuntu:
```bash
sudo apt install libmariadb3 libmariadb-dev
```
on CentOS, RHEL, Rocky Linux:
```bash
sudo yum install MariaDB-shared MariaDB-devel
```
- Install the Python connector package with `pip install mariadb`
## Setup
1. The first step is to have MariaDB 11.7.1 or later installed.
The Docker image is the easiest way to get started.
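
To confirm that a server meets the 11.7.1 minimum, you can compare its reported version against that threshold. The sketch below is only an illustration: it assumes the version string has the form returned by `SELECT VERSION()` (for example `11.7.2-MariaDB-ubu2404`), and the helper names `parse_mariadb_version` and `supports_vectors` are hypothetical, not part of any MariaDB or LangChain package.

```python
import re

# Minimum server version with vector support, as stated above.
MIN_VECTOR_VERSION = (11, 7, 1)


def parse_mariadb_version(version_string):
    """Extract (major, minor, patch) from a string such as '11.7.2-MariaDB'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version_string!r}")
    return tuple(int(part) for part in match.groups())


def supports_vectors(version_string):
    """Return True if the reported version is 11.7.1 or later."""
    return parse_mariadb_version(version_string) >= MIN_VECTOR_VERSION
```

For example, `supports_vectors("10.11.6-MariaDB")` evaluates to `False`, because tuples compare element by element and 10 is less than 11.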
## Wrappers
### VectorStore
A wrapper around MariaDB's vector support lets you use MariaDB as a vector store,
whether for semantic search or example selection.
To import this vectorstore:
```python
from langchain_mariadb import MariaDBStore
```
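
Constructing the store requires a SQLAlchemy-style connection URL. The following is a minimal sketch, assuming the `mariadb+mariadbconnector://` scheme and the `MariaDBStore` arguments used in the linked notebook. The helper names `build_mariadb_url` and `create_store`, and their default values, are hypothetical and chosen only for illustration. Percent-encoding the credentials keeps characters such as `@` or `/` in a password from corrupting the URL.

```python
from urllib.parse import quote_plus


def build_mariadb_url(user, password, host="localhost", port=3306, database="langchain"):
    """Build a connection URL, percent-encoding the user name and password."""
    return (
        "mariadb+mariadbconnector://"
        f"{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{database}"
    )


def create_store(url, collection_name="my_docs"):
    """Create a MariaDBStore; needs langchain-mariadb and langchain-openai installed."""
    # Imported lazily so build_mariadb_url stays usable without these packages.
    from langchain_mariadb import MariaDBStore
    from langchain_openai import OpenAIEmbeddings

    return MariaDBStore(
        embeddings=OpenAIEmbeddings(),
        embedding_length=1536,  # value used in the notebook; must match the embedding model
        datasource=url,
        collection_name=collection_name,
    )
```

With a running server and an OpenAI API key configured, `create_store(build_mariadb_url("myuser", "mypassword"))` would return a store ready for `add_documents` and `similarity_search`.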
### Usage
For a more detailed walkthrough of the MariaDB wrapper, see [this notebook](/docs/integrations/vectorstores/mariadb)


@ -0,0 +1,298 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "7679dd7b-7ed4-4755-a499-824deadba708",
"metadata": {},
"source": [
"# MariaDB\n",
"\n",
"LangChain's MariaDB integration (langchain-mariadb), distributed under the MIT license, provides vector capabilities for working with MariaDB version 11.7.1 and above. Users can use the provided implementations as-is or customize them for specific needs.\n",
"\n",
"Key features include:\n",
"\n",
" * Built-in vector similarity search\n",
" * Support for cosine and euclidean distance metrics\n",
" * Robust metadata filtering options\n",
" * Performance optimization through connection pooling\n",
" * Configurable table and column settings\n",
"\n",
"## Setup\n",
"\n",
"Launch a MariaDB Docker container with:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "92df32f0",
"metadata": {},
"outputs": [],
"source": [
"!docker run --name mariadb-container -e MARIADB_ROOT_PASSWORD=langchain -e MARIADB_DATABASE=langchain -p 3306:3306 -d mariadb:11.7"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Installing the Package\n",
"\n",
"The package uses SQLAlchemy but works best with the MariaDB Python connector, which requires the MariaDB Connector/C libraries:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2acbaf9b",
"metadata": {},
"outputs": [],
"source": [
"# Run only the command that matches your distribution.\n",
"# Debian, Ubuntu\n",
"!sudo apt install libmariadb3 libmariadb-dev\n",
"\n",
"# CentOS, RHEL, Rocky Linux\n",
"!sudo yum install MariaDB-shared MariaDB-devel\n",
"\n",
"# Install Python connector\n",
"!pip install -U mariadb"
]
},
{
"cell_type": "markdown",
"id": "0dd87fcc",
"metadata": {},
"source": [
"Then install the `langchain-mariadb` package:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2acbaf9b",
"metadata": {},
"outputs": [],
"source": [
"%pip install -U langchain-mariadb\n"
]
},
{
"cell_type": "markdown",
"id": "0dd87fca",
"metadata": {},
"source": [
"The vector store works together with an embedding model; this example uses `langchain-openai`.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2acbaf9b",
"metadata": {},
"outputs": [],
"source": [
"%pip install langchain-openai\n",
"%env OPENAI_API_KEY=...\n"
]
},
{
"cell_type": "markdown",
"id": "0dd87fcc",
"metadata": {},
"source": [
"## Initialization"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "94f5c129",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.documents import Document\n",
"from langchain_mariadb import MariaDBStore\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"# Connection string: replace the placeholder credentials with your own\n",
"# (the Docker container started above uses root / langchain)\n",
"url = \"mariadb+mariadbconnector://myuser:mypassword@localhost/langchain\"\n",
"\n",
"# Initialize vector store\n",
"vectorstore = MariaDBStore(\n",
" embeddings=OpenAIEmbeddings(),\n",
" embedding_length=1536,\n",
" datasource=url,\n",
" collection_name=\"my_docs\",\n",
")"
]
},
{
"cell_type": "markdown",
"id": "61a224a1-d70b-4daf-86ba-ab6e43c08b50",
"metadata": {},
"source": [
"## Manage vector store\n",
"\n",
"### Adding Data\n",
"You can add data as documents with metadata:\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "94f5d129",
"metadata": {},
"outputs": [],
"source": [
"docs = [\n",
" Document(\n",
" page_content=\"there are cats in the pond\",\n",
" metadata={\"id\": 1, \"location\": \"pond\", \"topic\": \"animals\"},\n",
" ),\n",
" Document(\n",
" page_content=\"ducks are also found in the pond\",\n",
" metadata={\"id\": 2, \"location\": \"pond\", \"topic\": \"animals\"},\n",
" ),\n",
" # More documents...\n",
"]\n",
"vectorstore.add_documents(docs)"
]
},
{
"cell_type": "markdown",
"id": "0c712fa3",
"metadata": {},
"source": [
"\n",
"Or as plain text with optional metadata:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "a5b2b71f-49eb-407d-b03a-dea4c0a517d6",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"texts = [\n",
" \"a sculpture exhibit is also at the museum\",\n",
" \"a new coffee shop opened on Main Street\",\n",
"]\n",
"metadatas = [\n",
" {\"id\": 6, \"location\": \"museum\", \"topic\": \"art\"},\n",
" {\"id\": 7, \"location\": \"Main Street\", \"topic\": \"food\"},\n",
"]\n",
"\n",
"vectorstore.add_texts(texts=texts, metadatas=metadatas)"
]
},
{
"cell_type": "markdown",
"id": "59f82250-7903-4279-8300-062542c83416",
"metadata": {},
"source": [
"## Query vector store"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "f15a2359-6dc3-4099-8214-785f167a9ca4",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Basic similarity search\n",
"results = vectorstore.similarity_search(\"Hello\", k=2)\n",
"\n",
"# Search with metadata filtering\n",
"results = vectorstore.similarity_search(\"Hello\", filter={\"category\": \"greeting\"})"
]
},
{
"cell_type": "markdown",
"id": "7ecd77a0",
"metadata": {},
"source": [
"\n",
"### Filter Options\n",
"\n",
"The system supports various filtering operations on metadata:\n",
"\n",
"* Equality: $eq\n",
"* Inequality: $ne\n",
"* Comparisons: $lt, $lte, $gt, $gte\n",
"* List operations: $in, $nin\n",
"* Text matching: $like, $nlike\n",
"* Logical operations: $and, $or, $not\n",
"\n",
"Example:\n"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "f15a2359-6dc3-4099-8214-785f167a9cb4",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Search with simple filter\n",
"results = vectorstore.similarity_search(\n",
" \"kitty\", k=10, filter={\"id\": {\"$in\": [1, 5, 2, 9]}}\n",
")\n",
"\n",
"# Search with multiple conditions (AND)\n",
"results = vectorstore.similarity_search(\n",
" \"ducks\",\n",
" k=10,\n",
" filter={\"id\": {\"$in\": [1, 5, 2, 9]}, \"location\": {\"$in\": [\"pond\", \"market\"]}},\n",
")"
]
},
{
"cell_type": "markdown",
"id": "90a65b31",
"metadata": {},
"source": [
"## Usage for retrieval-augmented generation\n",
"\n",
"TODO: document example"
]
},
{
"cell_type": "markdown",
"id": "f08d3a3c",
"metadata": {},
"source": [
"## API reference\n",
"\n",
"See the repo [here](https://github.com/mariadb-corporation/langchain-mariadb) for more detail."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}


@ -576,3 +576,6 @@ packages:
path: .
name_title: RunPod
provider_page: runpod
- name: langchain-mariadb
path: .
repo: mariadb-corporation/langchain-mariadb