mirror of
https://github.com/hwchase17/langchain.git
synced 2025-06-25 16:13:25 +00:00
MariaDB vector store documentation addition (#30229)
### New Feature Since version 11.7.1, MariaDB support vector. This is a super fast implementation (see [some perf blog](https://smalldatum.blogspot.com/2025/01/evaluating-vector-indexes-in-mariadb.html) The goal is to support MariaDB with langchain Implementation is done in https://github.com/mariadb-corporation/langchain-mariadb, published in https://pypi.org/project/langchain-mariadb/ This concerns the doc addition (initial PR https://github.com/langchain-ai/langchain/pull/29989) --------- Co-authored-by: ccurme <chester.curme@gmail.com> Co-authored-by: Oskar Stark <oskarstark@googlemail.com>
This commit is contained in:
parent
1cdea6ab07
commit
aa37893c00
41
docs/docs/integrations/providers/mariadb.mdx
Normal file
41
docs/docs/integrations/providers/mariadb.mdx
Normal file
@ -0,0 +1,41 @@
|
||||
# MariaDB
|
||||
|
||||
This page covers how to use the [MariaDB](https://github.com/mariadb/) ecosystem within LangChain.
|
||||
It is broken into two parts: installation and setup, and then references to specific PGVector wrappers.
|
||||
|
||||
## Installation
|
||||
- Install c/c connector
|
||||
|
||||
on Debian, Ubuntu
|
||||
```bash
|
||||
sudo apt install libmariadb3 libmariadb-dev
|
||||
```
|
||||
|
||||
on CentOS, RHEL, Rocky Linux
|
||||
```bash
|
||||
sudo yum install MariaDB-shared MariaDB-devel
|
||||
```
|
||||
|
||||
- Install the Python connector package with `pip install mariadb`
|
||||
|
||||
|
||||
## Setup
|
||||
1. The first step is to have a MariaDB 11.7.1 or later installed.
|
||||
|
||||
The docker image is the easiest way to get started.
|
||||
|
||||
## Wrappers
|
||||
|
||||
### VectorStore
|
||||
|
||||
There exists a wrapper around MariaDB vector databases, allowing you to use it as a vectorstore,
|
||||
whether for semantic search or example selection.
|
||||
|
||||
To import this vectorstore:
|
||||
```python
|
||||
from langchain_mariadb import MariaDBStore
|
||||
```
|
||||
|
||||
### Usage
|
||||
|
||||
For a more detailed walkthrough of the MariaDB wrapper, see [this notebook](/docs/integrations/vectorstores/mariadb)
|
298
docs/docs/integrations/vectorstores/mariadb.ipynb
Normal file
298
docs/docs/integrations/vectorstores/mariadb.ipynb
Normal file
@ -0,0 +1,298 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "7679dd7b-7ed4-4755-a499-824deadba708",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# MariaDB\n",
|
||||
"\n",
|
||||
"LangChain's MariaDB integration (langchain-mariadb) provides vector capabilities for working with MariaDB version 11.7.1 and above, distributed under the MIT license. Users can use the provided implementations as-is or customize them for specific needs.\n",
|
||||
" Key features include:\n",
|
||||
"\n",
|
||||
" * Built-in vector similarity search\n",
|
||||
" * Support for cosine and euclidean distance metrics\n",
|
||||
" * Robust metadata filtering options\n",
|
||||
" * Performance optimization through connection pooling\n",
|
||||
" * Configurable table and column settings\n",
|
||||
"\n",
|
||||
"## Setup\n",
|
||||
"\n",
|
||||
"Launch a MariaDB Docker container with:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "92df32f0",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"!docker run --name mariadb-container -e MARIADB_ROOT_PASSWORD=langchain -e MARIADB_DATABASE=langchain -p 3306:3306 -d mariadb:11.7"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Installing the Package\n",
|
||||
"\n",
|
||||
"The package uses SQLAlchemy but works best with the MariaDB connector, which requires C/C++ components:\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "2acbaf9b",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Debian, Ubuntu\n",
|
||||
"!sudo apt install libmariadb3 libmariadb-dev\n",
|
||||
"\n",
|
||||
"# CentOS, RHEL, Rocky Linux\n",
|
||||
"!sudo yum install MariaDB-shared MariaDB-devel\n",
|
||||
"\n",
|
||||
"# Install Python connector\n",
|
||||
"!pip install -U mariadb"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "0dd87fcc",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Then install `langchain-mariadb` package\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "2acbaf9b",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"pip install -U langchain-mariadb\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "0dd87fca",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"VectorStore works along with an LLM model, here using `langchain-openai` as example.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "2acbaf9b",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"pip install langchain-openai\n",
|
||||
"export OPENAI_API_KEY=...\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "0dd87fcc",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Initialization"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "94f5c129",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_core.documents import Document\n",
|
||||
"from langchain_mariadb import MariaDBStore\n",
|
||||
"from langchain_openai import OpenAIEmbeddings\n",
|
||||
"\n",
|
||||
"# connection string\n",
|
||||
"url = f\"mariadb+mariadbconnector://myuser:mypassword@localhost/langchain\"\n",
|
||||
"\n",
|
||||
"# Initialize vector store\n",
|
||||
"vectorstore = MariaDBStore(\n",
|
||||
" embeddings=OpenAIEmbeddings(),\n",
|
||||
" embedding_length=1536,\n",
|
||||
" datasource=url,\n",
|
||||
" collection_name=\"my_docs\",\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "61a224a1-d70b-4daf-86ba-ab6e43c08b50",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Manage vector store\n",
|
||||
"\n",
|
||||
"### Adding Data\n",
|
||||
"You can add data as documents with metadata:\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "94f5d129",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"docs = [\n",
|
||||
" Document(\n",
|
||||
" page_content=\"there are cats in the pond\",\n",
|
||||
" metadata={\"id\": 1, \"location\": \"pond\", \"topic\": \"animals\"},\n",
|
||||
" ),\n",
|
||||
" Document(\n",
|
||||
" page_content=\"ducks are also found in the pond\",\n",
|
||||
" metadata={\"id\": 2, \"location\": \"pond\", \"topic\": \"animals\"},\n",
|
||||
" ),\n",
|
||||
" # More documents...\n",
|
||||
"]\n",
|
||||
"vectorstore.add_documents(docs)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "0c712fa3",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"\n",
|
||||
"Or as plain text with optional metadata:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 14,
|
||||
"id": "a5b2b71f-49eb-407d-b03a-dea4c0a517d6",
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"texts = [\n",
|
||||
" \"a sculpture exhibit is also at the museum\",\n",
|
||||
" \"a new coffee shop opened on Main Street\",\n",
|
||||
"]\n",
|
||||
"metadatas = [\n",
|
||||
" {\"id\": 6, \"location\": \"museum\", \"topic\": \"art\"},\n",
|
||||
" {\"id\": 7, \"location\": \"Main Street\", \"topic\": \"food\"},\n",
|
||||
"]\n",
|
||||
"\n",
|
||||
"vectorstore.add_texts(texts=texts, metadatas=metadatas)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "59f82250-7903-4279-8300-062542c83416",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Query vector store"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 15,
|
||||
"id": "f15a2359-6dc3-4099-8214-785f167a9ca4",
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Basic similarity search\n",
|
||||
"results = vectorstore.similarity_search(\"Hello\", k=2)\n",
|
||||
"\n",
|
||||
"# Search with metadata filtering\n",
|
||||
"results = vectorstore.similarity_search(\"Hello\", filter={\"category\": \"greeting\"})"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "7ecd77a0",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"\n",
|
||||
"### Filter Options\n",
|
||||
"\n",
|
||||
"The system supports various filtering operations on metadata:\n",
|
||||
"\n",
|
||||
"* Equality: $eq\n",
|
||||
"* Inequality: $ne\n",
|
||||
"* Comparisons: $lt, $lte, $gt, $gte\n",
|
||||
"* List operations: $in, $nin\n",
|
||||
"* Text matching: $like, $nlike\n",
|
||||
"* Logical operations: $and, $or, $not\n",
|
||||
"\n",
|
||||
"Example:\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 15,
|
||||
"id": "f15a2359-6dc3-4099-8214-785f167a9cb4",
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Search with simple filter\n",
|
||||
"results = vectorstore.similarity_search(\n",
|
||||
" \"kitty\", k=10, filter={\"id\": {\"$in\": [1, 5, 2, 9]}}\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"# Search with multiple conditions (AND)\n",
|
||||
"results = vectorstore.similarity_search(\n",
|
||||
" \"ducks\",\n",
|
||||
" k=10,\n",
|
||||
" filter={\"id\": {\"$in\": [1, 5, 2, 9]}, \"location\": {\"$in\": [\"pond\", \"market\"]}},\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "90a65b31",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Usage for retrieval-augmented generation\n",
|
||||
"\n",
|
||||
"TODO: document example"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "f08d3a3c",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## API reference\n",
|
||||
"\n",
|
||||
"See the repo [here](https://github.com/mariadb-corporation/langchain-mariadb) for more detail."
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.9"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
@ -576,3 +576,6 @@ packages:
|
||||
path: .
|
||||
name_title: RunPod
|
||||
provider_page: runpod
|
||||
- name: langchain-mariadb
|
||||
path: .
|
||||
repo: mariadb-corporation/langchain-mariadb
|
||||
|
Loading…
Reference in New Issue
Block a user