Files
langchain/docs/docs/integrations/vectorstores/zeusdb.ipynb
doubleinfinity b944bbc766 docs: add ZeusDB vector store integration (#32822)
## Description

This PR adds documentation for the new ZeusDB vector store integration
with LangChain.

## Motivation

ZeusDB is a high-performance vector database (Python/Rust backend)
designed for AI applications that need fast similarity search and
real-time vector operations. This integration brings ZeusDB's capabilities to
the LangChain ecosystem, giving developers another production-oriented
option for vector storage and retrieval.

**Key Features:**
- **User-Friendly Python API**: Intuitive interface that integrates
seamlessly with Python ML workflows
- **High Performance**: Powered by a robust Rust backend for
lightning-fast vector operations
- **Enterprise Logging**: Comprehensive logging capabilities for
monitoring and debugging production systems
- **Advanced Features**: Includes product quantization and persistence
capabilities
- **AI-Optimized**: Purpose-built for modern AI applications and RAG
pipelines

## Changes

- Added provider documentation:
`docs/docs/integrations/providers/zeusdb.mdx` (installation, setup).

- Added vector store documentation:
`docs/docs/integrations/vectorstores/zeusdb.ipynb` (quickstart for
creating/querying a ZeusDBVectorStore).

- Registered langchain-zeusdb in `libs/packages.yml` for discovery.

## Target users

- AI/ML engineers building RAG pipelines

- Data scientists working with large document collections

- Developers needing high-throughput vector search

- Teams requiring near real-time vector operations

## Testing

- Followed LangChain's "How to add standard tests to an integration"
guidance.
- Code passes format, lint, and test checks locally.
- Tested with LangChain Core 0.3.74.
- Works with Python 3.10 to 3.13.

## Package Information
**PyPI:** https://pypi.org/project/langchain-zeusdb
**GitHub:** https://github.com/ZeusDB/langchain-zeusdb
2025-09-15 09:55:14 -04:00

{
"cells": [
{
"cell_type": "markdown",
"id": "ef1f0986",
"metadata": {},
"source": [
"# ⚡ ZeusDB Vector Store\n",
"\n",
"ZeusDB is a high-performance, Rust-powered vector database with enterprise features like quantization, persistence and logging.\n",
"\n",
"This notebook covers how to get started with the ZeusDB Vector Store to efficiently use ZeusDB with LangChain."
]
},
{
"cell_type": "markdown",
"id": "107c485d-13a3-4309-9fda-5a0440862d3c",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"id": "36fdc060",
"metadata": {},
"source": [
"## Setup"
]
},
{
"cell_type": "markdown",
"id": "d978e3fd-d130-436f-841d-d133c0fae8fb",
"metadata": {},
"source": [
"Install the ZeusDB LangChain integration package from PyPi:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "42ca8320-b866-4f37-944e-96eda54231d2",
"metadata": {},
"outputs": [],
"source": [
"pip install -qU langchain-zeusdb"
]
},
{
"cell_type": "markdown",
"id": "2a0e518a-ae8a-464b-8b47-9deb9d4ab063",
"metadata": {},
"source": [
"*Setup in Jupyter Notebooks*"
]
},
{
"cell_type": "markdown",
"id": "1d092ea6-8553-4686-9563-b8318225a04a",
"metadata": {},
"source": [
"> 💡 Tip: If youre working inside Jupyter or Google Colab, use the %pip magic command so the package is installed into the active kernel:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "64e28aa6",
"metadata": {},
"outputs": [],
"source": [
"%pip install -qU langchain-zeusdb"
]
},
{
"cell_type": "markdown",
"id": "c12fe175-a299-47d3-869f-9367b6aa572d",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"id": "31554e69-40b2-4201-9f92-57e73ac66d33",
"metadata": {},
"source": [
"## Getting Started"
]
},
{
"cell_type": "markdown",
"id": "b696b3dd-0fed-4ed2-a79a-5b32598508c0",
"metadata": {},
"source": [
"This example uses OpenAIEmbeddings, which requires an OpenAI API key [Get your OpenAI API key here](https://platform.openai.com/api-keys)"
]
},
{
"cell_type": "markdown",
"id": "2b79766e-7725-4be0-a183-4947b56892c5",
"metadata": {},
"source": [
"If you prefer, you can also use this package with any other embedding provider (Hugging Face, Cohere, custom functions, etc.)."
]
},
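{
"cell_type": "markdown",
"id": "alt-embeddings-note",
"metadata": {},
"source": [
"For example, here is a minimal sketch of building an alternative embeddings object with `langchain-huggingface` (shown only as an illustration; any LangChain embeddings implementation works). Pass it as `embedding=` during initialization below and set the index `dim` to the model's output size (384 for `all-MiniLM-L6-v2`). The rest of this notebook uses OpenAI."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "alt-embeddings-example",
"metadata": {},
"outputs": [],
"source": [
"# Sketch only: assumes the langchain-huggingface package is installed\n",
"# (pip install -qU langchain-huggingface)\n",
"from langchain_huggingface import HuggingFaceEmbeddings\n",
"\n",
"# all-MiniLM-L6-v2 produces 384-dimensional vectors, so create the\n",
"# ZeusDB index with dim=384 when using this model\n",
"hf_embeddings = HuggingFaceEmbeddings(\n",
"    model_name=\"sentence-transformers/all-MiniLM-L6-v2\"\n",
")"
]
},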
{
"cell_type": "markdown",
"id": "b5266cc7-28da-459e-a28d-128382ed5a20",
"metadata": {},
"source": [
"Install the LangChain OpenAI integration package from PyPi:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1ed941cd-5e06-4c61-9235-90bd0b0b0452",
"metadata": {},
"outputs": [],
"source": [
"pip install -qU langchain-openai\n",
"\n",
"# Use this command if inside Jupyter Notebooks\n",
"#%pip install -qU langchain-openai"
]
},
{
"cell_type": "markdown",
"id": "0f49b2ec-d047-455d-8c05-da041112dd8a",
"metadata": {},
"source": [
"#### Please choose an option below for your OpenAI key integration"
]
},
{
"cell_type": "markdown",
"id": "ed2d9bf6-be53-4fc1-9611-158f03fd71b7",
"metadata": {},
"source": [
"*Option 1: 🔑 Enter your API key each time* "
]
},
{
"cell_type": "markdown",
"id": "eff5b6a5-4c57-4531-896e-54bcb2b1dec2",
"metadata": {},
"source": [
"Use getpass in Jupyter to securely input your key for the current session:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "08a50da9-5ed1-40dc-a390-07b031369761",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")"
]
},
{
"cell_type": "markdown",
"id": "7321917e-8586-42e4-9822-b68cfd74f233",
"metadata": {},
"source": [
"*Option 2: 🗂️ Use a .env file*"
]
},
{
"cell_type": "markdown",
"id": "b9297b6b-bd7e-457f-95af-5b41c7ab9b41",
"metadata": {},
"source": [
"Keep your key in a local .env file and load it automatically with python-dotenv"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "85a139dc-f439-4e4e-bc46-76d9478c304d",
"metadata": {},
"outputs": [],
"source": [
"from dotenv import load_dotenv\n",
"\n",
"load_dotenv() # reads .env and sets OPENAI_API_KEY"
]
},
{
"cell_type": "markdown",
"id": "1af364e3-df59-4963-aaaa-0e83f6ec5e32",
"metadata": {},
"source": [
"🎉🎉 That's it! You are good to go."
]
},
{
"cell_type": "markdown",
"id": "3146180e-026e-4421-a490-ffd14ceabac3",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"id": "93df377e",
"metadata": {},
"source": [
"## Initialization"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fb55dfe8-2c98-45b6-ba90-7a3667ceee0c",
"metadata": {},
"outputs": [],
"source": [
"# Import required Packages and Classes\n",
"from langchain_zeusdb import ZeusDBVectorStore\n",
"from langchain_openai import OpenAIEmbeddings\n",
"from zeusdb import VectorDatabase"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dc37144c-208d-4ab3-9f3a-0407a69fe052",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Initialize embeddings\n",
"embeddings = OpenAIEmbeddings(model=\"text-embedding-3-small\")\n",
"\n",
"# Create ZeusDB index\n",
"vdb = VectorDatabase()\n",
"index = vdb.create(index_type=\"hnsw\", dim=1536, space=\"cosine\")\n",
"\n",
"# Create vector store\n",
"vector_store = ZeusDBVectorStore(zeusdb_index=index, embedding=embeddings)"
]
},
{
"cell_type": "markdown",
"id": "f45fa43c-8b54-4a75-b7b0-92ac0ac506c6",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"id": "ac6071d4",
"metadata": {},
"source": [
"## Manage vector store"
]
},
{
"cell_type": "markdown",
"id": "edf53787-ebda-4306-afc3-f7d440dcb1ff",
"metadata": {},
"source": [
"### 2.1 Add items to vector store"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "17f5efc0",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.documents import Document\n",
"\n",
"document_1 = Document(\n",
" page_content=\"ZeusDB is a high-performance vector database\",\n",
" metadata={\"source\": \"https://docs.zeusdb.com\"},\n",
")\n",
"\n",
"document_2 = Document(\n",
" page_content=\"Product Quantization reduces memory usage significantly\",\n",
" metadata={\"source\": \"https://docs.zeusdb.com\"},\n",
")\n",
"\n",
"document_3 = Document(\n",
" page_content=\"ZeusDB integrates seamlessly with LangChain\",\n",
" metadata={\"source\": \"https://docs.zeusdb.com\"},\n",
")\n",
"\n",
"documents = [document_1, document_2, document_3]\n",
"\n",
"vector_store.add_documents(documents=documents, ids=[\"1\", \"2\", \"3\"])"
]
},
{
"cell_type": "markdown",
"id": "c738c3e0",
"metadata": {},
"source": [
"### 2.2 Update items in vector store"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0aa8b71",
"metadata": {},
"outputs": [],
"source": [
"updated_document = Document(\n",
" page_content=\"ZeusDB now supports advanced Product Quantization with 4x-256x compression\",\n",
" metadata={\"source\": \"https://docs.zeusdb.com\", \"updated\": True},\n",
")\n",
"\n",
"vector_store.add_documents([updated_document], ids=[\"1\"])"
]
},
{
"cell_type": "markdown",
"id": "dcf1b905",
"metadata": {},
"source": [
"### 2.3 Delete items from vector store"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ef61e188",
"metadata": {},
"outputs": [],
"source": [
"vector_store.delete(ids=[\"3\"])"
]
},
{
"cell_type": "markdown",
"id": "1a0091af-777d-4651-888a-3b346d7990f5",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"id": "c3620501",
"metadata": {},
"source": [
"## Query vector store"
]
},
{
"cell_type": "markdown",
"id": "4ba3fdb2-b7d6-4f0f-b8c9-91f63596018b",
"metadata": {},
"source": [
"### 3.1 Query directly"
]
},
{
"cell_type": "markdown",
"id": "400a9b25-9587-4116-ab59-6888602ec2b1",
"metadata": {},
"source": [
"Performing a simple similarity search:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aa0a16fa",
"metadata": {},
"outputs": [],
"source": [
"results = vector_store.similarity_search(query=\"high performance database\", k=2)\n",
"\n",
"for doc in results:\n",
" print(f\"* {doc.page_content} [{doc.metadata}]\")"
]
},
{
"cell_type": "markdown",
"id": "3ed9d733",
"metadata": {},
"source": [
"If you want to execute a similarity search and receive the corresponding scores:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5efd2eaa",
"metadata": {},
"outputs": [],
"source": [
"results = vector_store.similarity_search_with_score(query=\"memory optimization\", k=2)\n",
"\n",
"for doc, score in results:\n",
" print(f\"* [SIM={score:.3f}] {doc.page_content} [{doc.metadata}]\")"
]
},
{
"cell_type": "markdown",
"id": "0c235cdc",
"metadata": {},
"source": [
"### 3.2 Query by turning into retriever"
]
},
{
"cell_type": "markdown",
"id": "59292cb5-5dc8-4158-9137-89d0f6ca711d",
"metadata": {},
"source": [
"You can also transform the vector store into a retriever for easier usage in your chains:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f3460093",
"metadata": {},
"outputs": [],
"source": [
"retriever = vector_store.as_retriever(search_type=\"mmr\", search_kwargs={\"k\": 2})\n",
"\n",
"retriever.invoke(\"vector database features\")"
]
},
{
"cell_type": "markdown",
"id": "cc2d2b63-99d8-45c4-85e6-6a9409551ada",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"id": "persistence_section",
"metadata": {},
"source": [
"## ZeusDB-Specific Features"
]
},
{
"cell_type": "markdown",
"id": "memory_section",
"metadata": {},
"source": [
"### 4.1 Memory-Efficient Setup with Product Quantization"
]
},
{
"cell_type": "markdown",
"id": "12832d02-d9ea-4c35-a20f-05c85d1d7723",
"metadata": {},
"source": [
"For large datasets, use Product Quantization to reduce memory usage:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "quantization_example",
"metadata": {},
"outputs": [],
"source": [
"# Create memory-optimized vector store\n",
"quantization_config = {\"type\": \"pq\", \"subvectors\": 8, \"bits\": 8, \"training_size\": 10000}\n",
"\n",
"vdb_quantized = VectorDatabase()\n",
"quantized_index = vdb_quantized.create(\n",
" index_type=\"hnsw\", dim=1536, quantization_config=quantization_config\n",
")\n",
"\n",
"quantized_vector_store = ZeusDBVectorStore(\n",
" zeusdb_index=quantized_index, embedding=embeddings\n",
")\n",
"\n",
"print(f\"Created quantized store: {quantized_index.info()}\")"
]
},
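{
"cell_type": "markdown",
"id": "quantization_usage_note",
"metadata": {},
"source": [
"The quantized store exposes the same API as the standard one. As a quick sketch, reusing the `documents` list from section 2.1:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "quantization_usage_example",
"metadata": {},
"outputs": [],
"source": [
"# Add the sample documents to the quantized store and run a search\n",
"quantized_vector_store.add_documents(documents)\n",
"\n",
"results = quantized_vector_store.similarity_search(\"memory usage\", k=1)\n",
"for doc in results:\n",
"    print(f\"* {doc.page_content} [{doc.metadata}]\")"
]
},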
{
"cell_type": "markdown",
"id": "6ffe0613-b2a7-484e-9219-1166b65c49c5",
"metadata": {},
"source": [
"### 4.2 Persistence"
]
},
{
"cell_type": "markdown",
"id": "fbc323ee-4c6c-43fc-beba-675d820ca078",
"metadata": {},
"source": [
"Save and load your vector store to disk:"
]
},
{
"cell_type": "markdown",
"id": "834354d1-55ad-48fe-84e1-a5eacff3f6bb",
"metadata": {},
"source": [
"How to Save your vector store"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f9d1332b-a7ac-4a4b-a060-f2061599d3f1",
"metadata": {},
"outputs": [],
"source": [
"# Save the vector store\n",
"vector_store.save_index(\"my_zeusdb_index.zdb\")"
]
},
{
"cell_type": "markdown",
"id": "23370621-5b51-4313-800f-3a2fb9de52d2",
"metadata": {},
"source": [
"How to Load your vector store"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f9ed5778-58e4-4724-b69d-3c7b48cda429",
"metadata": {},
"outputs": [],
"source": [
"# Load the vector store\n",
"loaded_store = ZeusDBVectorStore.load_index(\n",
" path=\"my_zeusdb_index.zdb\", embedding=embeddings\n",
")\n",
"\n",
"print(f\"Loaded store with {loaded_store.get_vector_count()} vectors\")"
]
},
{
"cell_type": "markdown",
"id": "610cfe63-d4a8-4ef0-88a8-cf9cc3cbbfce",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"id": "901c75dc",
"metadata": {},
"source": [
"## Usage for retrieval-augmented generation\n",
"\n",
"For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections:\n",
"\n",
"- [How-to: Question and answer with RAG](https://python.langchain.com/docs/how_to/#qa-with-rag)\n",
"- [Retrieval conceptual docs](https://python.langchain.com/docs/concepts/retrieval/)"
]
},
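{
"cell_type": "markdown",
"id": "rag_sketch_note",
"metadata": {},
"source": [
"As an illustration, here is a minimal LCEL sketch that wires the retriever from section 3.2 into a RAG chain. It assumes `langchain-openai` is installed and `OPENAI_API_KEY` is set; the prompt and model choice are placeholders, not part of the ZeusDB integration:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "rag_sketch_example",
"metadata": {},
"outputs": [],
"source": [
"# Minimal RAG sketch: retriever -> prompt -> chat model -> string output\n",
"# Assumes langchain-openai is installed and OPENAI_API_KEY is set\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"prompt = ChatPromptTemplate.from_template(\n",
"    \"Answer using only this context:\\n{context}\\n\\nQuestion: {question}\"\n",
")\n",
"llm = ChatOpenAI(model=\"gpt-4o-mini\")\n",
"\n",
"\n",
"def format_docs(docs):\n",
"    # Join retrieved documents into a single context string\n",
"    return \"\\n\\n\".join(doc.page_content for doc in docs)\n",
"\n",
"\n",
"rag_chain = (\n",
"    {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
"    | prompt\n",
"    | llm\n",
"    | StrOutputParser()\n",
")\n",
"\n",
"rag_chain.invoke(\"What is ZeusDB?\")"
]
},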
{
"cell_type": "markdown",
"id": "1d9d9d51-3798-410f-b1b3-f9736ea8c238",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"id": "25b08eb0-99ab-4919-a201-5243fdfa39e9",
"metadata": {},
"source": [
"## API reference"
]
},
{
"cell_type": "markdown",
"id": "77fdca8b-f75e-4100-9f1d-7a017567dc59",
"metadata": {},
"source": [
"For detailed documentation of all ZeusDBVectorStore features and configurations head to the Doc reference: https://docs.zeusdb.com/en/latest/vector_database/integrations/langchain.html"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}