community[minor]: Added VLite as VectorStore (#20245)

Support [VLite](https://github.com/sdan/vlite) as a new VectorStore type.

**Description**: vlite is a simple and blazing fast vector database (vdb) made with numpy. It abstracts a lot of the functionality around using a vdb in the retrieval augmented generation (RAG) pipeline, such as embeddings generation, chunking, and file processing, while still giving developers the functionality to change how they're made/stored.

**Before submitting**:
Added tests [here](c09c2ebd5c/libs/community/tests/integration_tests/vectorstores/test_vlite.py)
Added ipython notebook [here](c09c2ebd5c/docs/docs/integrations/vectorstores/vlite.ipynb)
Added simple docs on how to use [here](c09c2ebd5c/docs/docs/integrations/providers/vlite.mdx)

**Profiles**
Maintainers: @sdan
Twitter handles: [@sdand](https://x.com/sdand)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
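
To make the description above concrete, here is a minimal, dependency-free sketch of the chunk, embed, and search steps that a vector store bundles together. It is not vlite's implementation: vlite is built on numpy and real embedding models, while `chunk_text`, `embed`, `cosine`, and `ToyVectorStore` are hypothetical names used only for illustration, and the bag-of-words "embedding" is a toy stand-in.

```python
import math
from collections import Counter


def chunk_text(text: str, chunk_size: int = 8) -> list[str]:
    """Split text into chunks of at most `chunk_size` whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a lowercase bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (chunk, vector) pairs, ranks by cosine similarity."""

    def __init__(self) -> None:
        self.items: list[tuple[str, Counter]] = []

    def add_text(self, text: str, chunk_size: int = 8) -> None:
        # Chunk the text, embed each chunk, and store both side by side.
        for chunk in chunk_text(text, chunk_size):
            self.items.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 1) -> list[str]:
        # Embed the query and return the k most similar stored chunks.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


store = ToyVectorStore()
store.add_text("numpy arrays hold the embeddings")
store.add_text("bananas are yellow and sweet")
top = store.search("which arrays hold embeddings", k=1)
print(top)
```

Swapping `embed` for a real embedding model and the Python lists for numpy arrays is roughly the kind of plumbing the description says vlite takes care of; treat that as an assumption rather than a statement about its internals.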
2025-09-17 07:26:16 +00:00 · 2024-04-16 18:24:38 -07:00
parent 7824291252
commit a7c5e41443
8 changed files with 560 additions and 0 deletions
--- a/docs/docs/integrations/providers/vlite.mdx
+++ b/docs/docs/integrations/providers/vlite.mdx
@@ -0,0 +1,31 @@
+# vlite
+
+This page covers how to use [vlite](https://github.com/sdan/vlite) within LangChain. vlite is a simple and fast vector database for storing and retrieving embeddings.
+
+## Installation and Setup
+
+To install vlite, run the following command:
+
+```bash
+pip install vlite
+```
+
+For PDF OCR support, install the `vlite[ocr]` extra:
+
+```bash
+pip install vlite[ocr]
+```
+
+## VectorStore
+
+vlite provides a wrapper around its vector database, allowing you to use it as a vectorstore for semantic search and example selection.
+
+To import the vlite vectorstore:
+
+```python
+from langchain_community.vectorstores import VLite
+```
+
+### Usage
+
+For a more detailed walkthrough of the vlite wrapper, see [this notebook](/docs/integrations/vectorstores/vlite).
--- a/docs/docs/integrations/vectorstores/vlite.ipynb
+++ b/docs/docs/integrations/vectorstores/vlite.ipynb
@@ -0,0 +1,186 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "\n",
+    "# vlite\n",
+    "\n",
+    "VLite is a simple, fast vector database that lets you store and retrieve data semantically using embeddings. Built with numpy, vlite is a lightweight, batteries-included database for adding RAG, similarity search, and embeddings to your projects.\n",
+    "\n",
+    "## Installation\n",
+    "\n",
+    "To use VLite in LangChain, you need to install the `vlite` package:\n",
+    "\n",
+    "```bash\n",
+    "pip install vlite\n",
+    "```\n",
+    "\n",
+    "## Importing VLite\n",
+    "\n",
+    "```python\n",
+    "from langchain_community.vectorstores import VLite\n",
+    "```\n",
+    "\n",
+    "## Basic Example\n",
+    "\n",
+    "In this basic example, we load a text document and store it in the VLite vector database. Then we perform a similarity search to retrieve relevant documents based on a query.\n",
+    "\n",
+    "VLite handles chunking and embedding of the text for you. You can control these steps by pre-chunking the text and/or embedding those chunks yourself before adding them to the VLite database.\n",
+    "\n",
+    "```python\n",
+    "from langchain_community.document_loaders import TextLoader\n",
+    "from langchain_community.vectorstores import VLite\n",
+    "\n",
+    "# Load the document (VLite handles chunking when it is added)\n",
+    "loader = TextLoader(\"path/to/document.txt\")\n",
+    "documents = loader.load()\n",
+    "\n",
+    "# Create a VLite instance\n",
+    "vlite = VLite(collection=\"my_collection\")\n",
+    "\n",
+    "# Add documents to the VLite vector database\n",
+    "vlite.add_documents(documents)\n",
+    "\n",
+    "# Perform a similarity search\n",
+    "query = \"What is the main topic of the document?\"\n",
+    "docs = vlite.similarity_search(query)\n",
+    "\n",
+    "# Print the most relevant document\n",
+    "print(docs[0].page_content)\n",
+    "```\n",
+    "\n",
+    "## Adding Texts and Documents\n",
+    "\n",
+    "You can add texts or documents to the VLite vector database using the `add_texts` and `add_documents` methods, respectively.\n",
+    "\n",
+    "```python\n",
+    "from langchain_core.documents import Document\n",
+    "\n",
+    "# Add raw texts, then full documents, to the VLite vector database\n",
+    "texts = [\"This is the first text.\", \"This is the second text.\"]\n",
+    "vlite.add_texts(texts)\n",
+    "documents = [Document(page_content=\"This is a document.\", metadata={\"source\": \"example.txt\"})]\n",
+    "vlite.add_documents(documents)\n",
+    "```\n",
+    "\n",
+    "## Similarity Search\n",
+    "\n",
+    "VLite provides methods for performing similarity search on the stored documents.\n",
+    "\n",
+    "```python\n",
+    "# Perform a similarity search\n",
+    "query = \"What is the main topic of the document?\"\n",
+    "docs = vlite.similarity_search(query, k=3)\n",
+    "\n",
+    "# Perform a similarity search with scores\n",
+    "docs_with_scores = vlite.similarity_search_with_score(query, k=3)\n",
+    "```\n",
+    "\n",
+    "## Max Marginal Relevance Search\n",
+    "\n",
+    "VLite also supports Max Marginal Relevance (MMR) search, which optimizes for both similarity to the query and diversity among the retrieved documents.\n",
+    "\n",
+    "```python\n",
+    "# Perform an MMR search\n",
+    "docs = vlite.max_marginal_relevance_search(query, k=3)\n",
+    "```\n",
+    "\n",
+    "## Updating and Deleting Documents\n",
+    "\n",
+    "You can update or delete documents in the VLite vector database using the `update_document` and `delete` methods.\n",
+    "\n",
+    "```python\n",
+    "# Update a document\n",
+    "document_id = \"doc_id_1\"\n",
+    "updated_document = Document(page_content=\"Updated content\", metadata={\"source\": \"updated.txt\"})\n",
+    "vlite.update_document(document_id, updated_document)\n",
+    "\n",
+    "# Delete documents\n",
+    "document_ids = [\"doc_id_1\", \"doc_id_2\"]\n",
+    "vlite.delete(document_ids)\n",
+    "```\n",
+    "\n",
+    "## Retrieving Documents\n",
+    "\n",
+    "You can retrieve documents from the VLite vector database based on their IDs or metadata using the `get` method.\n",
+    "\n",
+    "```python\n",
+    "# Retrieve documents by IDs\n",
+    "document_ids = [\"doc_id_1\", \"doc_id_2\"]\n",
+    "docs = vlite.get(ids=document_ids)\n",
+    "\n",
+    "# Retrieve documents by metadata\n",
+    "metadata_filter = {\"source\": \"example.txt\"}\n",
+    "docs = vlite.get(where=metadata_filter)\n",
+    "```\n",
+    "\n",
+    "## Creating VLite Instances\n",
+    "\n",
+    "You can create VLite instances using various methods:\n",
+    "\n",
+    "```python\n",
+    "# Create a VLite instance from texts\n",
+    "vlite = VLite.from_texts(texts)\n",
+    "\n",
+    "# Create a VLite instance from documents\n",
+    "vlite = VLite.from_documents(documents)\n",
+    "\n",
+    "# Create a VLite instance from an existing index\n",
+    "vlite = VLite.from_existing_index(collection=\"existing_collection\")\n",
+    "```\n",
+    "\n",
+    "## Additional Features\n",
+    "\n",
+    "VLite provides additional features for managing the vector database:\n",
+    "\n",
+    "```python\n",
+    "from langchain_community.vectorstores import VLite\n",
+    "vlite = VLite(collection=\"my_collection\")\n",
+    "\n",
+    "# Get the number of items in the collection\n",
+    "count = vlite.count()\n",
+    "\n",
+    "# Save the collection\n",
+    "vlite.save()\n",
+    "\n",
+    "# Clear the collection\n",
+    "vlite.clear()\n",
+    "\n",
+    "# Get collection information\n",
+    "vlite.info()\n",
+    "\n",
+    "# Dump the collection data\n",
+    "data = vlite.dump()\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.7"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}