community[minor]: Add VDMS vectorstore (#19551)

- **Description:** Add support for Intel Lab's [Visual Data Management
System (VDMS)](https://github.com/IntelLabs/vdms) as a vector store
- **Dependencies:** `vdms` library which requires protobuf = "4.24.2".
There is a conflict with dashvector in `langchain` package but conflict
is resolved in `community`.
- **Contribution maintainer:** [@cwlacewe](https://github.com/cwlacewe)
- **Added tests:**
libs/community/tests/integration_tests/vectorstores/test_vdms.py
- **Added docs:** docs/docs/integrations/vectorstores/vdms.ipynb
- **Added cookbook:** cookbook/multi_modal_RAG_vdms.ipynb

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
This commit is contained in:
Chaunte W. Lacewell
2024-03-27 20:12:11 -07:00
committed by GitHub
parent b7b62e29fb
commit a31f692f4e
12 changed files with 3705 additions and 20 deletions

View File

@@ -0,0 +1,62 @@
# VDMS
> [VDMS](https://github.com/IntelLabs/vdms/blob/master/README.md) is a storage solution for efficient access
> of big-”visual”-data that aims to achieve cloud scale by searching for relevant visual data via visual metadata
> stored as a graph and enabling machine friendly enhancements to visual data for faster access.
## Installation and Setup
### Install Client
```bash
pip install vdms
```
### Install Database
There are two ways to get started with VDMS:
#### Install VDMS on your local machine via docker
```bash
docker run -d -p 55555:55555 intellabs/vdms:latest
```
#### Install VDMS directly on your local machine
Please see [installation instructions](https://github.com/IntelLabs/vdms/blob/master/INSTALL.md).
## VectorStore
The vector store is a simple wrapper around VDMS. It provides a simple interface to store and retrieve data.
```python
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
loader = TextLoader("./state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
from langchain_community.vectorstores import VDMS
from langchain_community.vectorstores.vdms import VDMS_Client
from langchain_community.embeddings.huggingface import HuggingFaceEmbeddings
client = VDMS_Client("localhost", 55555)
vectorstore = VDMS.from_documents(
docs,
client=client,
collection_name="langchain-demo",
embedding_function=HuggingFaceEmbeddings(),
engine="FaissFlat"
distance_strategy="L2",
)
query = "What did the president say about Ketanji Brown Jackson"
results = vectorstore.similarity_search(query)
```
For a more detailed walkthrough of the VDMS wrapper, see [this notebook](/docs/integrations/vectorstores/vdms)

File diff suppressed because it is too large Load Diff

View File

@@ -60,7 +60,7 @@
" * document addition by id (`add_documents` method with `ids` argument)\n",
" * delete by id (`delete` method with `ids` argument)\n",
"\n",
"Compatible Vectorstores: `AnalyticDB`, `AstraDB`, `AwaDB`, `Bagel`, `Cassandra`, `Chroma`, `CouchbaseVectorStore`, `DashVector`, `DatabricksVectorSearch`, `DeepLake`, `Dingo`, `ElasticVectorSearch`, `ElasticsearchStore`, `FAISS`, `HanaDB`, `Milvus`, `MyScale`, `OpenSearchVectorSearch`, `PGVector`, `Pinecone`, `Qdrant`, `Redis`, `Rockset`, `ScaNN`, `SupabaseVectorStore`, `SurrealDBStore`, `TimescaleVector`, `Vald`, `Vearch`, `VespaStore`, `Weaviate`, `ZepVectorStore`.\n",
"Compatible Vectorstores: `AnalyticDB`, `AstraDB`, `AwaDB`, `Bagel`, `Cassandra`, `Chroma`, `CouchbaseVectorStore`, `DashVector`, `DatabricksVectorSearch`, `DeepLake`, `Dingo`, `ElasticVectorSearch`, `ElasticsearchStore`, `FAISS`, `HanaDB`, `Milvus`, `MyScale`, `OpenSearchVectorSearch`, `PGVector`, `Pinecone`, `Qdrant`, `Redis`, `Rockset`, `ScaNN`, `SupabaseVectorStore`, `SurrealDBStore`, `TimescaleVector`, `Vald`, `VDMS`, `Vearch`, `VespaStore`, `Weaviate`, `ZepVectorStore`.\n",
" \n",
"## Caution\n",
"\n",