community[patch]: Added add_images method to SingleStoreDB vector store (#17871)

In this pull request, we introduce the add_images method to the
SingleStoreDB vector store class, expanding its capabilities to handle
multi-modal embeddings seamlessly. This method facilitates the
incorporation of image data into the vector store by associating each
image's URI with corresponding document content, metadata, and either
pre-generated embeddings or embeddings computed using the embed_image
method of the provided embedding object.

the change includes integration tests, validating the behavior of the
add_images. Additionally, we provide a notebook showcasing the usage of
this new method.

---------

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
This commit is contained in:
volodymyr-memsql
2024-02-22 01:16:32 +02:00
committed by GitHub
parent 7735721929
commit 0a9a519a39
3 changed files with 138 additions and 4 deletions

View File

@@ -115,12 +115,56 @@
]
},
{
"cell_type": "code",
"execution_count": null,
"cell_type": "markdown",
"id": "86efff90",
"metadata": {},
"source": [
"## Multi-modal Example: Leveraging CLIP and OpenClip Embeddings\n",
"\n",
"In the realm of multi-modal data analysis, the integration of diverse information types like images and text has become increasingly crucial. One powerful tool facilitating such integration is [CLIP](https://openai.com/research/clip), a cutting-edge model capable of embedding both images and text into a shared semantic space. By doing so, CLIP enables the retrieval of relevant content across different modalities through similarity search.\n",
"\n",
"To illustrate, let's consider an application scenario where we aim to effectively analyze multi-modal data. In this example, we harness the capabilities of [OpenClip multimodal embeddings](https://python.langchain.com/docs/integrations/text_embedding/open_clip), which leverage CLIP's framework. With OpenClip, we can seamlessly embed textual descriptions alongside corresponding images, enabling comprehensive analysis and retrieval tasks. Whether it's identifying visually similar images based on textual queries or finding relevant text passages associated with specific visual content, OpenClip empowers users to explore and extract insights from multi-modal data with remarkable efficiency and accuracy."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9c0bce88",
"metadata": {},
"outputs": [],
"source": []
"source": [
"%pip install -U langchain openai singlestoredb langchain-experimental # (newest versions required for multi-modal)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "21a8c25c",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"from langchain_community.vectorstores import SingleStoreDB\n",
"from langchain_experimental.open_clip import OpenCLIPEmbeddings\n",
"\n",
"os.environ[\"SINGLESTOREDB_URL\"] = \"root:pass@localhost:3306/db\"\n",
"\n",
"TEST_IMAGES_DIR = \"../../modules/images\"\n",
"\n",
"docsearch = SingleStoreDB(OpenCLIPEmbeddings())\n",
"\n",
"image_uris = sorted(\n",
" [\n",
" os.path.join(TEST_IMAGES_DIR, image_name)\n",
" for image_name in os.listdir(TEST_IMAGES_DIR)\n",
" if image_name.endswith(\".jpg\")\n",
" ]\n",
")\n",
"\n",
"# Add images\n",
"docsearch.add_images(uris=image_uris)"
]
}
],
"metadata": {