diff --git a/docs/extras/ecosystem/integrations/hologres.mdx b/docs/extras/ecosystem/integrations/hologres.mdx
new file mode 100644
index 00000000000..66284efbd36
--- /dev/null
+++ b/docs/extras/ecosystem/integrations/hologres.mdx
@@ -0,0 +1,23 @@
+# Hologres
+
+>[Hologres](https://www.alibabacloud.com/help/en/hologres/latest/introduction) is a unified real-time data warehousing service developed by Alibaba Cloud. You can use Hologres to write, update, process, and analyze large amounts of data in real time.
+>`Hologres` supports standard `SQL` syntax, is compatible with `PostgreSQL`, and supports most PostgreSQL functions. Hologres supports online analytical processing (OLAP) and ad hoc analysis for up to petabytes of data, and provides high-concurrency and low-latency online data services.
+
+>`Hologres` provides **vector database** functionality by adopting [Proxima](https://www.alibabacloud.com/help/en/hologres/latest/vector-processing).
+>`Proxima` is a high-performance software library developed by `Alibaba DAMO Academy`. It allows you to search for the nearest neighbors of vectors. Proxima provides higher stability and performance than similar open source software such as Faiss. Proxima allows you to search for similar text or image embeddings with high throughput and low latency. Hologres is deeply integrated with Proxima to provide a high-performance vector search service.
+
+## Installation and Setup
+
+Click [here](https://www.alibabacloud.com/zh/product/hologres) to quickly deploy a Hologres cloud instance.
+
+```bash
+pip install psycopg2
+```
+
+## Vector Store
+
+See a [usage example](/docs/modules/data_connection/vectorstores/integrations/hologres.html).
+
+```python
+from langchain.vectorstores import Hologres
+```
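+
+As a quick illustration, here is a minimal sketch of building a `Hologres` vector store. It assumes `OPENAI_API_KEY` is set; the connection helper and keyword arguments are intended to mirror the linked usage example, and the host, credentials, and table name are placeholders.
+
+```python
+from langchain.embeddings.openai import OpenAIEmbeddings
+from langchain.vectorstores import Hologres
+
+# Placeholder connection details for your Hologres instance.
+connection_string = Hologres.connection_string_from_db_params(
+    host="your-hologres-endpoint",
+    port=80,
+    database="your-database",
+    user="your-user",
+    password="your-password",
+)
+
+vector_store = Hologres.from_texts(
+    ["Hologres provides vector search through Proxima."],
+    OpenAIEmbeddings(),
+    connection_string=connection_string,
+    table_name="langchain_example",
+)
+print(vector_store.similarity_search("vector search", k=1))
+```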
diff --git a/docs/extras/ecosystem/integrations/rockset.mdx b/docs/extras/ecosystem/integrations/rockset.mdx
new file mode 100644
index 00000000000..6fe71f393c3
--- /dev/null
+++ b/docs/extras/ecosystem/integrations/rockset.mdx
@@ -0,0 +1,19 @@
+# Rockset
+
+>[Rockset](https://rockset.com/product/) is a real-time analytics database service for serving low latency, high concurrency analytical queries at scale. It builds a Converged Index™ on structured and semi-structured data with an efficient store for vector embeddings. Its support for running SQL on schemaless data makes it a perfect choice for running vector search with metadata filters.
+
+## Installation and Setup
+
+Make sure you have a Rockset account, then go to the web console to get an API key. Details can be found on [the website](https://rockset.com/docs/rest-api/).
+
+```bash
+pip install rockset
+```
+
+## Vector Store
+
+See a [usage example](/docs/modules/data_connection/vectorstores/integrations/rockset.html).
+
+```python
+from langchain.vectorstores import RocksetDB
+```
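+
+As a rough sketch only: the client setup, constructor arguments, and field names below are assumptions based on the linked usage example, and the region, API key, and collection name are placeholders. `OPENAI_API_KEY` is assumed to be set.
+
+```python
+import rockset
+
+from langchain.embeddings.openai import OpenAIEmbeddings
+from langchain.vectorstores import RocksetDB
+
+# Placeholder region and API key; the collection must already exist in Rockset.
+rockset_client = rockset.RocksetClient(
+    host=rockset.Regions.usw2a1, api_key="<your-api-key>"
+)
+
+vector_store = RocksetDB(
+    client=rockset_client,
+    embeddings=OpenAIEmbeddings(),
+    collection_name="langchain_demo",
+    text_key="text",
+    embedding_key="embedding",
+)
+vector_store.add_texts(["Rockset builds a Converged Index on your data."])
+print(vector_store.similarity_search("Converged Index", k=1))
+```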
diff --git a/docs/extras/ecosystem/integrations/singlestoredb.mdx b/docs/extras/ecosystem/integrations/singlestoredb.mdx
new file mode 100644
index 00000000000..313b7ccbae6
--- /dev/null
+++ b/docs/extras/ecosystem/integrations/singlestoredb.mdx
@@ -0,0 +1,20 @@
+# SingleStoreDB
+
+>[SingleStoreDB](https://singlestore.com/) is a high-performance distributed SQL database that supports deployment both in the [cloud](https://www.singlestore.com/cloud/) and on-premises. It provides vector storage, and vector functions including [dot_product](https://docs.singlestore.com/managed-service/en/reference/sql-reference/vector-functions/dot_product.html) and [euclidean_distance](https://docs.singlestore.com/managed-service/en/reference/sql-reference/vector-functions/euclidean_distance.html), thereby supporting AI applications that require text similarity matching.
+
+## Installation and Setup
+
+There are several ways to establish a [connection](https://singlestoredb-python.labs.singlestore.com/generated/singlestoredb.connect.html) to the database. You can either set up environment variables or pass named parameters to the `SingleStoreDB` constructor.
+Alternatively, you may provide these parameters to the `from_documents` and `from_texts` methods.
+
+```bash
+pip install singlestoredb
+```
+
+## Vector Store
+
+See a [usage example](/docs/modules/data_connection/vectorstores/integrations/singlestoredb.html).
+
+```python
+from langchain.vectorstores import SingleStoreDB
+```
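+
+For example, a minimal sketch that connects through the `SINGLESTOREDB_URL` environment variable (the URL below is a placeholder, and `OPENAI_API_KEY` is assumed to be set):
+
+```python
+import os
+
+from langchain.embeddings.openai import OpenAIEmbeddings
+from langchain.vectorstores import SingleStoreDB
+
+# Placeholder connection URL: "user:password@host:port/database"
+os.environ["SINGLESTOREDB_URL"] = "admin:password@localhost:3306/demo"
+
+vector_store = SingleStoreDB.from_texts(
+    ["SingleStoreDB provides dot_product and euclidean_distance functions."],
+    OpenAIEmbeddings(),
+    table_name="langchain_example",
+)
+print(vector_store.similarity_search("vector functions", k=1))
+```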
diff --git a/docs/extras/ecosystem/integrations/sklearn.mdx b/docs/extras/ecosystem/integrations/sklearn.mdx
index cb8723a5b87..8f463110c84 100644
--- a/docs/extras/ecosystem/integrations/sklearn.mdx
+++ b/docs/extras/ecosystem/integrations/sklearn.mdx
@@ -1,15 +1,14 @@
# scikit-learn
-This page covers how to use the scikit-learn package within LangChain.
-It is broken into two parts: installation and setup, and then references to specific scikit-learn wrappers.
+>[scikit-learn](https://scikit-learn.org/stable/) is an open source collection of machine learning algorithms,
+> including some implementations of the [k nearest neighbors](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html) algorithm. `SKLearnVectorStore` wraps this implementation and adds the ability to persist the vector store in JSON, BSON (binary JSON), or Apache Parquet format.
## Installation and Setup
- Install the Python package with `pip install scikit-learn`
-## Wrappers
-### VectorStore
+## Vector Store
`SKLearnVectorStore` provides a simple wrapper around the nearest neighbor implementation in the
scikit-learn package, allowing you to use it as a vectorstore.
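+
+As a brief sketch (the persist path is a placeholder and `OPENAI_API_KEY` is assumed to be set), the store can be built, persisted, and queried like this:
+
+```python
+from langchain.embeddings.openai import OpenAIEmbeddings
+from langchain.vectorstores import SKLearnVectorStore
+
+vector_store = SKLearnVectorStore.from_texts(
+    ["scikit-learn ships a NearestNeighbors implementation."],
+    OpenAIEmbeddings(),
+    persist_path="/tmp/sklearn_vectorstore.json",  # placeholder path
+    serializer="json",  # or "bson" / "parquet"
+)
+vector_store.persist()  # write the index to the persist path
+print(vector_store.similarity_search("nearest neighbors", k=1))
+```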
diff --git a/docs/extras/ecosystem/integrations/starrocks.mdx b/docs/extras/ecosystem/integrations/starrocks.mdx
new file mode 100644
index 00000000000..0c0febacc67
--- /dev/null
+++ b/docs/extras/ecosystem/integrations/starrocks.mdx
@@ -0,0 +1,21 @@
+# StarRocks
+
+>[StarRocks](https://www.starrocks.io/) is a high-performance analytical database.
+>`StarRocks` is a next-gen, sub-second MPP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
+
+>Usually `StarRocks` is categorized as an OLAP database, and it has shown excellent performance in [ClickBench — a Benchmark For Analytical DBMS](https://benchmark.clickhouse.com/). Since it has a super-fast vectorized execution engine, it can also be used as a fast vector database.
+
+## Installation and Setup
+
+
+```bash
+pip install pymysql
+```
+
+## Vector Store
+
+See a [usage example](/docs/modules/data_connection/vectorstores/integrations/starrocks.html).
+
+```python
+from langchain.vectorstores import StarRocks
+```
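+
+A minimal sketch, assuming `OPENAI_API_KEY` is set; the `StarRocksSettings` fields below are placeholders and are intended to mirror the linked usage example.
+
+```python
+from langchain.embeddings.openai import OpenAIEmbeddings
+from langchain.vectorstores import StarRocks
+from langchain.vectorstores.starrocks import StarRocksSettings
+
+# Placeholder connection settings for your StarRocks cluster.
+settings = StarRocksSettings()
+settings.host = "127.0.0.1"
+settings.port = 41003
+settings.username = "root"
+settings.password = ""
+settings.database = "demo"
+
+vector_store = StarRocks.from_texts(
+    ["StarRocks can double as a fast vector database."],
+    OpenAIEmbeddings(),
+    config=settings,
+)
+print(vector_store.similarity_search("vector database", k=1))
+```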
diff --git a/docs/extras/ecosystem/integrations/tigris.mdx b/docs/extras/ecosystem/integrations/tigris.mdx
new file mode 100644
index 00000000000..7c69141ea4f
--- /dev/null
+++ b/docs/extras/ecosystem/integrations/tigris.mdx
@@ -0,0 +1,19 @@
+# Tigris
+
+> [Tigris](https://tigrisdata.com) is an open source Serverless NoSQL Database and Search Platform designed to simplify building high-performance vector search applications.
+> `Tigris` eliminates the infrastructure complexity of managing, operating, and synchronizing multiple tools, allowing you to focus on building great applications instead.
+
+## Installation and Setup
+
+
+```bash
+pip install tigrisdb openapi-schema-pydantic openai tiktoken
+```
+
+## Vector Store
+
+See a [usage example](/docs/modules/data_connection/vectorstores/integrations/tigris.html).
+
+```python
+from langchain.vectorstores import Tigris
+```
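+
+A minimal sketch, following the linked usage example: Tigris credentials are read from environment variables, and `OPENAI_API_KEY` is assumed to be set for the embeddings.
+
+```python
+import getpass
+import os
+
+from langchain.embeddings.openai import OpenAIEmbeddings
+from langchain.vectorstores import Tigris
+
+# Tigris project credentials (see the Application Keys section of your project).
+os.environ["TIGRIS_PROJECT"] = getpass.getpass("Tigris Project Name:")
+os.environ["TIGRIS_CLIENT_ID"] = getpass.getpass("Tigris Client Id:")
+os.environ["TIGRIS_CLIENT_SECRET"] = getpass.getpass("Tigris Client Secret:")
+
+vector_store = Tigris.from_texts(
+    ["Tigris is a serverless NoSQL database and search platform."],
+    OpenAIEmbeddings(),
+    index_name="my_embeddings",
+)
+print(vector_store.similarity_search("serverless search", k=1))
+```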
diff --git a/docs/extras/ecosystem/integrations/typesense.mdx b/docs/extras/ecosystem/integrations/typesense.mdx
new file mode 100644
index 00000000000..d2c64a0a0ac
--- /dev/null
+++ b/docs/extras/ecosystem/integrations/typesense.mdx
@@ -0,0 +1,22 @@
+# Typesense
+
+> [Typesense](https://typesense.org) is an open source, in-memory search engine that you can either
+> [self-host](https://typesense.org/docs/guide/install-typesense.html#option-2-local-machine-self-hosting) or run
+> on [Typesense Cloud](https://cloud.typesense.org/).
+> `Typesense` focuses on performance by storing the entire index in RAM (with a backup on disk) and also
+> focuses on providing an out-of-the-box developer experience by simplifying available options and setting good defaults.
+
+## Installation and Setup
+
+
+```bash
+pip install typesense openapi-schema-pydantic openai tiktoken
+```
+
+## Vector Store
+
+See a [usage example](/docs/modules/data_connection/vectorstores/integrations/typesense.html).
+
+```python
+from langchain.vectorstores import Typesense
+```
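+
+A minimal sketch, based on the linked usage example; the client parameters below are placeholders for a local Typesense server (use your Typesense Cloud host and API key instead if you run there), and `OPENAI_API_KEY` is assumed to be set.
+
+```python
+from langchain.embeddings.openai import OpenAIEmbeddings
+from langchain.vectorstores import Typesense
+
+vector_store = Typesense.from_texts(
+    ["Typesense keeps the entire index in RAM for speed."],
+    OpenAIEmbeddings(),
+    typesense_client_params={
+        "host": "localhost",  # or your Typesense Cloud host
+        "port": "8108",
+        "protocol": "http",
+        "typesense_api_key": "xyz",  # placeholder API key
+        "typesense_collection_name": "lang-chain",
+    },
+)
+print(vector_store.similarity_search("in-memory search", k=1))
+```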
diff --git a/docs/extras/modules/data_connection/vectorstores/integrations/alibabacloud_opensearch.ipynb b/docs/extras/modules/data_connection/vectorstores/integrations/alibabacloud_opensearch.ipynb
index 2a31d7f9d87..9be50011575 100644
--- a/docs/extras/modules/data_connection/vectorstores/integrations/alibabacloud_opensearch.ipynb
+++ b/docs/extras/modules/data_connection/vectorstores/integrations/alibabacloud_opensearch.ipynb
@@ -2,28 +2,34 @@
"cells": [
{
"cell_type": "markdown",
- "metadata": {
- "collapsed": false
- },
+ "metadata": {},
"source": [
"# Alibaba Cloud OpenSearch\n",
"\n",
- ">[Alibaba Cloud Opensearch](https://www.alibabacloud.com/product/opensearch) OpenSearch is a one-stop platform to develop intelligent search services. OpenSearch was built based on the large-scale distributed search engine developed by Alibaba. OpenSearch serves more than 500 business cases in Alibaba Group and thousands of Alibaba Cloud customers. OpenSearch helps develop search services in different search scenarios, including e-commerce, O2O, multimedia, the content industry, communities and forums, and big data query in enterprises.\n",
+ ">[Alibaba Cloud Opensearch](https://www.alibabacloud.com/product/opensearch) is a one-stop platform to develop intelligent search services. `OpenSearch` was built on the large-scale distributed search engine developed by `Alibaba`. `OpenSearch` serves more than 500 business cases in Alibaba Group and thousands of Alibaba Cloud customers. `OpenSearch` helps develop search services in different search scenarios, including e-commerce, O2O, multimedia, the content industry, communities and forums, and big data query in enterprises.\n",
"\n",
- ">OpenSearch helps you develop high quality, maintenance-free, and high performance intelligent search services to provide your users with high search efficiency and accuracy.\n",
+ ">`OpenSearch` helps you develop high quality, maintenance-free, and high performance intelligent search services to provide your users with high search efficiency and accuracy.\n",
"\n",
- ">OpenSearch provides the vector search feature. In specific scenarios, especially test question search and image search scenarios, you can use the vector search feature together with the multimodal search feature to improve the accuracy of search results. This topic describes the syntax and usage notes of vector indexes.\n",
+ ">`OpenSearch` provides the vector search feature. In specific scenarios, especially test question search and image search scenarios, you can use the vector search feature together with the multimodal search feature to improve the accuracy of search results. This topic describes the syntax and usage notes of vector indexes.\n",
"\n",
"This notebook shows how to use functionality related to the `Alibaba Cloud OpenSearch Vector Search Edition`.\n",
"To run, you should have an [OpenSearch Vector Search Edition](https://opensearch.console.aliyun.com) instance up and running:\n",
- "- Read the [help document](https://www.alibabacloud.com/help/en/opensearch/latest/vector-search) to quickly familiarize and configure OpenSearch Vector Search Edition instance.\n"
+ "\n",
+    "Read the [help document](https://www.alibabacloud.com/help/en/opensearch/latest/vector-search) to quickly get familiar with and configure an OpenSearch Vector Search Edition instance.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#!pip install alibabacloud-ha3engine"
]
},
{
"cell_type": "markdown",
- "metadata": {
- "collapsed": false
- },
+ "metadata": {},
"source": [
"After completing the configuration, follow these steps to connect to the instance, index documents, and perform vector retrieval."
]
@@ -33,6 +39,9 @@
"execution_count": null,
"metadata": {
"collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ },
"pycharm": {
"name": "#%%\n"
}
@@ -49,9 +58,7 @@
},
{
"cell_type": "markdown",
- "metadata": {
- "collapsed": false
- },
+ "metadata": {},
"source": [
"Split documents and get embeddings by call OpenAI API"
]
@@ -61,6 +68,9 @@
"execution_count": null,
"metadata": {
"collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ },
"pycharm": {
"name": "#%%\n"
}
@@ -80,7 +90,6 @@
{
"cell_type": "markdown",
"metadata": {
- "collapsed": false,
"pycharm": {
"name": "#%% md\n"
}
@@ -94,6 +103,9 @@
"execution_count": null,
"metadata": {
"collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ },
"pycharm": {
"name": "#%%\n"
}
@@ -133,9 +145,7 @@
},
{
"cell_type": "markdown",
- "metadata": {
- "collapsed": false
- },
+ "metadata": {},
"source": [
"Create an opensearch access instance by settings."
]
@@ -145,6 +155,9 @@
"execution_count": null,
"metadata": {
"collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ },
"pycharm": {
"name": "#%%\n"
}
@@ -159,9 +172,7 @@
},
{
"cell_type": "markdown",
- "metadata": {
- "collapsed": false
- },
+ "metadata": {},
"source": [
"or"
]
@@ -171,6 +182,9 @@
"execution_count": null,
"metadata": {
"collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ },
"pycharm": {
"name": "#%%\n"
}
@@ -183,9 +197,7 @@
},
{
"cell_type": "markdown",
- "metadata": {
- "collapsed": false
- },
+ "metadata": {},
"source": [
"Add texts and build index."
]
@@ -195,6 +207,9 @@
"execution_count": null,
"metadata": {
"collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ },
"pycharm": {
"name": "#%%\n"
}
@@ -208,9 +223,7 @@
},
{
"cell_type": "markdown",
- "metadata": {
- "collapsed": false
- },
+ "metadata": {},
"source": [
"Query and retrieve data."
]
@@ -220,6 +233,9 @@
"execution_count": null,
"metadata": {
"collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ },
"pycharm": {
"name": "#%%\n"
}
@@ -233,9 +249,7 @@
},
{
"cell_type": "markdown",
- "metadata": {
- "collapsed": false
- },
+ "metadata": {},
"source": [
"Query and retrieve data with metadata\n"
]
@@ -245,6 +259,9 @@
"execution_count": null,
"metadata": {
"collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ },
"pycharm": {
"name": "#%%\n"
}
@@ -260,7 +277,6 @@
{
"cell_type": "markdown",
"metadata": {
- "collapsed": false,
"pycharm": {
"name": "#%% md\n"
}
@@ -272,23 +288,23 @@
],
"metadata": {
"kernelspec": {
- "display_name": "Python 3",
+ "display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
- "version": 2
+ "version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
- "pygments_lexer": "ipython2",
- "version": "2.7.6"
+ "pygments_lexer": "ipython3",
+ "version": "3.10.6"
}
},
"nbformat": 4,
- "nbformat_minor": 0
+ "nbformat_minor": 4
}
diff --git a/docs/extras/modules/data_connection/vectorstores/integrations/awadb.ipynb b/docs/extras/modules/data_connection/vectorstores/integrations/awadb.ipynb
index aedfc8feb12..93bf1a6d975 100644
--- a/docs/extras/modules/data_connection/vectorstores/integrations/awadb.ipynb
+++ b/docs/extras/modules/data_connection/vectorstores/integrations/awadb.ipynb
@@ -6,8 +6,9 @@
"metadata": {},
"source": [
"# AwaDB\n",
- "[AwaDB](https://github.com/awa-ai/awadb) is an AI Native database for the search and storage of embedding vectors used by LLM Applications.\n",
- "This notebook shows how to use functionality related to the AwaDB."
+ ">[AwaDB](https://github.com/awa-ai/awadb) is an AI Native database for the search and storage of embedding vectors used by LLM Applications.\n",
+ "\n",
+ "This notebook shows how to use functionality related to the `AwaDB`."
]
},
{
@@ -184,7 +185,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.11.1"
+ "version": "3.10.6"
}
},
"nbformat": 4,
diff --git a/docs/extras/modules/data_connection/vectorstores/integrations/azuresearch.ipynb b/docs/extras/modules/data_connection/vectorstores/integrations/azuresearch.ipynb
index cf0ee7d0eab..c36f525fd2a 100644
--- a/docs/extras/modules/data_connection/vectorstores/integrations/azuresearch.ipynb
+++ b/docs/extras/modules/data_connection/vectorstores/integrations/azuresearch.ipynb
@@ -1,19 +1,19 @@
{
"cells": [
{
- "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
- "# Azure Cognitive Search"
+ "# Azure Cognitive Search\n",
+ "\n",
+ ">[Azure Cognitive Search](https://learn.microsoft.com/en-us/azure/search/search-what-is-azure-search) (formerly known as `Azure Search`) is a cloud search service that gives developers infrastructure, APIs, and tools for building a rich search experience over private, heterogeneous content in web, mobile, and enterprise applications.\n"
]
},
{
- "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
- "# Install Azure Cognitive Search SDK"
+ "## Install Azure Cognitive Search SDK"
]
},
{
@@ -27,7 +27,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -49,7 +48,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -74,7 +72,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -95,7 +92,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -120,7 +116,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -148,7 +143,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -187,7 +181,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -226,7 +219,7 @@
],
"metadata": {
"kernelspec": {
- "display_name": "Python 3.9.13 ('.venv': venv)",
+ "display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -240,9 +233,8 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.11.3"
+ "version": "3.10.6"
},
- "orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "645053d6307d413a1a75681b5ebb6449bb2babba4bcb0bf65a1ddc3dbefb108a"
@@ -250,5 +242,5 @@
}
},
"nbformat": 4,
- "nbformat_minor": 2
+ "nbformat_minor": 4
}
diff --git a/docs/extras/modules/data_connection/vectorstores/integrations/chroma.ipynb b/docs/extras/modules/data_connection/vectorstores/integrations/chroma.ipynb
index 1744d2d48c3..631b0f045e3 100644
--- a/docs/extras/modules/data_connection/vectorstores/integrations/chroma.ipynb
+++ b/docs/extras/modules/data_connection/vectorstores/integrations/chroma.ipynb
@@ -9,20 +9,6 @@
"\n",
">[Chroma](https://docs.trychroma.com/getting-started) is a AI-native open-source vector database focused on developer productivity and happiness. Chroma is licensed under Apache 2.0.\n",
"\n",
- "\n",
- "\n",
- "- [Website](https://www.trychroma.com/)\n",
- "- [Documentation](https://docs.trychroma.com/)\n",
- "- [Twitter](https://twitter.com/trychroma)\n",
- "- [Discord](https://discord.gg/MMeYNTmh3x)\n",
- "\n",
- "Chroma is fully-typed, fully-tested and fully-documented.\n",
"\n",
"Install Chroma with:\n",
"\n",
@@ -47,19 +33,6 @@
"View full docs at [docs](https://docs.trychroma.com/reference/Collection). To access these methods directly, you can do `._collection_.method()`\n"
]
},
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "12e83df7",
- "metadata": {},
- "outputs": [],
- "source": [
- "# first install dependencies\n",
- "!pip install langchain\n",
- "!pip install langchainplus_sdk\n",
- "!pip install chromadb\n"
- ]
- },
{
"cell_type": "markdown",
"id": "2b5ffbf8",
@@ -576,7 +549,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.11.3"
+ "version": "3.10.6"
}
},
"nbformat": 4,
diff --git a/docs/extras/modules/data_connection/vectorstores/integrations/elasticsearch.ipynb b/docs/extras/modules/data_connection/vectorstores/integrations/elasticsearch.ipynb
index ac1c65b3aef..188b9cd2402 100644
--- a/docs/extras/modules/data_connection/vectorstores/integrations/elasticsearch.ipynb
+++ b/docs/extras/modules/data_connection/vectorstores/integrations/elasticsearch.ipynb
@@ -14,22 +14,12 @@
"This notebook shows how to use functionality related to the `Elasticsearch` database."
]
},
- {
- "cell_type": "markdown",
- "source": [
- "# ElasticVectorSearch class"
- ],
- "metadata": {
- "id": "tKSYjyTBtSLc"
- },
- "id": "tKSYjyTBtSLc"
- },
{
"cell_type": "markdown",
"id": "b66c12b2-2a07-4136-ac77-ce1c9fa7a409",
"metadata": {
- "tags": [],
- "id": "b66c12b2-2a07-4136-ac77-ce1c9fa7a409"
+ "id": "b66c12b2-2a07-4136-ac77-ce1c9fa7a409",
+ "tags": []
},
"source": [
"## Installation"
@@ -104,8 +94,8 @@
"execution_count": null,
"id": "d6197931-cbe5-460c-a5e6-b5eedb83887c",
"metadata": {
- "tags": [],
- "id": "d6197931-cbe5-460c-a5e6-b5eedb83887c"
+ "id": "d6197931-cbe5-460c-a5e6-b5eedb83887c",
+ "tags": []
},
"outputs": [],
"source": [
@@ -117,9 +107,9 @@
"execution_count": null,
"id": "67ab8afa-f7c6-4fbf-b596-cb512da949da",
"metadata": {
- "tags": [],
"id": "67ab8afa-f7c6-4fbf-b596-cb512da949da",
- "outputId": "fd16b37f-cb76-40a9-b83f-eab58dd0d912"
+ "outputId": "fd16b37f-cb76-40a9-b83f-eab58dd0d912",
+ "tags": []
},
"outputs": [
{
@@ -141,8 +131,8 @@
"cell_type": "markdown",
"id": "f6030187-0bd7-4798-8372-a265036af5e0",
"metadata": {
- "tags": [],
- "id": "f6030187-0bd7-4798-8372-a265036af5e0"
+ "id": "f6030187-0bd7-4798-8372-a265036af5e0",
+ "tags": []
},
"source": [
"## Example"
@@ -153,8 +143,8 @@
"execution_count": null,
"id": "aac9563e",
"metadata": {
- "tags": [],
- "id": "aac9563e"
+ "id": "aac9563e",
+ "tags": []
},
"outputs": [],
"source": [
@@ -169,8 +159,8 @@
"execution_count": null,
"id": "a3c3999a",
"metadata": {
- "tags": [],
- "id": "a3c3999a"
+ "id": "a3c3999a",
+ "tags": []
},
"outputs": [],
"source": [
@@ -189,8 +179,8 @@
"execution_count": null,
"id": "12eb86d8",
"metadata": {
- "tags": [],
- "id": "12eb86d8"
+ "id": "12eb86d8",
+ "tags": []
},
"outputs": [],
"source": [
@@ -235,43 +225,49 @@
},
{
"cell_type": "markdown",
- "source": [
- "# ElasticKnnSearch Class\n",
- "The `ElasticKnnSearch` implements features allowing storing vectors and documents in Elasticsearch for use with approximate [kNN search](https://www.elastic.co/guide/en/elasticsearch/reference/current/knn-search.html)"
- ],
+ "id": "FheGPztJsrRB",
"metadata": {
"id": "FheGPztJsrRB"
},
- "id": "FheGPztJsrRB"
+ "source": [
+ "# ElasticKnnSearch Class\n",
+ "The `ElasticKnnSearch` implements features allowing storing vectors and documents in Elasticsearch for use with approximate [kNN search](https://www.elastic.co/guide/en/elasticsearch/reference/current/knn-search.html)"
+ ]
},
{
"cell_type": "code",
- "source": [
- "!pip install langchain elasticsearch"
- ],
+ "execution_count": null,
+ "id": "gRVcbh5zqCJQ",
"metadata": {
"id": "gRVcbh5zqCJQ"
},
- "execution_count": null,
"outputs": [],
- "id": "gRVcbh5zqCJQ"
+ "source": [
+ "!pip install langchain elasticsearch"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "id": "TJtqiw5AqBp8",
+ "metadata": {
+ "id": "TJtqiw5AqBp8"
+ },
+ "outputs": [],
"source": [
"from langchain.vectorstores.elastic_vector_search import ElasticKnnSearch\n",
"from langchain.embeddings import ElasticsearchEmbeddings\n",
"import elasticsearch"
- ],
- "metadata": {
- "id": "TJtqiw5AqBp8"
- },
- "execution_count": null,
- "outputs": [],
- "id": "TJtqiw5AqBp8"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "id": "XHfC0As6qN3T",
+ "metadata": {
+ "id": "XHfC0As6qN3T"
+ },
+ "outputs": [],
"source": [
"# Initialize ElasticsearchEmbeddings\n",
"model_id = \"\"\n",
@@ -281,16 +277,16 @@
"es_password = \"es_pass\"\n",
"test_index = \"\"\n",
"# input_field = \"your_input_field\" # if different from 'text_field'"
- ],
- "metadata": {
- "id": "XHfC0As6qN3T"
- },
- "execution_count": null,
- "outputs": [],
- "id": "XHfC0As6qN3T"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "id": "UkTipx1lqc3h",
+ "metadata": {
+ "id": "UkTipx1lqc3h"
+ },
+ "outputs": [],
"source": [
"# Generate embedding object\n",
"embeddings = ElasticsearchEmbeddings.from_credentials(\n",
@@ -300,16 +296,16 @@
" es_user=es_user,\n",
" es_password=es_password,\n",
")"
- ],
- "metadata": {
- "id": "UkTipx1lqc3h"
- },
- "execution_count": null,
- "outputs": [],
- "id": "UkTipx1lqc3h"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "id": "74psgD0oqjYK",
+ "metadata": {
+ "id": "74psgD0oqjYK"
+ },
+ "outputs": [],
"source": [
"# Initialize ElasticKnnSearch\n",
"knn_search = ElasticKnnSearch(\n",
@@ -319,26 +315,26 @@
" index_name=test_index,\n",
" embedding=embeddings,\n",
")"
- ],
- "metadata": {
- "id": "74psgD0oqjYK"
- },
- "execution_count": null,
- "outputs": [],
- "id": "74psgD0oqjYK"
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "## Test adding vectors"
- ],
+ "id": "7AfgIKLWqnQl",
"metadata": {
"id": "7AfgIKLWqnQl"
},
- "id": "7AfgIKLWqnQl"
+ "source": [
+ "## Test adding vectors"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "id": "yNUUIaL9qmze",
+ "metadata": {
+ "id": "yNUUIaL9qmze"
+ },
+ "outputs": [],
"source": [
"# Test `add_texts` method\n",
"texts = [\"Hello, world!\", \"Machine learning is fun.\", \"I love Python.\"]\n",
@@ -351,26 +347,26 @@
" \"Python is great for data analysis.\",\n",
"]\n",
"knn_search.from_texts(new_texts, dims=dims)"
- ],
- "metadata": {
- "id": "yNUUIaL9qmze"
- },
- "execution_count": null,
- "outputs": [],
- "id": "yNUUIaL9qmze"
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "## Test knn search using query vector builder "
- ],
+ "id": "0zdR-Iubquov",
"metadata": {
"id": "0zdR-Iubquov"
},
- "id": "0zdR-Iubquov"
+ "source": [
+ "## Test knn search using query vector builder "
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "id": "bwR4jYvqqxTo",
+ "metadata": {
+ "id": "bwR4jYvqqxTo"
+ },
+ "outputs": [],
"source": [
"# Test `knn_search` method with model_id and query_text\n",
"query = \"Hello\"\n",
@@ -387,26 +383,26 @@
"print(\n",
" f\"The 'text' field value from the top hit is: '{hybrid_result['hits']['hits'][0]['_source']['text']}'\"\n",
")"
- ],
- "metadata": {
- "id": "bwR4jYvqqxTo"
- },
- "execution_count": null,
- "outputs": [],
- "id": "bwR4jYvqqxTo"
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "## Test knn search using pre generated vector \n"
- ],
+ "id": "ltXYqp0qqz7R",
"metadata": {
"id": "ltXYqp0qqz7R"
},
- "id": "ltXYqp0qqz7R"
+ "source": [
+ "## Test knn search using pre generated vector \n"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "id": "O5COtpTqq23t",
+ "metadata": {
+ "id": "O5COtpTqq23t"
+ },
+ "outputs": [],
"source": [
"# Generate embedding for tests\n",
"query_text = \"Hello\"\n",
@@ -428,26 +424,26 @@
"print(\n",
" f\"The 'text' field value from the top hit is: '{knn_result['hits']['hits'][0]['_source']['text']}'\"\n",
")"
- ],
- "metadata": {
- "id": "O5COtpTqq23t"
- },
- "execution_count": null,
- "outputs": [],
- "id": "O5COtpTqq23t"
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "## Test source option"
- ],
+ "id": "0dnmimcJq42C",
"metadata": {
"id": "0dnmimcJq42C"
},
- "id": "0dnmimcJq42C"
+ "source": [
+ "## Test source option"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "id": "v4_B72nHq7g1",
+ "metadata": {
+ "id": "v4_B72nHq7g1"
+ },
+ "outputs": [],
"source": [
"# Test `knn_search` method with model_id and query_text\n",
"query = \"Hello\"\n",
@@ -460,26 +456,26 @@
" query=query, model_id=model_id, k=2, source=False\n",
")\n",
"assert not \"_source\" in hybrid_result[\"hits\"][\"hits\"][0].keys()"
- ],
- "metadata": {
- "id": "v4_B72nHq7g1"
- },
- "execution_count": null,
- "outputs": [],
- "id": "v4_B72nHq7g1"
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "## Test fields option "
- ],
+ "id": "teHgJgrlq-Jb",
"metadata": {
"id": "teHgJgrlq-Jb"
},
- "id": "teHgJgrlq-Jb"
+ "source": [
+ "## Test fields option "
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "id": "utNBbpZYrAYW",
+ "metadata": {
+ "id": "utNBbpZYrAYW"
+ },
+ "outputs": [],
"source": [
"# Test `knn_search` method with model_id and query_text\n",
"query = \"Hello\"\n",
@@ -492,72 +488,72 @@
" query=query, model_id=model_id, k=2, fields=[\"text\"]\n",
")\n",
"assert \"text\" in hybrid_result[\"hits\"][\"hits\"][0][\"fields\"].keys()"
- ],
- "metadata": {
- "id": "utNBbpZYrAYW"
- },
- "execution_count": null,
- "outputs": [],
- "id": "utNBbpZYrAYW"
+ ]
},
{
"cell_type": "markdown",
- "source": [
- "### Test with es client connection rather than cloud_id "
- ],
+ "id": "hddsIFferBy1",
"metadata": {
"id": "hddsIFferBy1"
},
- "id": "hddsIFferBy1"
+ "source": [
+ "### Test with es client connection rather than cloud_id "
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "id": "bXqrUnoirFia",
+ "metadata": {
+ "id": "bXqrUnoirFia"
+ },
+ "outputs": [],
"source": [
"# Create Elasticsearch connection\n",
"es_connection = Elasticsearch(\n",
" hosts=[\"https://es_cluster_url:port\"], basic_auth=(\"user\", \"password\")\n",
")"
- ],
- "metadata": {
- "id": "bXqrUnoirFia"
- },
- "execution_count": null,
- "outputs": [],
- "id": "bXqrUnoirFia"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "id": "TIM__Hm8rSEW",
+ "metadata": {
+ "id": "TIM__Hm8rSEW"
+ },
+ "outputs": [],
"source": [
"# Instantiate ElasticsearchEmbeddings using es_connection\n",
"embeddings = ElasticsearchEmbeddings.from_es_connection(\n",
" model_id,\n",
" es_connection,\n",
")"
- ],
- "metadata": {
- "id": "TIM__Hm8rSEW"
- },
- "execution_count": null,
- "outputs": [],
- "id": "TIM__Hm8rSEW"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "id": "1-CdnOrArVc_",
+ "metadata": {
+ "id": "1-CdnOrArVc_"
+ },
+ "outputs": [],
"source": [
"# Initialize ElasticKnnSearch\n",
"knn_search = ElasticKnnSearch(\n",
" es_connection=es_connection, index_name=test_index, embedding=embeddings\n",
")"
- ],
- "metadata": {
- "id": "1-CdnOrArVc_"
- },
- "execution_count": null,
- "outputs": [],
- "id": "1-CdnOrArVc_"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "id": "0kgyaL6QrYVF",
+ "metadata": {
+ "id": "0kgyaL6QrYVF"
+ },
+ "outputs": [],
"source": [
"# Test `knn_search` method with model_id and query_text\n",
"query = \"Hello\"\n",
@@ -566,16 +562,13 @@
"print(\n",
" f\"The 'text' field value from the top hit is: '{knn_result['hits']['hits'][0]['_source']['text']}'\"\n",
")"
- ],
- "metadata": {
- "id": "0kgyaL6QrYVF"
- },
- "execution_count": null,
- "outputs": [],
- "id": "0kgyaL6QrYVF"
+ ]
}
],
"metadata": {
+ "colab": {
+ "provenance": []
+ },
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
@@ -592,11 +585,8 @@
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
- },
- "colab": {
- "provenance": []
}
},
"nbformat": 4,
"nbformat_minor": 5
-}
\ No newline at end of file
+}
diff --git a/docs/extras/modules/data_connection/vectorstores/integrations/hologres.ipynb b/docs/extras/modules/data_connection/vectorstores/integrations/hologres.ipynb
index 1d671cd6bde..77ff7bf032e 100644
--- a/docs/extras/modules/data_connection/vectorstores/integrations/hologres.ipynb
+++ b/docs/extras/modules/data_connection/vectorstores/integrations/hologres.ipynb
@@ -16,6 +16,15 @@
"Click [here](https://www.alibabacloud.com/zh/product/hologres) to fast deploy a Hologres cloud instance."
]
},
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#!pip install psycopg2"
+ ]
+ },
{
"cell_type": "code",
"execution_count": 1,
@@ -149,7 +158,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.9.16"
+ "version": "3.10.6"
}
},
"nbformat": 4,
diff --git a/docs/extras/modules/data_connection/vectorstores/integrations/mongodb_atlas_vector_search.ipynb b/docs/extras/modules/data_connection/vectorstores/integrations/mongodb_atlas.ipynb
similarity index 99%
rename from docs/extras/modules/data_connection/vectorstores/integrations/mongodb_atlas_vector_search.ipynb
rename to docs/extras/modules/data_connection/vectorstores/integrations/mongodb_atlas.ipynb
index ddb7f28fd98..a56fc73cf50 100644
--- a/docs/extras/modules/data_connection/vectorstores/integrations/mongodb_atlas_vector_search.ipynb
+++ b/docs/extras/modules/data_connection/vectorstores/integrations/mongodb_atlas.ipynb
@@ -5,7 +5,7 @@
"id": "683953b3",
"metadata": {},
"source": [
- "# MongoDB Atlas Vector Search\n",
+ "# MongoDB Atlas\n",
"\n",
">[MongoDB Atlas](https://www.mongodb.com/docs/atlas/) is a fully-managed cloud database available in AWS , Azure, and GCP. It now has support for native Vector Search on your MongoDB document data.\n",
"\n",
@@ -214,7 +214,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.9.1"
+ "version": "3.10.6"
}
},
"nbformat": 4,
diff --git a/docs/extras/modules/data_connection/vectorstores/integrations/opensearch.ipynb b/docs/extras/modules/data_connection/vectorstores/integrations/opensearch.ipynb
index ee9fa2760e9..7d3d73136da 100644
--- a/docs/extras/modules/data_connection/vectorstores/integrations/opensearch.ipynb
+++ b/docs/extras/modules/data_connection/vectorstores/integrations/opensearch.ipynb
@@ -96,7 +96,7 @@
"id": "01a9a035",
"metadata": {},
"source": [
- "### similarity_search using Approximate k-NN\n",
+ "## similarity_search using Approximate k-NN\n",
"\n",
"`similarity_search` using `Approximate k-NN` Search with Custom Parameters"
]
@@ -182,7 +182,7 @@
"id": "0d0cd877",
"metadata": {},
"source": [
- "### similarity_search using Script Scoring\n",
+ "## similarity_search using Script Scoring\n",
"\n",
"`similarity_search` using `Script Scoring` with Custom Parameters"
]
@@ -221,7 +221,7 @@
"id": "a4af96cc",
"metadata": {},
"source": [
- "### similarity_search using Painless Scripting\n",
+ "## similarity_search using Painless Scripting\n",
"\n",
"`similarity_search` using `Painless Scripting` with Custom Parameters"
]
@@ -258,32 +258,35 @@
},
{
"cell_type": "markdown",
+ "id": "4f8fb0d0",
+ "metadata": {},
"source": [
- "### Maximum marginal relevance search (MMR)\n",
+ "## Maximum marginal relevance search (MMR)\n",
"If you’d like to look up for some similar documents, but you’d also like to receive diverse results, MMR is method you should consider. Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents."
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "code",
"execution_count": null,
+ "id": "ba85e092",
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = docsearch.max_marginal_relevance_search(query, k=2, fetch_k=10, lambda_param=0.5)"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "markdown",
"id": "73264864",
"metadata": {},
"source": [
- "### Using a preexisting OpenSearch instance\n",
+ "## Using a preexisting OpenSearch instance\n",
"\n",
"It's also possible to use a preexisting OpenSearch instance with documents that already have vectors present."
]
@@ -330,7 +333,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.11.3"
+ "version": "3.10.6"
}
},
"nbformat": 4,
diff --git a/docs/extras/modules/data_connection/vectorstores/integrations/pgvector.ipynb b/docs/extras/modules/data_connection/vectorstores/integrations/pgvector.ipynb
index 292ed6c813b..381de0ee9f5 100644
--- a/docs/extras/modules/data_connection/vectorstores/integrations/pgvector.ipynb
+++ b/docs/extras/modules/data_connection/vectorstores/integrations/pgvector.ipynb
@@ -201,14 +201,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "## Similarity search with score"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Similarity Search with Euclidean Distance (Default)"
+ "## Similarity Search with Euclidean Distance (Default)"
]
},
{
@@ -303,14 +296,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "## Working with vectorstore in PG"
+ "## Working with vectorstore"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "### Uploading a vectorstore in PG "
+ "### Uploading a vectorstore"
]
},
{
@@ -336,7 +329,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "### Retrieving a vectorstore in PG"
+ "### Retrieving a vectorstore"
]
},
{
@@ -498,7 +491,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.9.7"
+ "version": "3.10.6"
}
},
"nbformat": 4,
diff --git a/docs/extras/modules/data_connection/vectorstores/integrations/rockset_vector_database.ipynb b/docs/extras/modules/data_connection/vectorstores/integrations/rockset.ipynb
similarity index 91%
rename from docs/extras/modules/data_connection/vectorstores/integrations/rockset_vector_database.ipynb
rename to docs/extras/modules/data_connection/vectorstores/integrations/rockset.ipynb
index 0c44fa35797..bf96c786cd1 100644
--- a/docs/extras/modules/data_connection/vectorstores/integrations/rockset_vector_database.ipynb
+++ b/docs/extras/modules/data_connection/vectorstores/integrations/rockset.ipynb
@@ -1,20 +1,18 @@
{
"cells": [
{
- "attachments": {},
"cell_type": "markdown",
"id": "20b588b4",
"metadata": {},
"source": [
- "# Rockset Vector Search\n",
+ "# Rockset\n",
"\n",
- "[Rockset](https://rockset.com/product/) is a real-time analytics database service for serving low latency, high concurrency analytical queries at scale. It builds a Converged Index™ on structured and semi-structured data with an efficient store for vector embeddings. Its support for running SQL on schemaless data makes it a perfect choice for running vector search with metadata filters. \n",
+ ">[Rockset](https://rockset.com/product/) is a real-time analytics database service for serving low latency, high concurrency analytical queries at scale. It builds a Converged Index™ on structured and semi-structured data with an efficient store for vector embeddings. Its support for running SQL on schemaless data makes it a perfect choice for running vector search with metadata filters. \n",
"\n",
- "This notebook demonstrates how to use Rockset as a vectorstore in langchain. To get started, make sure you have a Rockset account and an API key available."
+ "This notebook demonstrates how to use `Rockset` as a vectorstore in langchain. To get started, make sure you have a `Rockset` account and an API key available."
]
},
{
- "attachments": {},
"cell_type": "markdown",
"id": "e290ddc0",
"metadata": {},
@@ -25,7 +23,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"id": "7d77bbbe",
"metadata": {},
@@ -52,7 +49,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"id": "7951c9cd",
"metadata": {},
@@ -71,7 +67,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"id": "8600900d",
"metadata": {},
@@ -80,12 +75,11 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"id": "3bf2f818",
"metadata": {},
"source": [
- "## Using Rockset langchain vectorstore"
+ "## Example"
]
},
{
@@ -109,7 +103,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"id": "474636a2",
"metadata": {},
@@ -138,7 +131,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"id": "1404cada",
"metadata": {},
@@ -173,7 +165,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"id": "f1290844",
"metadata": {},
@@ -205,7 +196,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"id": "5e15d630",
"metadata": {},
@@ -243,7 +233,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"id": "0765b822",
"metadata": {},
@@ -266,7 +255,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"id": "03fa12a9",
"metadata": {},
@@ -277,6 +265,14 @@
"\n",
"Keep an eye on https://rockset.com/blog/introducing-vector-search-on-rockset/ for future updates in this space!"
]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "2763dddb-e87d-4d3b-b0bf-c246b0573d87",
+ "metadata": {},
+ "outputs": [],
+ "source": []
}
],
"metadata": {
@@ -295,7 +291,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.9.6"
+ "version": "3.10.6"
}
},
"nbformat": 4,
diff --git a/docs/extras/modules/data_connection/vectorstores/integrations/singlestoredb.ipynb b/docs/extras/modules/data_connection/vectorstores/integrations/singlestoredb.ipynb
index c011e950778..a70370e82ee 100644
--- a/docs/extras/modules/data_connection/vectorstores/integrations/singlestoredb.ipynb
+++ b/docs/extras/modules/data_connection/vectorstores/integrations/singlestoredb.ipynb
@@ -6,7 +6,9 @@
"metadata": {},
"source": [
"# SingleStoreDB\n",
- "[SingleStoreDB](https://singlestore.com/) is a high-performance distributed SQL database that supports deployment both in the [cloud](https://www.singlestore.com/cloud/) and on-premises. It provides vector storage, and vector functions including [dot_product](https://docs.singlestore.com/managed-service/en/reference/sql-reference/vector-functions/dot_product.html) and [euclidean_distance](https://docs.singlestore.com/managed-service/en/reference/sql-reference/vector-functions/euclidean_distance.html), thereby supporting AI applications that require text similarity matching. This tutorial illustrates how to [work with vector data in SingleStoreDB](https://docs.singlestore.com/managed-service/en/developer-resources/functional-extensions/working-with-vector-data.html)."
+ ">[SingleStoreDB](https://singlestore.com/) is a high-performance distributed SQL database that supports deployment both in the [cloud](https://www.singlestore.com/cloud/) and on-premises. It provides vector storage, and vector functions including [dot_product](https://docs.singlestore.com/managed-service/en/reference/sql-reference/vector-functions/dot_product.html) and [euclidean_distance](https://docs.singlestore.com/managed-service/en/reference/sql-reference/vector-functions/euclidean_distance.html), thereby supporting AI applications that require text similarity matching. \n",
+ "\n",
+ "This tutorial illustrates how to [work with vector data in SingleStoreDB](https://docs.singlestore.com/managed-service/en/developer-resources/functional-extensions/working-with-vector-data.html)."
]
},
{
@@ -129,7 +131,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.9.2"
+ "version": "3.10.6"
}
},
"nbformat": 4,
diff --git a/docs/extras/modules/data_connection/vectorstores/integrations/sklearn.ipynb b/docs/extras/modules/data_connection/vectorstores/integrations/sklearn.ipynb
index cca192ab47b..b93c734a74f 100644
--- a/docs/extras/modules/data_connection/vectorstores/integrations/sklearn.ipynb
+++ b/docs/extras/modules/data_connection/vectorstores/integrations/sklearn.ipynb
@@ -1,13 +1,12 @@
{
"cells": [
{
- "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
- "# SKLearnVectorStore\n",
+ "# scikit-learn\n",
"\n",
- "[scikit-learn](https://scikit-learn.org/stable/) is an open source collection of machine learning algorithms, including some implementations of the [k nearest neighbors](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html). `SKLearnVectorStore` wraps this implementation and adds the possibility to persist the vector store in json, bson (binary json) or Apache Parquet format.\n",
+ ">[scikit-learn](https://scikit-learn.org/stable/) is an open source collection of machine learning algorithms, including some implementations of the [k nearest neighbors](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html). `SKLearnVectorStore` wraps this implementation and adds the possibility to persist the vector store in json, bson (binary json) or Apache Parquet format.\n",
"\n",
"This notebook shows how to use the `SKLearnVectorStore` vector database."
]
@@ -28,7 +27,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -48,7 +46,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -76,7 +73,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -120,7 +116,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -190,7 +185,6 @@
]
},
{
- "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -209,7 +203,7 @@
],
"metadata": {
"kernelspec": {
- "display_name": "sofia",
+ "display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -223,10 +217,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.8.16"
- },
- "orig_nbformat": 4
+ "version": "3.10.6"
+ }
},
"nbformat": 4,
- "nbformat_minor": 2
+ "nbformat_minor": 4
}
diff --git a/docs/extras/modules/data_connection/vectorstores/integrations/starrocks.ipynb b/docs/extras/modules/data_connection/vectorstores/integrations/starrocks.ipynb
index 84d640eb71d..515002a0bff 100644
--- a/docs/extras/modules/data_connection/vectorstores/integrations/starrocks.ipynb
+++ b/docs/extras/modules/data_connection/vectorstores/integrations/starrocks.ipynb
@@ -7,11 +7,10 @@
"source": [
"# StarRocks\n",
"\n",
- "[StarRocks | A High-Performance Analytical Database](https://www.starrocks.io/)\n",
+    ">[StarRocks](https://www.starrocks.io/) is a high-performance analytical database.\n",
+    ">`StarRocks` is a next-gen, sub-second MPP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.\n",
"\n",
- "StarRocks is a next-gen sub-second MPP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics and ad-hoc query.\n",
- "\n",
- "Usually StarRocks is categorized into OLAP, and it has showed excellent performance in [ClickBench — a Benchmark For Analytical DBMS](https://benchmark.clickhouse.com/). Since it has a super-fast vectorized execution engine, it could also be used as a fast vectordb.\n",
+    ">Usually `StarRocks` is categorized as an OLAP database, and it has shown excellent performance in [ClickBench — a Benchmark For Analytical DBMS](https://benchmark.clickhouse.com/). Since it has a super-fast vectorized execution engine, it can also be used as a fast vector database.\n",
"\n",
"Here we'll show how to use the StarRocks Vector Store."
]
@@ -21,8 +20,17 @@
"id": "1685854f",
"metadata": {},
"source": [
- "\n",
- "## Import all used modules"
+ "## Setup"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "311d44bb-4aca-4f3b-8f97-5e1f29238e40",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#!pip install pymysql"
]
},
{
@@ -305,7 +313,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.11.3"
+ "version": "3.10.6"
}
},
"nbformat": 4,
diff --git a/docs/extras/modules/data_connection/vectorstores/integrations/tigris.ipynb b/docs/extras/modules/data_connection/vectorstores/integrations/tigris.ipynb
index e3718a66915..ba529c1033b 100644
--- a/docs/extras/modules/data_connection/vectorstores/integrations/tigris.ipynb
+++ b/docs/extras/modules/data_connection/vectorstores/integrations/tigris.ipynb
@@ -2,68 +2,67 @@
"cells": [
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Tigris\n",
"\n",
"> [Tigris](htttps://tigrisdata.com) is an open source Serverless NoSQL Database and Search Platform designed to simplify building high-performance vector search applications.\n",
- "> Tigris eliminates the infrastructure complexity of managing, operating, and synchronizing multiple tools, allowing you to focus on building great applications instead."
- ],
- "metadata": {
- "collapsed": false
- }
+ "> `Tigris` eliminates the infrastructure complexity of managing, operating, and synchronizing multiple tools, allowing you to focus on building great applications instead."
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"This notebook guides you how to use Tigris as your VectorStore"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"**Pre requisites**\n",
"1. An OpenAI account. You can sign up for an account [here](https://platform.openai.com/)\n",
"2. [Sign up for a free Tigris account](https://console.preview.tigrisdata.cloud). Once you have signed up for the Tigris account, create a new project called `vectordemo`. Next, make a note of the *Uri* for the region you've created your project in, the **clientId** and **clientSecret**. You can get all this information from the **Application Keys** section of the project."
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"Let's first install our dependencies:"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "code",
"execution_count": null,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"!pip install tigrisdb openapi-schema-pydantic openai tiktoken"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"We will load the `OpenAI` api key and `Tigris` credentials in our environment"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "code",
"execution_count": null,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"import os\n",
@@ -73,38 +72,42 @@
"os.environ[\"TIGRIS_PROJECT\"] = getpass.getpass(\"Tigris Project Name:\")\n",
"os.environ[\"TIGRIS_CLIENT_ID\"] = getpass.getpass(\"Tigris Client Id:\")\n",
"os.environ[\"TIGRIS_CLIENT_SECRET\"] = getpass.getpass(\"Tigris Client Secret:\")"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "code",
"execution_count": null,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import Tigris\n",
"from langchain.document_loaders import TextLoader"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"### Initialize Tigris vector store\n",
"Let's import our test dataset:"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "code",
"execution_count": null,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"loader = TextLoader(\"../../../state_of_the_union.txt\")\n",
@@ -113,87 +116,89 @@
"docs = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "code",
"execution_count": null,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"vector_store = Tigris.from_documents(docs, embeddings, index_name=\"my_embeddings\")"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"### Similarity Search"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "code",
"execution_count": null,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"found_docs = vector_store.similarity_search(query)\n",
"print(found_docs)"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"### Similarity Search with score (vector distance)"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "code",
"execution_count": null,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"result = vector_store.similarity_search_with_score(query)\n",
"for doc, score in result:\n",
" print(f\"document={doc}, score={score}\")"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
}
],
"metadata": {
"kernelspec": {
- "display_name": "Python 3",
+ "display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
- "version": 2
+ "version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
- "pygments_lexer": "ipython2",
- "version": "2.7.6"
+ "pygments_lexer": "ipython3",
+ "version": "3.10.6"
}
},
"nbformat": 4,
- "nbformat_minor": 0
+ "nbformat_minor": 4
}
diff --git a/docs/extras/modules/data_connection/vectorstores/integrations/typesense.ipynb b/docs/extras/modules/data_connection/vectorstores/integrations/typesense.ipynb
index a00fe58f73a..a547f5c640f 100644
--- a/docs/extras/modules/data_connection/vectorstores/integrations/typesense.ipynb
+++ b/docs/extras/modules/data_connection/vectorstores/integrations/typesense.ipynb
@@ -2,6 +2,7 @@
"cells": [
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Typesense\n",
"\n",
@@ -10,97 +11,105 @@
"> Typesense focuses on performance by storing the entire index in RAM (with a backup on disk) and also focuses on providing an out-of-the-box developer experience by simplifying available options and setting good defaults.\n",
">\n",
"> It also lets you combine attribute-based filtering together with vector queries, to fetch the most relevant documents."
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"This notebook shows you how to use Typesense as your VectorStore."
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"Let's first install our dependencies:"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "code",
"execution_count": null,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"!pip install typesense openapi-schema-pydantic openai tiktoken"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "code",
"execution_count": 2,
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-05-23T22:48:02.968822Z",
+ "start_time": "2023-05-23T22:47:48.574094Z"
+ },
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"import os\n",
"import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")"
- ],
- "metadata": {
- "collapsed": false,
- "ExecuteTime": {
- "end_time": "2023-05-23T22:48:02.968822Z",
- "start_time": "2023-05-23T22:47:48.574094Z"
- }
- }
+ ]
},
{
"cell_type": "code",
"execution_count": 6,
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-05-23T22:50:34.775893Z",
+ "start_time": "2023-05-23T22:50:34.771889Z"
+ },
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"from langchain.embeddings.openai import OpenAIEmbeddings\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import Typesense\n",
"from langchain.document_loaders import TextLoader"
- ],
- "metadata": {
- "collapsed": false,
- "ExecuteTime": {
- "end_time": "2023-05-23T22:50:34.775893Z",
- "start_time": "2023-05-23T22:50:34.771889Z"
- }
- }
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"Let's import our test dataset:"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "code",
"execution_count": 19,
+ "metadata": {
+ "ExecuteTime": {
+ "end_time": "2023-05-23T22:56:19.093489Z",
+ "start_time": "2023-05-23T22:56:19.089Z"
+ },
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"loader = TextLoader(\"../../../state_of_the_union.txt\")\n",
@@ -109,18 +118,17 @@
"docs = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
- ],
- "metadata": {
- "collapsed": false,
- "ExecuteTime": {
- "end_time": "2023-05-23T22:56:19.093489Z",
- "start_time": "2023-05-23T22:56:19.089Z"
- }
- }
+ ]
},
{
"cell_type": "code",
"execution_count": null,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"docsearch = Typesense.from_documents(\n",
@@ -134,98 +142,103 @@
" \"typesense_collection_name\": \"lang-chain\",\n",
" },\n",
")"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"## Similarity Search"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "code",
"execution_count": null,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"found_docs = docsearch.similarity_search(query)"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "code",
"execution_count": null,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"print(found_docs[0].page_content)"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"## Typesense as a Retriever\n",
"\n",
"Typesense, as all the other vector stores, is a LangChain Retriever, by using cosine similarity."
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "code",
"execution_count": null,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"retriever = docsearch.as_retriever()\n",
"retriever"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
},
{
"cell_type": "code",
"execution_count": null,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"retriever.get_relevant_documents(query)[0]"
- ],
- "metadata": {
- "collapsed": false
- }
+ ]
}
],
"metadata": {
"kernelspec": {
- "display_name": "Python 3",
+ "display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
- "version": 2
+ "version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
- "pygments_lexer": "ipython2",
- "version": "2.7.6"
+ "pygments_lexer": "ipython3",
+ "version": "3.10.6"
}
},
"nbformat": 4,
- "nbformat_minor": 0
+ "nbformat_minor": 4
}