diff --git a/docs/extras/integrations/vectorstores/vespa.ipynb b/docs/extras/integrations/vectorstores/vespa.ipynb
index c7500944093..62e2bd7679a 100644
--- a/docs/extras/integrations/vectorstores/vespa.ipynb
+++ b/docs/extras/integrations/vectorstores/vespa.ipynb
@@ -1,883 +1,922 @@
 {
- "cells": [
-  {
-   "cell_type": "markdown",
-   "id": "ce0f17b9",
-   "metadata": {},
-   "source": [
-    "# Vespa\n",
-    "\n",
-    ">[Vespa](https://vespa.ai/) is a fully featured search engine and vector database. It supports vector search (ANN), lexical search, and search in structured data, all in the same query.\n",
-    "\n",
-    "This notebook shows how to use `Vespa.ai` as a LangChain vector store.\n",
-    "\n",
-    "In order to create the vector store, we use\n",
-    "[pyvespa](https://pyvespa.readthedocs.io/en/latest/index.html) to create a\n",
-    "connection a `Vespa` service."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "7e6a11ab-38bd-4920-ba11-60cb2f075754",
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [],
-   "source": [
-    "#!pip install pyvespa"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "Using the `pyvespa` package, you can either connect to a\n",
-    "[Vespa Cloud instance](https://pyvespa.readthedocs.io/en/latest/deploy-vespa-cloud.html)\n",
-    "or a local\n",
-    "[Docker instance](https://pyvespa.readthedocs.io/en/latest/deploy-docker.html).\n",
-    "Here, we will create a new Vespa application and deploy that using Docker.\n",
-    "\n",
-    "#### Creating a Vespa application\n",
-    "\n",
-    "First, we need to create an application package:"
-   ],
-   "metadata": {
-    "collapsed": false
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "from vespa.package import ApplicationPackage, Field, RankProfile\n",
-    "\n",
-    "app_package = ApplicationPackage(name=\"testapp\")\n",
-    "app_package.schema.add_fields(\n",
-    "    Field(name=\"text\", type=\"string\", indexing=[\"index\", \"summary\"], index=\"enable-bm25\"),\n",
-    "    Field(name=\"embedding\", type=\"tensor<float>(x[384])\",\n",
-    "          indexing=[\"attribute\", \"summary\"],\n",
-    "          attribute=[f\"distance-metric: angular\"]),\n",
-    ")\n",
-    "app_package.schema.add_rank_profile(\n",
-    "    RankProfile(name=\"default\",\n",
-    "                first_phase=\"closeness(field, embedding)\",\n",
-    "                inputs=[(\"query(query_embedding)\", \"tensor<float>(x[384])\")]\n",
-    "                )\n",
-    ")"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "id": "ce0f17b9",
+      "metadata": {},
+      "source": [
+        "# Vespa\n",
+        "\n",
+        ">[Vespa](https://vespa.ai/) is a fully featured search engine and vector database. It supports vector search (ANN), lexical search, and search in structured data, all in the same query.\n",
+        "\n",
+        "This notebook shows how to use `Vespa.ai` as a LangChain vector store.\n",
+        "\n",
+        "In order to create the vector store, we use\n",
+        "[pyvespa](https://pyvespa.readthedocs.io/en/latest/index.html) to create a\n",
+        "connection to a `Vespa` service."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "id": "7e6a11ab-38bd-4920-ba11-60cb2f075754",
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "#!pip install pyvespa"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Using the `pyvespa` package, you can either connect to a\n",
+        "[Vespa Cloud instance](https://pyvespa.readthedocs.io/en/latest/deploy-vespa-cloud.html)\n",
+        "or a local\n",
+        "[Docker instance](https://pyvespa.readthedocs.io/en/latest/deploy-docker.html).\n",
+        "Here, we will create a new Vespa application and deploy that using Docker.\n",
+        "\n",
+        "#### Creating a Vespa application\n",
+        "\n",
+        "First, we need to create an application package:"
+      ],
+      "metadata": {
+        "collapsed": false
+      },
+      "id": "283b49c9"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "from vespa.package import ApplicationPackage, Field, RankProfile\n",
+        "\n",
+        "app_package = ApplicationPackage(name=\"testapp\")\n",
+        "app_package.schema.add_fields(\n",
+        "    Field(name=\"text\", type=\"string\", indexing=[\"index\", \"summary\"], index=\"enable-bm25\"),\n",
+        "    Field(name=\"embedding\", type=\"tensor<float>(x[384])\",\n",
+        "          indexing=[\"attribute\", \"summary\"],\n",
+        "          attribute=[\"distance-metric: angular\"]),\n",
+        ")\n",
+        "app_package.schema.add_rank_profile(\n",
+        "    RankProfile(name=\"default\",\n",
+        "                first_phase=\"closeness(field, embedding)\",\n",
+        "                inputs=[(\"query(query_embedding)\", \"tensor<float>(x[384])\")]\n",
+        "                )\n",
+        ")"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "91150665"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "This sets up a Vespa application with a schema for each document that contains\n",
+        "two fields: `text` for holding the document text and `embedding` for holding\n",
+        "the embedding vector. The `text` field is set up to use a BM25 index for\n",
+        "efficient text retrieval, and we'll see how to use this and hybrid search a\n",
+        "bit later.\n",
+        "\n",
+        "The `embedding` field is set up with a vector of length 384 to hold the\n",
+        "embedding representation of the text. See\n",
+        "[Vespa's Tensor Guide](https://docs.vespa.ai/en/tensor-user-guide.html)\n",
+        "for more on tensors in Vespa.\n",
+        "\n",
+        "Lastly, we add a [rank profile](https://docs.vespa.ai/en/ranking.html) to\n",
+        "instruct Vespa how to order documents. Here we set this up with a\n",
+        "[nearest neighbor search](https://docs.vespa.ai/en/nearest-neighbor-search.html).\n",
+        "\n",
+        "Now we can deploy this application locally:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "15477106"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 2,
+      "id": "c10dd962",
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "from vespa.deployment import VespaDocker\n",
+        "\n",
+        "vespa_docker = VespaDocker()\n",
+        "vespa_app = vespa_docker.deploy(application_package=app_package)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "id": "3df4ce53",
+      "metadata": {},
+      "source": [
+        "This deploys and creates a connection to a `Vespa` service. In case you\n",
+        "already have a Vespa application running, for instance in the cloud,\n",
+        "please refer to the PyVespa documentation for how to connect.\n",
+        "\n",
+        "#### Creating a Vespa vector store\n",
+        "\n",
+        "Now, let's load some documents:"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "from langchain.document_loaders import TextLoader\n",
+        "from langchain.text_splitter import CharacterTextSplitter\n",
+        "\n",
+        "loader = TextLoader(\"../../modules/state_of_the_union.txt\")\n",
+        "documents = loader.load()\n",
+        "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
+        "docs = text_splitter.split_documents(documents)\n",
+        "\n",
+        "from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings\n",
+        "\n",
+        "embedding_function = SentenceTransformerEmbeddings(model_name=\"all-MiniLM-L6-v2\")"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "7abde491"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Here, we also set up a local sentence embedder to transform the text to embedding\n",
+        "vectors. One could also use OpenAI embeddings, but the vector length needs to\n",
+        "be updated to `1536` to reflect the larger size of that embedding.\n",
+        "\n",
+        "To feed these to Vespa, we need to configure how the vector store should map to\n",
+        "fields in the Vespa application. Then we create the vector store directly from\n",
+        "this set of documents:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "d42365c7"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "vespa_config = dict(\n",
+        "    page_content_field=\"text\",\n",
+        "    embedding_field=\"embedding\",\n",
+        "    input_field=\"query_embedding\"\n",
+        ")\n",
+        "\n",
+        "from langchain.vectorstores import VespaStore\n",
+        "\n",
+        "db = VespaStore.from_documents(docs, embedding_function, app=vespa_app, **vespa_config)"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "0b647878"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "This creates a Vespa vector store and feeds that set of documents to Vespa.\n",
+        "The vector store takes care of calling the embedding function for each document\n",
+        "and inserts them into the database.\n",
+        "\n",
+        "We can now query the vector store:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "d6bd0aab"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "id": "7ccca1f4",
+      "metadata": {
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "outputs": [],
+      "source": [
+        "query = \"What did the president say about Ketanji Brown Jackson\"\n",
+        "results = db.similarity_search(query)\n",
+        "\n",
+        "print(results[0].page_content)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "id": "1e7e34e1",
+      "metadata": {
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "source": [
+        "This will use the embedding function given above to create a representation\n",
+        "for the query and use that to search Vespa. Note that this will use the\n",
+        "`default` ranking function, which we set up in the application package\n",
+        "above. You can use the `ranking` argument to `similarity_search` to\n",
+        "specify which ranking function to use.\n",
+        "\n",
+        "Please refer to the [pyvespa documentation](https://pyvespa.readthedocs.io/en/latest/getting-started-pyvespa.html#Query)\n",
+        "for more information.\n",
+        "\n",
+        "This covers the basic usage of the Vespa store in LangChain.\n",
+        "Now you can return the results and continue using these in LangChain.\n",
+        "\n",
+        "#### Updating documents\n",
+        "\n",
+        "As an alternative to calling `from_documents`, you can create the vector\n",
+        "store directly and call `add_texts` from that. This can also be used to update\n",
+        "documents:"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "query = \"What did the president say about Ketanji Brown Jackson\"\n",
+        "results = db.similarity_search(query)\n",
+        "result = results[0]\n",
+        "\n",
+        "result.page_content = \"UPDATED: \" + result.page_content\n",
+        "db.add_texts([result.page_content], [result.metadata], result.metadata[\"id\"])\n",
+        "\n",
+        "results = db.similarity_search(query)\n",
+        "print(results[0].page_content)"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "a5256284"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "However, the `pyvespa` library contains methods to manipulate\n",
+        "content on Vespa which you can use directly.\n",
+        "\n",
+        "#### Deleting documents\n",
+        "\n",
+        "You can delete documents using the `delete` function:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "2526b50e"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "result = db.similarity_search(query)\n",
+        "# result[0].metadata[\"id\"] == \"id:testapp:testapp::32\"\n",
+        "\n",
+        "db.delete([\"32\"])\n",
+        "result = db.similarity_search(query)\n",
+        "# result[0].metadata[\"id\"] != \"id:testapp:testapp::32\""
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "52cab87e"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Again, the `pyvespa` connection contains methods to delete documents as well.\n",
+        "\n",
+        "### Returning with scores\n",
+        "\n",
+        "The `similarity_search` method only returns the documents in order of\n",
+        "relevancy. To retrieve the actual scores:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "deffaba5"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "results = db.similarity_search_with_score(query)\n",
+        "result = results[0]\n",
+        "# result[1] ~= 0.463"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "cd9ae173"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "This score is the result of using the `\"all-MiniLM-L6-v2\"` embedding model with the\n",
+        "cosine distance function (as given by the `angular` distance metric in the\n",
+        "application package).\n",
+        "\n",
+        "Different embedding functions need different distance functions, and Vespa\n",
+        "needs to know which distance function to use when ordering documents.\n",
+        "Please refer to the\n",
+        "[documentation on distance functions](https://docs.vespa.ai/en/reference/schema-reference.html#distance-metric)\n",
+        "for more information.\n",
+        "\n",
+        "### As retriever\n",
+        "\n",
+        "To use this vector store as a\n",
+        "[LangChain retriever](https://python.langchain.com/docs/modules/data_connection/retrievers/)\n",
+        "simply call the `as_retriever` function, which is a standard vector store\n",
+        "method:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "7257d67a"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "db = VespaStore.from_documents(docs, embedding_function, app=vespa_app, **vespa_config)\n",
+        "retriever = db.as_retriever()\n",
+        "query = \"What did the president say about Ketanji Brown Jackson\"\n",
+        "results = retriever.get_relevant_documents(query)\n",
+        "\n",
+        "# results[0].metadata[\"id\"] == \"id:testapp:testapp::32\""
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "7fb717a9"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "This allows for more general, unstructured, retrieval from the vector store.\n",
+        "\n",
+        "### Metadata\n",
+        "\n",
+        "In the example so far, we've only used the text and the embedding for that\n",
+        "text. Documents usually contain additional information, which in LangChain\n",
+        "is referred to as metadata.\n",
+        "\n",
+        "A Vespa schema can contain many fields of different types. Add them to the application\n",
+        "package:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "fba7f07e"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "app_package.schema.add_fields(\n",
+        "    # ...\n",
+        "    Field(name=\"date\", type=\"string\", indexing=[\"attribute\", \"summary\"]),\n",
+        "    Field(name=\"rating\", type=\"int\", indexing=[\"attribute\", \"summary\"]),\n",
+        "    Field(name=\"author\", type=\"string\", indexing=[\"attribute\", \"summary\"]),\n",
+        "    # ...\n",
+        ")\n",
+        "vespa_app = vespa_docker.deploy(application_package=app_package)"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "59cffcf2"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "We can add some metadata fields in the documents:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "eebef70c"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "# Add metadata\n",
+        "for i, doc in enumerate(docs):\n",
+        "    doc.metadata[\"date\"] = f\"2023-{(i % 12)+1}-{(i % 28)+1}\"\n",
+        "    doc.metadata[\"rating\"] = range(1, 6)[i % 5]\n",
+        "    doc.metadata[\"author\"] = [\"Joe Biden\", \"Unknown\"][min(i, 1)]"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "b21efbfa"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "And let the Vespa vector store know about these fields:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "9b42bd4d"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "vespa_config.update(dict(metadata_fields=[\"date\", \"rating\", \"author\"]))"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "6bb272f6"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Now, when searching for these documents, these fields will be returned.\n",
+        "Also, these fields can be filtered on:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "43818655"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "db = VespaStore.from_documents(docs, embedding_function, app=vespa_app, **vespa_config)\n",
+        "query = \"What did the president say about Ketanji Brown Jackson\"\n",
+        "results = db.similarity_search(query, filter=\"rating > 3\")\n",
+        "# results[0].metadata[\"id\"] == \"id:testapp:testapp::34\"\n",
+        "# results[0].metadata[\"author\"] == \"Unknown\""
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "831759f3"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "### Custom query\n",
+        "\n",
+        "If the default behavior of the similarity search does not fit your\n",
+        "requirements, you can always provide your own query. In that case, you\n",
+        "don't need to provide all of the configuration to the vector store;\n",
+        "you can write the query yourself.\n",
+        "\n",
+        "First, let's add a BM25 ranking function to our application:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "a49aad6e"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "from vespa.package import FieldSet\n",
+        "\n",
+        "app_package.schema.add_field_set(FieldSet(name=\"default\", fields=[\"text\"]))\n",
+        "app_package.schema.add_rank_profile(RankProfile(name=\"bm25\", first_phase=\"bm25(text)\"))\n",
+        "vespa_app = vespa_docker.deploy(application_package=app_package)\n",
+        "db = VespaStore.from_documents(docs, embedding_function, app=vespa_app, **vespa_config)"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "d0fb0562"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Then, to perform a regular text search based on BM25:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "fe607747"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "query = \"What did the president say about Ketanji Brown Jackson\"\n",
+        "custom_query = {\n",
+        "    \"yql\": \"select * from sources * where userQuery()\",\n",
+        "    \"query\": query,\n",
+        "    \"type\": \"weakAnd\",\n",
+        "    \"ranking\": \"bm25\",\n",
+        "    \"hits\": 4\n",
+        "}\n",
+        "results = db.similarity_search_with_score(query, custom_query=custom_query)\n",
+        "# results[0][0].metadata[\"id\"] == \"id:testapp:testapp::32\"\n",
+        "# results[0][1] ~= 14.384"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "cee245c3"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "All of the powerful search and query capabilities of Vespa can be used\n",
+        "by using a custom query. Please refer to the Vespa documentation on its\n",
+        "[Query API](https://docs.vespa.ai/en/query-api.html) for more details.\n",
+        "\n",
+        "### Hybrid search\n",
+        "\n",
+        "Hybrid search means using both a classic term-based search such as\n",
+        "BM25 and a vector search and combining the results. We need to create\n",
+        "a new rank profile for hybrid search on Vespa:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "41a4c081"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "app_package.schema.add_rank_profile(\n",
+        "    RankProfile(name=\"hybrid\",\n",
+        "                first_phase=\"log(bm25(text)) + 0.5 * closeness(field, embedding)\",\n",
+        "                inputs=[(\"query(query_embedding)\", \"tensor<float>(x[384])\")]\n",
+        "                )\n",
+        ")\n",
+        "vespa_app = vespa_docker.deploy(application_package=app_package)\n",
+        "db = VespaStore.from_documents(docs, embedding_function, app=vespa_app, **vespa_config)"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "bf73efc1"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Here, we score each document as a combination of its BM25 score and its\n",
+        "distance score. We can query using a custom query:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "40f48711"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "query = \"What did the president say about Ketanji Brown Jackson\"\n",
+        "query_embedding = embedding_function.embed_query(query)\n",
+        "nearest_neighbor_expression = \"{targetHits: 4}nearestNeighbor(embedding, query_embedding)\"\n",
+        "custom_query = {\n",
+        "    \"yql\": f\"select * from sources * where {nearest_neighbor_expression} and userQuery()\",\n",
+        "    \"query\": query,\n",
+        "    \"type\": \"weakAnd\",\n",
+        "    \"input.query(query_embedding)\": query_embedding,\n",
+        "    \"ranking\": \"hybrid\",\n",
+        "    \"hits\": 4\n",
+        "}\n",
+        "results = db.similarity_search_with_score(query, custom_query=custom_query)\n",
+        "# results[0][0].metadata[\"id\"] == \"id:testapp:testapp::32\"\n",
+        "# results[0][1] ~= 2.897"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "d2e289f0"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "### Native embedders in Vespa\n",
+        "\n",
+        "Up until this point we've used an embedding function in Python to provide\n",
+        "embeddings for the texts. Vespa supports embedding functions natively, so\n",
+        "you can defer this calculation to Vespa. One benefit is the ability to use\n",
+        "GPUs when embedding documents if you have a large collection.\n",
+        "\n",
+        "Please refer to [Vespa embeddings](https://docs.vespa.ai/en/embedding.html)\n",
+        "for more information.\n",
+        "\n",
+        "First, we need to modify our application package:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "958e269f"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "from vespa.package import Component, Parameter\n",
+        "\n",
+        "app_package.components = [\n",
+        "    Component(id=\"hf-embedder\", type=\"hugging-face-embedder\",\n",
+        "        parameters=[\n",
+        "            Parameter(\"transformer-model\", {\"path\": \"...\"}),\n",
+        "            Parameter(\"tokenizer-model\", {\"url\": \"...\"}),\n",
+        "        ]\n",
+        "    )\n",
+        "]\n",
+        "app_package.schema.add_fields(Field(\n",
+        "    name=\"hfembedding\", type=\"tensor<float>(x[384])\", is_document_field=False,\n",
+        "    indexing=[\"input text\", \"embed hf-embedder\", \"attribute\", \"summary\"],\n",
+        "    attribute=[\"distance-metric: angular\"],\n",
+        "))\n",
+        "app_package.schema.add_rank_profile(\n",
+        "    RankProfile(name=\"hf_similarity\",\n",
+        "                first_phase=\"closeness(field, hfembedding)\",\n",
+        "                inputs=[(\"query(query_embedding)\", \"tensor<float>(x[384])\")]\n",
+        "                )\n",
+        ")"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "56b9686c"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Please refer to the embeddings documentation on adding embedder models\n",
+        "and tokenizers to the application. Note that the `hfembedding` field\n",
+        "includes instructions for embedding using the `hf-embedder`.\n",
+        "\n",
+        "Now we can query with a custom query:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "5cd721a8"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "query = \"What did the president say about Ketanji Brown Jackson\"\n",
+        "nearest_neighbor_expression = \"{targetHits: 4}nearestNeighbor(hfembedding, query_embedding)\"\n",
+        "custom_query = {\n",
+        "    \"yql\": f\"select * from sources * where {nearest_neighbor_expression}\",\n",
+        "    \"input.query(query_embedding)\": f\"embed(hf-embedder, \\\"{query}\\\")\",\n",
+        "    \"ranking\": \"hf_similarity\",\n",
+        "    \"hits\": 4\n",
+        "}\n",
+        "results = db.similarity_search_with_score(query, custom_query=custom_query)\n",
+        "# results[0][0].metadata[\"id\"] == \"id:testapp:testapp::32\"\n",
+        "# results[0][1] ~= 0.630"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "da631d13"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "Note that the query here includes an `embed` instruction to embed the query\n",
+        "using the same model as for the documents.\n",
+        "\n",
+        "### Approximate nearest neighbor\n",
+        "\n",
+        "In all of the above examples, we've used exact nearest neighbor to\n",
+        "find results. However, for large collections of documents this is\n",
+        "not feasible as one has to scan through all documents to find the\n",
+        "best matches. To avoid this, we can use\n",
+        "[approximate nearest neighbors](https://docs.vespa.ai/en/approximate-nn-hnsw.html).\n",
+        "\n",
+        "First, we can change the embedding field to create an HNSW index:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "a333b553"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "from vespa.package import HNSW\n",
+        "\n",
+        "app_package.schema.add_fields(\n",
+        "    Field(name=\"embedding\", type=\"tensor<float>(x[384])\",\n",
+        "          indexing=[\"attribute\", \"summary\", \"index\"],\n",
+        "          ann=HNSW(distance_metric=\"angular\", max_links_per_node=16, neighbors_to_explore_at_insert=200)\n",
+        "          )\n",
+        ")\n"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "9ee955c8"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "This creates an HNSW index on the embedding data which allows for efficient\n",
+        "searching. With this set, we can easily search using ANN by setting\n",
+        "the `approximate` argument to `True`:"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "2ed1c224"
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "query = \"What did the president say about Ketanji Brown Jackson\"\n",
+        "results = db.similarity_search(query, approximate=True)\n",
+        "# results[0].metadata[\"id\"] == \"id:testapp:testapp::32\""
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "7981739a"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "This covers most of the functionality in the Vespa vector store in LangChain.\n",
+        "\n"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%% md\n"
+        }
+      },
+      "id": "24791204"
     }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "This sets up a Vespa application with a schema for each document that contains\n",
-    "two fields: `text` for holding the document text and `embedding` for holding\n",
-    "the embedding vector. The `text` field is set up to use a BM25 index for\n",
-    "efficient text retrieval, and we'll see how to use this and hybrid search a\n",
-    "bit later.\n",
-    "\n",
-    "The `embedding` field is set up with a vector of length 384 to hold the\n",
-    "embedding representation of the text. See\n",
-    "[Vespa's Tensor Guide](https://docs.vespa.ai/en/tensor-user-guide.html)\n",
-    "for more on tensors in Vespa.\n",
-    "\n",
-    "Lastly, we add a [rank profile](https://docs.vespa.ai/en/ranking.html) to\n",
-    "instruct Vespa how to order documents. Here we set this up with a\n",
-    "[nearest neighbor search](https://docs.vespa.ai/en/nearest-neighbor-search.html).\n",
-    "\n",
-    "Now we can deploy this application locally:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
+  ],
+  "metadata": {
+    "kernelspec": {
+      "display_name": "Python 3 (ipykernel)",
+      "language": "python",
+      "name": "python3"
+    },
+    "language_info": {
+      "codemirror_mode": {
+        "name": "ipython",
+        "version": 3
+      },
+      "file_extension": ".py",
+      "mimetype": "text/x-python",
+      "name": "python",
+      "nbconvert_exporter": "python",
+      "pygments_lexer": "ipython3",
+      "version": "3.10.6"
     }
-   }
   },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "id": "c10dd962",
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [],
-   "source": [
-    "from vespa.deployment import VespaDocker\n",
-    "\n",
-    "vespa_docker = VespaDocker()\n",
-    "vespa_app = vespa_docker.deploy(application_package=app_package)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "3df4ce53",
-   "metadata": {},
-   "source": [
-    "This deploys and creates a connection to a `Vespa` service. In case you\n",
-    "already have a Vespa application running, for instance in the cloud,\n",
-    "please refer to the PyVespa application for how to connect.\n",
-    "\n",
-    "#### Creating a Vespa vector store\n",
-    "\n",
-    "Now, let's load some documents:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "from langchain.document_loaders import TextLoader\n",
-    "from langchain.text_splitter import CharacterTextSplitter\n",
-    "\n",
-    "loader = TextLoader(\"../../modules/state_of_the_union.txt\")\n",
-    "documents = loader.load()\n",
-    "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
-    "docs = text_splitter.split_documents(documents)\n",
-    "\n",
-    "from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings\n",
-    "\n",
-    "embedding_function = SentenceTransformerEmbeddings(model_name=\"all-MiniLM-L6-v2\")"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "Here, we also set up local sentence embedder to transform the text to embedding\n",
-    "vectors. One could also use OpenAI embeddings, but the vector length needs to\n",
-    "be updated to `1536` to reflect the larger size of that embedding.\n",
-    "\n",
-    "To feed these to Vespa, we need to configure how the vector store should map to\n",
-    "fields in the Vespa application. Then we create the vector store directly from\n",
-    "this set of documents:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "vespa_config = dict(\n",
-    "    page_content_field=\"text\",\n",
-    "    embedding_field=\"embedding\",\n",
-    "    input_field=\"query_embedding\"\n",
-    ")\n",
-    "\n",
-    "from langchain.vectorstores import VespaStore\n",
-    "\n",
-    "db = VespaStore.from_documents(docs, embedding_function, app=vespa_app, **vespa_config)"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "This creates a Vespa vector store and feeds that set of documents to Vespa.\n",
-    "The vector store takes care of calling the embedding function for each document\n",
-    "and inserts them into the database.\n",
-    "\n",
-    "We can now query the vector store:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "7ccca1f4",
-   "metadata": {
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   },
-   "outputs": [],
-   "source": [
-    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
-    "results = db.similarity_search(query)\n",
-    "\n",
-    "print(results[0].page_content)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "1e7e34e1",
-   "metadata": {
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   },
-   "source": [
-    "This will use the embedding function given above to create a representation\n",
-    "for the query and use that to search Vespa. Note that this will use the\n",
-    "`default` ranking function, which we set up in the application package\n",
-    "above. You can use the `ranking` argument to `similarity_search` to\n",
-    "specify which ranking function to use.\n",
-    "\n",
-    "Please refer to the [pyvespa documentation](https://pyvespa.readthedocs.io/en/latest/getting-started-pyvespa.html#Query)\n",
-    "for more information.\n",
-    "\n",
-    "This covers the basic usage of the Vespa store in LangChain.\n",
-    "Now you can return the results and continue using these in LangChain.\n",
-    "\n",
-    "#### Updating documents\n",
-    "\n",
-    "An alternative to calling `from_documents`, you can create the vector\n",
-    "store directly and call `add_texts` from that. This can also be used to update\n",
-    "documents:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
-    "results = db.similarity_search(query)\n",
-    "result = results[0]\n",
-    "\n",
-    "result.page_content = \"UPDATED: \" + result.page_content\n",
-    "db.add_texts([result.page_content], [result.metadata], result.metadata[\"id\"])\n",
-    "\n",
-    "results = db.similarity_search(query)\n",
-    "print(results[0].page_content)"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "However, the `pyvespa` library contains methods to manipulate\n",
-    "content on Vespa which you can use directly.\n",
-    "\n",
-    "#### Deleting documents\n",
-    "\n",
-    "You can delete documents using the `delete` function:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "result = db.similarity_search(query)\n",
-    "# docs[0].metadata[\"id\"] == \"id:testapp:testapp::32\"\n",
-    "\n",
-    "db.delete([\"32\"])\n",
-    "result = db.similarity_search(query)\n",
-    "# docs[0].metadata[\"id\"] != \"id:testapp:testapp::32\""
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "Again, the `pyvespa` connection contains methods to delete documents as well.\n",
-    "\n",
-    "### Returning with scores\n",
-    "\n",
-    "The `similarity_search` method only returns the documents in order of\n",
-    "relevancy. To retrieve the actual scores:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "results = db.similarity_search_with_score(query)\n",
-    "result = results[0]\n",
-    "# result[1] ~= 0.463"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "This is a result of using the `\"all-MiniLM-L6-v2\"` embedding model using the\n",
-    "cosine distance function (as given by the argument `angular` in the\n",
-    "application function).\n",
-    "\n",
-    "Different embedding functions need different distance functions, and Vespa\n",
-    "needs to know which distance function to use when orderings documents.\n",
-    "Please refer to the\n",
-    "[documentation on distance functions](https://docs.vespa.ai/en/reference/schema-reference.html#distance-metric)\n",
-    "for more information.\n",
-    "\n",
-    "### As retriever\n",
-    "\n",
-    "To use this vector store as a\n",
-    "[LangChain retriever](https://python.langchain.com/docs/modules/data_connection/retrievers/)\n",
-    "simply call the `as_retriever` function, which is a standard vector store\n",
-    "method:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "db = VespaStore.from_documents(docs, embedding_function, app=vespa_app, **vespa_config)\n",
-    "retriever = db.as_retriever()\n",
-    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
-    "results = retriever.get_relevant_documents(query)\n",
-    "\n",
-    "# results[0].metadata[\"id\"] == \"id:testapp:testapp::32\""
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "This allows for more general, unstructured, retrieval from the vector store.\n",
-    "\n",
-    "### Metadata\n",
-    "\n",
-    "In the example so far, we've only used the text and the embedding for that\n",
-    "text. Documents usually contain additional information, which in LangChain\n",
-    "is referred to as metadata.\n",
-    "\n",
-    "Vespa can contain many fields with different types by adding them to the application\n",
-    "package:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "app_package.schema.add_fields(\n",
-    "    # ...\n",
-    "    Field(name=\"date\", type=\"string\", indexing=[\"attribute\", \"summary\"]),\n",
-    "    Field(name=\"rating\", type=\"int\", indexing=[\"attribute\", \"summary\"]),\n",
-    "    Field(name=\"author\", type=\"string\", indexing=[\"attribute\", \"summary\"]),\n",
-    "    # ...\n",
-    ")\n",
-    "vespa_app = vespa_docker.deploy(application_package=app_package)"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "We can add some metadata fields in the documents:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "# Add metadata\n",
-    "for i, doc in enumerate(docs):\n",
-    "    doc.metadata[\"date\"] = f\"2023-{(i % 12)+1}-{(i % 28)+1}\"\n",
-    "    doc.metadata[\"rating\"] = range(1, 6)[i % 5]\n",
-    "    doc.metadata[\"author\"] = [\"Joe Biden\", \"Unknown\"][min(i, 1)]"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "And let the Vespa vector store know about these fields:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "vespa_config.update(dict(metadata_fields=[\"date\", \"rating\", \"author\"]))"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "Now, when searching for these documents, these fields will be returned.\n",
-    "Also, these fields can be filtered on:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "db = VespaStore.from_documents(docs, embedding_function, app=vespa_app, **vespa_config)\n",
-    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
-    "results = db.similarity_search(query, filter=\"rating > 3\")\n",
-    "# results[0].metadata[\"id\"] == \"id:testapp:testapp::34\"\n",
-    "# results[0].metadata[\"author\"] == \"Unknown\""
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "### Custom query\n",
-    "\n",
-    "If the default behavior of the similarity search does not fit your\n",
-    "requirements, you can always provide your own query. Thus, you don't\n",
-    "need to provide all of the configuration to the vector store, but\n",
-    "rather just write this yourself.\n",
-    "\n",
-    "First, let's add a BM25 ranking function to our application:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "from vespa.package import FieldSet\n",
-    "\n",
-    "app_package.schema.add_field_set(FieldSet(name=\"default\", fields=[\"text\"]))\n",
-    "app_package.schema.add_rank_profile(RankProfile(name=\"bm25\", first_phase=\"bm25(text)\"))\n",
-    "vespa_app = vespa_docker.deploy(application_package=app_package)\n",
-    "db = VespaStore.from_documents(docs, embedding_function, app=vespa_app, **vespa_config)"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "Then, to perform a regular text search based on BM25:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
-    "custom_query = {\n",
-    "    \"yql\": f\"select * from sources * where userQuery()\",\n",
-    "    \"query\": query,\n",
-    "    \"type\": \"weakAnd\",\n",
-    "    \"ranking\": \"bm25\",\n",
-    "    \"hits\": 4\n",
-    "}\n",
-    "results  = db.similarity_search_with_score(query, custom_query=custom_query)\n",
-    "# results[0][0].metadata[\"id\"] == \"id:testapp:testapp::32\"\n",
-    "# results[0][1] ~= 14.384"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "All of the powerful search and query capabilities of Vespa can be used\n",
-    "by using a custom query. Please refer to the Vespa documentation on it's\n",
-    "[Query API](https://docs.vespa.ai/en/query-api.html) for more details.\n",
-    "\n",
-    "### Hybrid search\n",
-    "\n",
-    "Hybrid search means using both a classic term-based search such as\n",
-    "BM25 and a vector search and combining the results. We need to create\n",
-    "a new rank profile for hybrid search on Vespa:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "app_package.schema.add_rank_profile(\n",
-    "    RankProfile(name=\"hybrid\",\n",
-    "                first_phase=\"log(bm25(text)) + 0.5 * closeness(field, embedding)\",\n",
-    "                inputs=[(\"query(query_embedding)\", \"tensor<float>(x[384])\")]\n",
-    "                )\n",
-    ")\n",
-    "vespa_app = vespa_docker.deploy(application_package=app_package)\n",
-    "db = VespaStore.from_documents(docs, embedding_function, app=vespa_app, **vespa_config)"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "Here, we score each document as a combination of it's BM25 score and its\n",
-    "distance score. We can query using a custom query:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
-    "query_embedding = embedding_function.embed_query(query)\n",
-    "nearest_neighbor_expression = \"{targetHits: 4}nearestNeighbor(embedding, query_embedding)\"\n",
-    "custom_query = {\n",
-    "    \"yql\": f\"select * from sources * where {nearest_neighbor_expression} and userQuery()\",\n",
-    "    \"query\": query,\n",
-    "    \"type\": \"weakAnd\",\n",
-    "    \"input.query(query_embedding)\": query_embedding,\n",
-    "    \"ranking\": \"hybrid\",\n",
-    "    \"hits\": 4\n",
-    "}\n",
-    "results = db.similarity_search_with_score(query, custom_query=custom_query)\n",
-    "# results[0][0].metadata[\"id\"], \"id:testapp:testapp::32\")\n",
-    "# results[0][1] ~= 2.897"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "### Native embedders in Vespa\n",
-    "\n",
-    "Up until this point we've used an embedding function in Python to provide\n",
-    "embeddings for the texts. Vespa supports embedding function natively, so\n",
-    "you can defer this calculation in to Vespa. One benefit is the ability to use\n",
-    "GPUs when embedding documents if you have a large collections.\n",
-    "\n",
-    "Please refer to [Vespa embeddings](https://docs.vespa.ai/en/embedding.html)\n",
-    "for more information.\n",
-    "\n",
-    "First, we need to modify our application package:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "from vespa.package import Component, Parameter\n",
-    "\n",
-    "app_package.components = [\n",
-    "    Component(id=\"hf-embedder\", type=\"hugging-face-embedder\",\n",
-    "        parameters=[\n",
-    "            Parameter(\"transformer-model\", {\"path\": \"...\"}),\n",
-    "            Parameter(\"tokenizer-model\", {\"url\": \"...\"}),\n",
-    "        ]\n",
-    "    )\n",
-    "]\n",
-    "Field(name=\"hfembedding\", type=\"tensor<float>(x[384])\",\n",
-    "      is_document_field=False,\n",
-    "      indexing=[\"input text\", \"embed hf-embedder\", \"attribute\", \"summary\"],\n",
-    "      attribute=[f\"distance-metric: angular\"],\n",
-    "      )\n",
-    "app_package.schema.add_rank_profile(\n",
-    "    RankProfile(name=\"hf_similarity\",\n",
-    "                first_phase=\"closeness(field, hfembedding)\",\n",
-    "                inputs=[(\"query(query_embedding)\", \"tensor<float>(x[384])\")]\n",
-    "                )\n",
-    ")"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "Please refer to the embeddings documentation on adding embedder models\n",
-    "and tokenizers to the application. Note that the `hfembedding` field\n",
-    "includes instructions for embedding using the `hf-embedder`.\n",
-    "\n",
-    "Now we can query with a custom query:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
-    "nearest_neighbor_expression = \"{targetHits: 4}nearestNeighbor(internalembedding, query_embedding)\"\n",
-    "custom_query = {\n",
-    "    \"yql\": f\"select * from sources * where {nearest_neighbor_expression}\",\n",
-    "    \"input.query(query_embedding)\": f\"embed(hf-embedder, \\\"{query}\\\")\",\n",
-    "    \"ranking\": \"internal_similarity\",\n",
-    "    \"hits\": 4\n",
-    "}\n",
-    "results = db.similarity_search_with_score(query, custom_query=custom_query)\n",
-    "# results[0][0].metadata[\"id\"], \"id:testapp:testapp::32\")\n",
-    "# results[0][1] ~= 0.630"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "Note that the query here includes an `embed` instruction to embed the query\n",
-    "using the same model as for the documents.\n",
-    "\n",
-    "### Approximate nearest neighbor\n",
-    "\n",
-    "In all of the above examples, we've used exact nearest neighbor to\n",
-    "find results. However, for large collections of documents this is\n",
-    "not feasible as one has to scan through all documents to find the\n",
-    "best matches. To avoid this, we can use\n",
-    "[approximate nearest neighbors](https://docs.vespa.ai/en/approximate-nn-hnsw.html).\n",
-    "\n",
-    "First, we can change the embedding field to create a HNSW index:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "from vespa.package import HNSW\n",
-    "\n",
-    "app_package.schema.add_fields(\n",
-    "    Field(name=\"embedding\", type=\"tensor<float>(x[384])\",\n",
-    "          indexing=[\"attribute\", \"summary\", \"index\"],\n",
-    "          ann=HNSW(distance_metric=\"angular\", max_links_per_node=16, neighbors_to_explore_at_insert=200)\n",
-    "          )\n",
-    ")\n"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "This creates a HNSW index on the embedding data which allows for efficient\n",
-    "searching. With this set, we can easily search using ANN by setting\n",
-    "the `approximate` argument to `True`:"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
-    "results = db.similarity_search(query, approximate=True)\n",
-    "# results[0][0].metadata[\"id\"], \"id:testapp:testapp::32\")"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "This covers most of the functionality in the Vespa vector store in LangChain.\n",
-    "\n"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.10.6"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 5
+  "nbformat": 4,
+  "nbformat_minor": 5
 }
\ No newline at end of file
diff --git a/docs/extras/modules/data_connection/retrievers/self_query/opensearch_self_query.ipynb b/docs/extras/modules/data_connection/retrievers/self_query/opensearch_self_query.ipynb
index 947960c2170..0045176faa5 100644
--- a/docs/extras/modules/data_connection/retrievers/self_query/opensearch_self_query.ipynb
+++ b/docs/extras/modules/data_connection/retrievers/self_query/opensearch_self_query.ipynb
@@ -1,439 +1,440 @@
 {
- "cells": [
-  {
-   "cell_type": "markdown",
-   "id": "13afcae7",
-   "metadata": {},
-   "source": [
-    "# OpenSearch\n",
-    "\n",
-    "> [OpenSearch](https://opensearch.org/) is a scalable, flexible, and extensible open-source software suite for search, analytics, and observability applications licensed under Apache 2.0. `OpenSearch` is a distributed search and analytics engine based on `Apache Lucene`.\n",
-    "\n",
-    "In this notebook, we'll demo the `SelfQueryRetriever` with an `OpenSearch` vector store."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "68e75fb9",
-   "metadata": {},
-   "source": [
-    "## Creating an OpenSearch vector store\n",
-    "\n",
-    "First, we'll want to create an `OpenSearch` vector store and seed it with some data. We've created a small demo set of documents that contain summaries of movies.\n",
-    "\n",
-    "**Note:** The self-query retriever requires you to have `lark` installed (`pip install lark`). We also need the `opensearch-py` package."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "outputs": [],
-   "source": [
-    "!pip install lark opensearch-py"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "id": "cb4a5787",
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
+  "cells": [
     {
-     "name": "stdin",
-     "output_type": "stream",
-     "text": [
-      "OpenAI API Key: ········\n"
-     ]
-    }
-   ],
-   "source": [
-    "from langchain.schema import Document\n",
-    "from langchain.embeddings.openai import OpenAIEmbeddings\n",
-    "from langchain.vectorstores import OpenSearchVectorSearch\n",
-    "import os\n",
-    "import getpass\n",
-    "\n",
-    "os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")\n",
-    "\n",
-    "embeddings = OpenAIEmbeddings()"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "id": "bcbe04d9",
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [],
-   "source": [
-    "docs = [\n",
-    "    Document(\n",
-    "        page_content=\"A bunch of scientists bring back dinosaurs and mayhem breaks loose\",\n",
-    "        metadata={\"year\": 1993, \"rating\": 7.7, \"genre\": \"science fiction\"},\n",
-    "    ),\n",
-    "    Document(\n",
-    "        page_content=\"Leo DiCaprio gets lost in a dream within a dream within a dream within a ...\",\n",
-    "        metadata={\"year\": 2010, \"director\": \"Christopher Nolan\", \"rating\": 8.2},\n",
-    "    ),\n",
-    "    Document(\n",
-    "        page_content=\"A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea\",\n",
-    "        metadata={\"year\": 2006, \"director\": \"Satoshi Kon\", \"rating\": 8.6},\n",
-    "    ),\n",
-    "    Document(\n",
-    "        page_content=\"A bunch of normal-sized women are supremely wholesome and some men pine after them\",\n",
-    "        metadata={\"year\": 2019, \"director\": \"Greta Gerwig\", \"rating\": 8.3},\n",
-    "    ),\n",
-    "    Document(\n",
-    "        page_content=\"Toys come alive and have a blast doing so\",\n",
-    "        metadata={\"year\": 1995, \"genre\": \"animated\"},\n",
-    "    ),\n",
-    "    Document(\n",
-    "        page_content=\"Three men walk into the Zone, three men walk out of the Zone\",\n",
-    "        metadata={\n",
-    "            \"year\": 1979,\n",
-    "            \"rating\": 9.9,\n",
-    "            \"director\": \"Andrei Tarkovsky\",\n",
-    "            \"genre\": \"science fiction\",\n",
-    "        },\n",
-    "    ),\n",
-    "]\n",
-    "vectorstore = OpenSearchVectorSearch.from_documents(\n",
-    "    docs, embeddings, index_name=\"opensearch-self-query-demo\", opensearch_url=\"http://localhost:9200\"\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "5ecaab6d",
-   "metadata": {},
-   "source": [
-    "## Creating our self-querying retriever\n",
-    "Now we can instantiate our retriever. To do this we'll need to provide some information upfront about the metadata fields that our documents support and a short description of the document contents."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "id": "86e34dbf",
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [],
-   "source": [
-    "from langchain.llms import OpenAI\n",
-    "from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
-    "from langchain.chains.query_constructor.base import AttributeInfo\n",
-    "\n",
-    "metadata_field_info = [\n",
-    "    AttributeInfo(\n",
-    "        name=\"genre\",\n",
-    "        description=\"The genre of the movie\",\n",
-    "        type=\"string or list[string]\",\n",
-    "    ),\n",
-    "    AttributeInfo(\n",
-    "        name=\"year\",\n",
-    "        description=\"The year the movie was released\",\n",
-    "        type=\"integer\",\n",
-    "    ),\n",
-    "    AttributeInfo(\n",
-    "        name=\"director\",\n",
-    "        description=\"The name of the movie director\",\n",
-    "        type=\"string\",\n",
-    "    ),\n",
-    "    AttributeInfo(\n",
-    "        name=\"rating\", description=\"A 1-10 rating for the movie\", type=\"float\"\n",
-    "    ),\n",
-    "]\n",
-    "document_content_description = \"Brief summary of a movie\"\n",
-    "llm = OpenAI(temperature=0)\n",
-    "retriever = SelfQueryRetriever.from_llm(\n",
-    "    llm, vectorstore, document_content_description, metadata_field_info, verbose=True\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "ea9df8d4",
-   "metadata": {},
-   "source": [
-    "## Testing it out\n",
-    "And now we can try actually using our retriever!"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 10,
-   "id": "38a126e9",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "query='dinosaur' filter=None limit=None\n"
-     ]
+      "cell_type": "markdown",
+      "id": "13afcae7",
+      "metadata": {},
+      "source": [
+        "# OpenSearch\n",
+        "\n",
+        "> [OpenSearch](https://opensearch.org/) is a scalable, flexible, and extensible open-source software suite for search, analytics, and observability applications licensed under Apache 2.0. `OpenSearch` is a distributed search and analytics engine based on `Apache Lucene`.\n",
+        "\n",
+        "In this notebook, we'll demo the `SelfQueryRetriever` with an `OpenSearch` vector store."
+      ]
     },
     {
-     "data": {
-      "text/plain": [
-       "[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'year': 1993, 'rating': 7.7, 'genre': 'science fiction'}),\n",
-       " Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'}),\n",
-       " Document(page_content='Leo DiCaprio gets lost in a dream within a dream within a dream within a ...', metadata={'year': 2010, 'director': 'Christopher Nolan', 'rating': 8.2}),\n",
-       " Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'rating': 9.9, 'director': 'Andrei Tarkovsky', 'genre': 'science fiction'})]"
+      "cell_type": "markdown",
+      "id": "68e75fb9",
+      "metadata": {},
+      "source": [
+        "## Creating an OpenSearch vector store\n",
+        "\n",
+        "First, we'll want to create an `OpenSearch` vector store and seed it with some data. We've created a small demo set of documents that contain summaries of movies.\n",
+        "\n",
+        "**Note:** The self-query retriever requires you to have `lark` installed (`pip install lark`). We also need the `opensearch-py` package."
       ]
-     },
-     "execution_count": 10,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "# This example only specifies a relevant query\n",
-    "retriever.get_relevant_documents(\"What are some movies about dinosaurs\")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 11,
-   "id": "60bf0074-e65e-4558-a4f2-8190f3e4e2f9",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "query=' ' filter=Comparison(comparator=<Comparator.GT: 'gt'>, attribute='rating', value=8.5) limit=None\n"
-     ]
     },
     {
-     "data": {
-      "text/plain": [
-       "[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'rating': 9.9, 'director': 'Andrei Tarkovsky', 'genre': 'science fiction'}),\n",
-       " Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'year': 2006, 'director': 'Satoshi Kon', 'rating': 8.6})]"
-      ]
-     },
-     "execution_count": 11,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "# This example only specifies a filter\n",
-    "retriever.get_relevant_documents(\"I want to watch a movie rated higher than 8.5\")\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 12,
-   "id": "b19d4da0",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "query='women' filter=Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='director', value='Greta Gerwig') limit=None\n"
-     ]
+      "cell_type": "code",
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "!pip install lark opensearch-py"
+      ],
+      "metadata": {
+        "collapsed": false,
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "id": "6078a74d"
     },
     {
-     "data": {
-      "text/plain": [
-       "[Document(page_content='A bunch of normal-sized women are supremely wholesome and some men pine after them', metadata={'year': 2019, 'director': 'Greta Gerwig', 'rating': 8.3})]"
+      "cell_type": "code",
+      "execution_count": 3,
+      "id": "cb4a5787",
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [
+        {
+          "name": "stdin",
+          "output_type": "stream",
+          "text": [
+            "OpenAI API Key: \u00b7\u00b7\u00b7\u00b7\u00b7\u00b7\u00b7\u00b7\n"
+          ]
+        }
+      ],
+      "source": [
+        "from langchain.schema import Document\n",
+        "from langchain.embeddings.openai import OpenAIEmbeddings\n",
+        "from langchain.vectorstores import OpenSearchVectorSearch\n",
+        "import os\n",
+        "import getpass\n",
+        "\n",
+        "os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")\n",
+        "\n",
+        "embeddings = OpenAIEmbeddings()"
       ]
-     },
-     "execution_count": 12,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "# This example specifies a query and a filter\n",
-    "retriever.get_relevant_documents(\"Has Greta Gerwig directed any movies about women\")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 13,
-   "id": "a59f946b-78a1-4d3e-9942-63834c7d7589",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "query=' ' filter=Operation(operator=<Operator.AND: 'and'>, arguments=[Comparison(comparator=<Comparator.GTE: 'gte'>, attribute='rating', value=8.5), Comparison(comparator=<Comparator.CONTAIN: 'contain'>, attribute='genre', value='science fiction')]) limit=None\n"
-     ]
     },
     {
-     "data": {
-      "text/plain": [
-       "[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'rating': 9.9, 'director': 'Andrei Tarkovsky', 'genre': 'science fiction'})]"
+      "cell_type": "code",
+      "execution_count": 8,
+      "id": "bcbe04d9",
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "docs = [\n",
+        "    Document(\n",
+        "        page_content=\"A bunch of scientists bring back dinosaurs and mayhem breaks loose\",\n",
+        "        metadata={\"year\": 1993, \"rating\": 7.7, \"genre\": \"science fiction\"},\n",
+        "    ),\n",
+        "    Document(\n",
+        "        page_content=\"Leo DiCaprio gets lost in a dream within a dream within a dream within a ...\",\n",
+        "        metadata={\"year\": 2010, \"director\": \"Christopher Nolan\", \"rating\": 8.2},\n",
+        "    ),\n",
+        "    Document(\n",
+        "        page_content=\"A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea\",\n",
+        "        metadata={\"year\": 2006, \"director\": \"Satoshi Kon\", \"rating\": 8.6},\n",
+        "    ),\n",
+        "    Document(\n",
+        "        page_content=\"A bunch of normal-sized women are supremely wholesome and some men pine after them\",\n",
+        "        metadata={\"year\": 2019, \"director\": \"Greta Gerwig\", \"rating\": 8.3},\n",
+        "    ),\n",
+        "    Document(\n",
+        "        page_content=\"Toys come alive and have a blast doing so\",\n",
+        "        metadata={\"year\": 1995, \"genre\": \"animated\"},\n",
+        "    ),\n",
+        "    Document(\n",
+        "        page_content=\"Three men walk into the Zone, three men walk out of the Zone\",\n",
+        "        metadata={\n",
+        "            \"year\": 1979,\n",
+        "            \"rating\": 9.9,\n",
+        "            \"director\": \"Andrei Tarkovsky\",\n",
+        "            \"genre\": \"science fiction\",\n",
+        "        },\n",
+        "    ),\n",
+        "]\n",
+        "vectorstore = OpenSearchVectorSearch.from_documents(\n",
+        "    docs, embeddings, index_name=\"opensearch-self-query-demo\", opensearch_url=\"http://localhost:9200\"\n",
+        ")"
       ]
-     },
-     "execution_count": 13,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "# This example specifies a composite filter\n",
-    "retriever.get_relevant_documents(\"What's a highly rated (above 8.5) science fiction film?\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "39bd1de1-b9fe-4a98-89da-58d8a7a6ae51",
-   "metadata": {},
-   "source": [
-    "## Filter k\n",
-    "\n",
-    "We can also use the self query retriever to specify `k`: the number of documents to fetch.\n",
-    "\n",
-    "We can do this by passing `enable_limit=True` to the constructor."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 14,
-   "id": "bff36b88-b506-4877-9c63-e5a1a8d78e64",
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [],
-   "source": [
-    "retriever = SelfQueryRetriever.from_llm(\n",
-    "    llm,\n",
-    "    vectorstore,\n",
-    "    document_content_description,\n",
-    "    metadata_field_info,\n",
-    "    enable_limit=True,\n",
-    "    verbose=True,\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 15,
-   "id": "2758d229-4f97-499c-819f-888acaf8ee10",
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "query='dinosaur' filter=None limit=2\n"
-     ]
     },
     {
-     "data": {
-      "text/plain": [
-       "[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'year': 1993, 'rating': 7.7, 'genre': 'science fiction'}),\n",
-       " Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'})]"
+      "cell_type": "markdown",
+      "id": "5ecaab6d",
+      "metadata": {},
+      "source": [
+        "## Creating our self-querying retriever\n",
+        "Now we can instantiate our retriever. To do this, we'll need to provide some information up front about the metadata fields that our documents support, along with a short description of the document contents."
       ]
-     },
-     "execution_count": 15,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "# This example only specifies a relevant query\n",
-    "retriever.get_relevant_documents(\"what are two movies about dinosaurs\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "61a10294",
-   "metadata": {},
-   "source": [
-    "## Complex queries in Action!\n",
-    "We've tried out some simple queries, but what about more complex ones? Let's try out a few more complex queries that utilize the full power of OpenSearch."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 16,
-   "id": "e460da93",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "query='animated toys' filter=Operation(operator=<Operator.AND: 'and'>, arguments=[Operation(operator=<Operator.OR: 'or'>, arguments=[Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='genre', value='animated'), Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='genre', value='comedy')]), Comparison(comparator=<Comparator.GTE: 'gte'>, attribute='year', value=1990)]) limit=None\n"
-     ]
     },
     {
-     "data": {
-      "text/plain": [
-       "[Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'})]"
+      "cell_type": "code",
+      "execution_count": 9,
+      "id": "86e34dbf",
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "from langchain.llms import OpenAI\n",
+        "from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
+        "from langchain.chains.query_constructor.base import AttributeInfo\n",
+        "\n",
+        "metadata_field_info = [\n",
+        "    AttributeInfo(\n",
+        "        name=\"genre\",\n",
+        "        description=\"The genre of the movie\",\n",
+        "        type=\"string or list[string]\",\n",
+        "    ),\n",
+        "    AttributeInfo(\n",
+        "        name=\"year\",\n",
+        "        description=\"The year the movie was released\",\n",
+        "        type=\"integer\",\n",
+        "    ),\n",
+        "    AttributeInfo(\n",
+        "        name=\"director\",\n",
+        "        description=\"The name of the movie director\",\n",
+        "        type=\"string\",\n",
+        "    ),\n",
+        "    AttributeInfo(\n",
+        "        name=\"rating\", description=\"A 1-10 rating for the movie\", type=\"float\"\n",
+        "    ),\n",
+        "]\n",
+        "document_content_description = \"Brief summary of a movie\"\n",
+        "llm = OpenAI(temperature=0)\n",
+        "retriever = SelfQueryRetriever.from_llm(\n",
+        "    llm, vectorstore, document_content_description, metadata_field_info, verbose=True\n",
+        ")"
       ]
-     },
-     "execution_count": 16,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "retriever.get_relevant_documents(\"what animated or comedy movies have been released in the last 30 years about animated toys?\")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 17,
-   "id": "0851fc42",
-   "metadata": {
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   },
-   "outputs": [
+    },
     {
-     "data": {
-      "text/plain": [
-       "{'acknowledged': True}"
+      "cell_type": "markdown",
+      "id": "ea9df8d4",
+      "metadata": {},
+      "source": [
+        "## Testing it out\n",
+        "And now we can try actually using our retriever!"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 10,
+      "id": "38a126e9",
+      "metadata": {},
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "query='dinosaur' filter=None limit=None\n"
+          ]
+        },
+        {
+          "data": {
+            "text/plain": [
+              "[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'year': 1993, 'rating': 7.7, 'genre': 'science fiction'}),\n",
+              " Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'}),\n",
+              " Document(page_content='Leo DiCaprio gets lost in a dream within a dream within a dream within a ...', metadata={'year': 2010, 'director': 'Christopher Nolan', 'rating': 8.2}),\n",
+              " Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'rating': 9.9, 'director': 'Andrei Tarkovsky', 'genre': 'science fiction'})]"
+            ]
+          },
+          "execution_count": 10,
+          "metadata": {},
+          "output_type": "execute_result"
+        }
+      ],
+      "source": [
+        "# This example only specifies a relevant query\n",
+        "retriever.get_relevant_documents(\"What are some movies about dinosaurs\")"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 11,
+      "id": "60bf0074-e65e-4558-a4f2-8190f3e4e2f9",
+      "metadata": {},
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "query=' ' filter=Comparison(comparator=<Comparator.GT: 'gt'>, attribute='rating', value=8.5) limit=None\n"
+          ]
+        },
+        {
+          "data": {
+            "text/plain": [
+              "[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'rating': 9.9, 'director': 'Andrei Tarkovsky', 'genre': 'science fiction'}),\n",
+              " Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'year': 2006, 'director': 'Satoshi Kon', 'rating': 8.6})]"
+            ]
+          },
+          "execution_count": 11,
+          "metadata": {},
+          "output_type": "execute_result"
+        }
+      ],
+      "source": [
+        "# This example only specifies a filter\n",
+        "retriever.get_relevant_documents(\"I want to watch a movie rated higher than 8.5\")\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 12,
+      "id": "b19d4da0",
+      "metadata": {},
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "query='women' filter=Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='director', value='Greta Gerwig') limit=None\n"
+          ]
+        },
+        {
+          "data": {
+            "text/plain": [
+              "[Document(page_content='A bunch of normal-sized women are supremely wholesome and some men pine after them', metadata={'year': 2019, 'director': 'Greta Gerwig', 'rating': 8.3})]"
+            ]
+          },
+          "execution_count": 12,
+          "metadata": {},
+          "output_type": "execute_result"
+        }
+      ],
+      "source": [
+        "# This example specifies a query and a filter\n",
+        "retriever.get_relevant_documents(\"Has Greta Gerwig directed any movies about women\")"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 13,
+      "id": "a59f946b-78a1-4d3e-9942-63834c7d7589",
+      "metadata": {},
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "query=' ' filter=Operation(operator=<Operator.AND: 'and'>, arguments=[Comparison(comparator=<Comparator.GTE: 'gte'>, attribute='rating', value=8.5), Comparison(comparator=<Comparator.CONTAIN: 'contain'>, attribute='genre', value='science fiction')]) limit=None\n"
+          ]
+        },
+        {
+          "data": {
+            "text/plain": [
+              "[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'rating': 9.9, 'director': 'Andrei Tarkovsky', 'genre': 'science fiction'})]"
+            ]
+          },
+          "execution_count": 13,
+          "metadata": {},
+          "output_type": "execute_result"
+        }
+      ],
+      "source": [
+        "# This example specifies a composite filter\n",
+        "retriever.get_relevant_documents(\"What's a highly rated (above 8.5) science fiction film?\")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "id": "39bd1de1-b9fe-4a98-89da-58d8a7a6ae51",
+      "metadata": {},
+      "source": [
+        "## Filter k\n",
+        "\n",
+        "We can also use the self-query retriever to specify `k`: the number of documents to fetch.\n",
+        "\n",
+        "We can do this by passing `enable_limit=True` to the constructor."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 14,
+      "id": "bff36b88-b506-4877-9c63-e5a1a8d78e64",
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [],
+      "source": [
+        "retriever = SelfQueryRetriever.from_llm(\n",
+        "    llm,\n",
+        "    vectorstore,\n",
+        "    document_content_description,\n",
+        "    metadata_field_info,\n",
+        "    enable_limit=True,\n",
+        "    verbose=True,\n",
+        ")"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 15,
+      "id": "2758d229-4f97-499c-819f-888acaf8ee10",
+      "metadata": {
+        "tags": []
+      },
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "query='dinosaur' filter=None limit=2\n"
+          ]
+        },
+        {
+          "data": {
+            "text/plain": [
+              "[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'year': 1993, 'rating': 7.7, 'genre': 'science fiction'}),\n",
+              " Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'})]"
+            ]
+          },
+          "execution_count": 15,
+          "metadata": {},
+          "output_type": "execute_result"
+        }
+      ],
+      "source": [
+        "# This example specifies a relevant query and a limit\n",
+        "retriever.get_relevant_documents(\"what are two movies about dinosaurs\")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "id": "61a10294",
+      "metadata": {},
+      "source": [
+        "## Complex queries in action!\n",
+        "We've tried out some simple queries, but what about more complex ones? Let's try a query that combines a semantic search term with nested `and`/`or` metadata filters in OpenSearch."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 16,
+      "id": "e460da93",
+      "metadata": {},
+      "outputs": [
+        {
+          "name": "stdout",
+          "output_type": "stream",
+          "text": [
+            "query='animated toys' filter=Operation(operator=<Operator.AND: 'and'>, arguments=[Operation(operator=<Operator.OR: 'or'>, arguments=[Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='genre', value='animated'), Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='genre', value='comedy')]), Comparison(comparator=<Comparator.GTE: 'gte'>, attribute='year', value=1990)]) limit=None\n"
+          ]
+        },
+        {
+          "data": {
+            "text/plain": [
+              "[Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'})]"
+            ]
+          },
+          "execution_count": 16,
+          "metadata": {},
+          "output_type": "execute_result"
+        }
+      ],
+      "source": [
+        "retriever.get_relevant_documents(\"what animated or comedy movies have been released in the last 30 years about animated toys?\")"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 17,
+      "id": "0851fc42",
+      "metadata": {
+        "pycharm": {
+          "name": "#%%\n"
+        }
+      },
+      "outputs": [
+        {
+          "data": {
+            "text/plain": [
+              "{'acknowledged': True}"
+            ]
+          },
+          "execution_count": 17,
+          "metadata": {},
+          "output_type": "execute_result"
+        }
+      ],
+      "source": [
+        "vectorstore.client.indices.delete(index=\"opensearch-self-query-demo\")\n"
       ]
-     },
-     "execution_count": 17,
-     "metadata": {},
-     "output_type": "execute_result"
     }
-   ],
-   "source": [
-    "vectorstore.client.indices.delete(index=\"opensearch-self-query-demo\")\n"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
+  ],
+  "metadata": {
+    "kernelspec": {
+      "display_name": "Python 3 (ipykernel)",
+      "language": "python",
+      "name": "python3"
+    },
+    "language_info": {
+      "codemirror_mode": {
+        "name": "ipython",
+        "version": 3
+      },
+      "file_extension": ".py",
+      "mimetype": "text/x-python",
+      "name": "python",
+      "nbconvert_exporter": "python",
+      "pygments_lexer": "ipython3",
+      "version": "3.9.18"
+    }
   },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.9.18"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 5
+  "nbformat": 4,
+  "nbformat_minor": 5
 }
\ No newline at end of file
diff --git a/docs/snippets/modules/data_connection/document_transformers/text_splitters/code_splitter.mdx b/docs/snippets/modules/data_connection/document_transformers/text_splitters/code_splitter.mdx
index 0a6135e3033..2087fc9e4ca 100644
--- a/docs/snippets/modules/data_connection/document_transformers/text_splitters/code_splitter.mdx
+++ b/docs/snippets/modules/data_connection/document_transformers/text_splitters/code_splitter.mdx
@@ -31,7 +31,8 @@ from langchain.text_splitter import (
      'markdown',
      'latex',
      'html',
-     'sol',]
+     'sol',
+     'csharp']
 ```
 
 </CodeOutputBlock>
@@ -342,3 +343,72 @@ sol_docs
  ```
 
  </CodeOutputBlock>
+
+
+## C#
+Here's an example using the C# text splitter:
+
+```csharp
+using System;
+class Program
+{
+    static void Main()
+    {
+        int age = 30; // Change the age value as needed
+
+        // Categorize the age without any console output
+        if (age < 18)
+        {
+            // Age is under 18
+        }
+        else if (age >= 18 && age < 65)
+        {
+            // Age is an adult
+        }
+        else
+        {
+            // Age is a senior citizen
+        }
+    }
+}
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    [Document(page_content='using System;', metadata={}),
+     Document(page_content='class Program\n{', metadata={}),
+     Document(page_content='static void', metadata={}),
+     Document(page_content='Main()', metadata={}),
+     Document(page_content='{', metadata={}),
+     Document(page_content='int age', metadata={}),
+     Document(page_content='= 30; // Change', metadata={}),
+     Document(page_content='the age value', metadata={}),
+     Document(page_content='as needed', metadata={}),
+     Document(page_content='//', metadata={}),
+     Document(page_content='Categorize the', metadata={}),
+     Document(page_content='age without any', metadata={}),
+     Document(page_content='console output', metadata={}),
+     Document(page_content='if (age', metadata={}),
+     Document(page_content='< 18)', metadata={}),
+     Document(page_content='{', metadata={}),
+     Document(page_content='//', metadata={}),
+     Document(page_content='Age is under 18', metadata={}),
+     Document(page_content='}', metadata={}),
+     Document(page_content='else if', metadata={}),
+     Document(page_content='(age >= 18 &&', metadata={}),
+     Document(page_content='age < 65)', metadata={}),
+     Document(page_content='{', metadata={}),
+     Document(page_content='//', metadata={}),
+     Document(page_content='Age is an adult', metadata={}),
+     Document(page_content='}', metadata={}),
+     Document(page_content='else', metadata={}),
+     Document(page_content='{', metadata={}),
+     Document(page_content='//', metadata={}),
+     Document(page_content='Age is a senior', metadata={}),
+     Document(page_content='citizen', metadata={}),
+     Document(page_content='}\n    }', metadata={}),
+     Document(page_content='}', metadata={})]
+ ```
+
+ </CodeOutputBlock>