Compare commits

...

21 Commits

Author SHA1 Message Date
William FH
db37d6b7a3 Update Notebook Image (#18470) 2024-03-03 17:29:54 -08:00
William Fu-Hinthorn
2024d574f9 merge 2024-03-03 17:29:51 -08:00
Scott Nath
9dde6fea9e community: Add you.com tool, add async to retriever, add async testing, add You tool doc (#18032)
- **Description:** finishes adding the you.com functionality including:
    - add async functions to utility and retriever
    - add the You.com Tool
    - add async testing for utility, retriever, and tool
    - add a tool integration notebook page
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** @scottnath
2024-03-03 17:29:32 -08:00
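A minimal sketch of using the You.com retriever this PR extends (assuming a You.com API key is available in the `YDC_API_KEY` environment variable; the query string is illustrative):

```python
# Hypothetical usage sketch; requires a You.com API key in YDC_API_KEY.
import os

from langchain_community.retrievers.you import YouRetriever

os.environ["YDC_API_KEY"] = "..."  # placeholder, not a real key

retriever = YouRetriever()
docs = retriever.invoke("what is the latest langchain release")
print(docs[0].page_content)
```

The async support added here would be exercised the same way via `await retriever.ainvoke(...)`.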
mackong
bd223572ef langchain[patch]: add tools renderer for various non-openai agents (#18307)
- **Description:** add tools_renderer for various non-OpenAI agents, so that
tools can be rendered in different ways for your LLM.
  - **Issue:** N/A
  - **Dependencies:** N/A

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-03-03 17:29:32 -08:00
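A rough sketch of what the new hook enables (assuming the `tools_renderer` keyword is exposed on `create_react_agent` as this PR describes; the `word_count` tool, `render_with_args` renderer, and prompt below are invented for illustration):

```python
# Hypothetical example: render each tool with its argument schema instead of
# the default name-plus-description rendering.
from typing import List

from langchain.agents import create_react_agent
from langchain_community.llms import Ollama
from langchain_core.prompts import PromptTemplate
from langchain_core.tools import BaseTool, tool


@tool
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())


def render_with_args(tools: List[BaseTool]) -> str:
    # Include each tool's argument schema, not just its name and description.
    return "\n".join(f"{t.name}: {t.description} (args: {t.args})" for t in tools)


prompt = PromptTemplate.from_template(
    "Answer the question using these tools:\n{tools}\n"
    "Tool names: {tool_names}\n"
    "Question: {input}\n{agent_scratchpad}"
)

llm = Ollama(model="llama2")  # any non-OpenAI model works here
agent = create_react_agent(llm, [word_count], prompt, tools_renderer=render_with_args)
```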
Harrison Chase
e7b0b93cf6 improve query analysis docs (#18426) 2024-03-03 17:29:32 -08:00
William De Vena
4f69212123 nvidia-trt[patch]: Invoke callback prior to yielding token (#18446)
## PR title
nvidia-trt[patch]: Invoke callback prior to yielding

## PR message
- Description: Invoke on_llm_new_token callback prior to yielding token
in
_stream method.
- Issue: https://github.com/langchain-ai/langchain/issues/16913
- Dependencies: None
2024-03-03 17:29:32 -08:00
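The same one-line reordering recurs in the community PRs below. As a self-contained sketch of the pattern (the `EchoLLM` class is invented purely for illustration), the callback fires before each chunk is yielded from `_stream`:

```python
# Minimal custom LLM illustrating "invoke callback prior to yielding token".
from typing import Any, Iterator, List, Optional

from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models import LLM
from langchain_core.outputs import GenerationChunk


class EchoLLM(LLM):
    """Toy LLM that streams the prompt back word by word."""

    @property
    def _llm_type(self) -> str:
        return "echo"

    def _call(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> str:
        return prompt

    def _stream(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> Iterator[GenerationChunk]:
        for word in prompt.split():
            chunk = GenerationChunk(text=word + " ")
            # Fire the callback first, so handlers see the token even if the
            # consumer stops iterating before the generator resumes.
            if run_manager:
                run_manager.on_llm_new_token(chunk.text, chunk=chunk)
            yield chunk
```

With this ordering, `EchoLLM().stream("hello world")` emits `on_llm_new_token` for each chunk as it is produced.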
William De Vena
29b3965b60 community[patch]: Invoke callback prior to yielding token (#18447)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
Description: Invoke callback prior to yielding token in _stream method
in llms/vertexai.
Issue: https://github.com/langchain-ai/langchain/issues/16913
Dependencies: None
2024-03-03 17:29:32 -08:00
William De Vena
49f86159d3 community[patch]: Invoke callback prior to yielding token (#18448)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream method
in llms/tongyi.
- Issue: https://github.com/langchain-ai/langchain/issues/16913
- Dependencies: None
2024-03-03 17:29:32 -08:00
William De Vena
069b877d23 community[patch]: Invoke callback prior to yielding token (#18449)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream method
in chat_models/perplexity.
- Issue: https://github.com/langchain-ai/langchain/issues/16913
- Dependencies: None
2024-03-03 17:29:32 -08:00
William De Vena
fdca1bbe3f community[patch]: Invoke callback prior to yielding token (#18452)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream and
_astream methods in llms/anthropic.
- Issue: https://github.com/langchain-ai/langchain/issues/16913
- Dependencies: None
2024-03-03 17:29:32 -08:00
William De Vena
35ba0b7e68 community[patch]: Invoke callback prior to yielding token (#18454)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream and
_astream methods in llms/baidu_qianfan_endpoint.
- Issue: https://github.com/langchain-ai/langchain/issues/16913
- Dependencies: None
2024-03-03 17:29:32 -08:00
Aayush Kataria
a6f47db27b community[minor]: Adding Azure Cosmos Mongo vCore Vector DB Cache (#16856)
Description:

This pull request introduces several enhancements for Azure Cosmos
Vector DB, primarily focused on improving caching and search
capabilities using Azure Cosmos MongoDB vCore Vector DB. Here's a
summary of the changes:

- **AzureCosmosDBSemanticCache**: Added a new cache implementation
called AzureCosmosDBSemanticCache, which utilizes Azure Cosmos MongoDB
vCore Vector DB for efficient caching of semantic data. Added
comprehensive test cases for AzureCosmosDBSemanticCache to ensure its
correctness and robustness. These tests cover various scenarios and edge
cases to validate the cache's behavior.
- **HNSW Vector Search**: Added HNSW vector search functionality in the
CosmosDB Vector Search module. This enhancement enables more efficient
and accurate vector searches by utilizing the HNSW (Hierarchical
Navigable Small World) algorithm. Added corresponding test cases to
validate the HNSW vector search functionality in both
AzureCosmosDBSemanticCache and AzureCosmosDBVectorSearch. These tests
ensure the correctness and performance of the HNSW search algorithm.
- **LLM Caching Notebook** - The notebook now includes a comprehensive
example showcasing the usage of the AzureCosmosDBSemanticCache. This
example highlights how the cache can be employed to efficiently store
and retrieve semantic data. Additionally, the example provides default
values for all parameters used within the AzureCosmosDBSemanticCache,
ensuring clarity and ease of understanding for users who are new to the
cache implementation.
 
 @hwchase17,@baskaryan, @eyurtsev,
2024-03-03 17:29:32 -08:00
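Condensed from the LLM-caching notebook cell added further down in this diff (the connection string is a placeholder; parameter values mirror the notebook's defaults):

```python
# Enable semantic LLM caching backed by Azure Cosmos DB MongoDB vCore.
from langchain.cache import AzureCosmosDBSemanticCache
from langchain.globals import set_llm_cache
from langchain_community.vectorstores.azure_cosmos_db import (
    CosmosDBSimilarityType,
    CosmosDBVectorSearchType,
)
from langchain_openai import OpenAIEmbeddings

set_llm_cache(
    AzureCosmosDBSemanticCache(
        cosmosdb_connection_string="<your vCore connection string>",
        cosmosdb_client=None,
        embedding=OpenAIEmbeddings(),
        database_name="langchain_test_db",
        collection_name="langchain_test_collection",
        num_lists=3,
        dimensions=1536,
        similarity=CosmosDBSimilarityType.COS,
        kind=CosmosDBVectorSearchType.VECTOR_IVF,
        m=16,
        ef_construction=64,
        ef_search=40,
        score_threshold=0.1,
    )
)
```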
Bagatur
5fbee12819 docs: anthropic quickstart (#18440) 2024-03-03 17:29:32 -08:00
Bagatur
7ae9bcb569 docs: anthropic qa quickstart (#18459) 2024-03-03 17:29:32 -08:00
Harrison Chase
9577bfa967 more query analysis docs (#18358) 2024-03-03 17:29:32 -08:00
William Fu-Hinthorn
26f467ed46 merge 2024-03-03 17:29:24 -08:00
Erick Friis
ed395994ae community[patch]: release 0.0.25 (#18408) 2024-03-03 17:27:36 -08:00
aditya thomas
ba24ee4304 infra: update to pathspec for 'git grep' in lint check (#18178)
**Description:** Update to the pathspec for 'git grep' in lint check in
the Makefile
**Issue:** The pathspec {docs/docs,templates,cookbook} is not handled
correctly leading to the error during 'make lint' -
"fatal: ambiguous argument '{docs/docs,templates,cookbook}': unknown
revision or path not in the working tree."
See changes made in https://github.com/langchain-ai/langchain/pull/18058

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-03 17:27:35 -08:00
William FH
82cdaf3d9b Merge branch 'master' into wfh/specify_version 2024-03-01 14:02:25 -08:00
William Fu-Hinthorn
614f3c45f0 Support as_of 2024-03-01 14:01:02 -08:00
William Fu-Hinthorn
e98be8bc9b Add version 2024-03-01 11:54:03 -08:00
54 changed files with 4914 additions and 1986 deletions

View File

@@ -50,7 +50,7 @@ lint lint_package lint_tests:
poetry run ruff docs templates cookbook
poetry run ruff format docs templates cookbook --diff
poetry run ruff --select I docs templates cookbook
git grep 'from langchain import' {docs/docs,templates,cookbook} | grep -vE 'from langchain import (hub)' && exit 1 || exit 0
git grep 'from langchain import' docs/docs templates cookbook | grep -vE 'from langchain import (hub)' && exit 1 || exit 0
format format_diff:
poetry run ruff format docs templates cookbook

View File

@@ -65,10 +65,10 @@ We will link to relevant docs.
## LLM Chain
We'll show how to use models available via API, like OpenAI and Cohere, and local open source models, using integrations like Ollama.
We'll show how to use models available via API, like OpenAI, and local open source models, using integrations like Ollama.
<Tabs>
<TabItem value="openai" label="OpenAI (API)" default>
<TabItem value="openai" label="OpenAI" default>
First we'll need to import the LangChain x OpenAI integration package.
@@ -115,7 +115,36 @@ llm = Ollama(model="llama2")
```
</TabItem>
<TabItem value="cohere" label="Cohere (API)" default>
<TabItem value="anthropic" label="Anthropic">
First we'll need to import the LangChain x Anthropic package.
```shell
pip install langchain-anthropic
```
Accessing the API requires an API key, which you can get by creating an account [here](https://claude.ai/login). Once we have a key we'll want to set it as an environment variable by running:
```shell
export ANTHROPIC_API_KEY="..."
```
We can then initialize the model:
```python
from langchain_anthropic import ChatAnthropic
llm = ChatAnthropic(model="claude-2.1", temperature=0.2, max_tokens=1024)
```
If you'd prefer not to set an environment variable you can pass the key in directly via the `anthropic_api_key` named parameter when initiating the Anthropic Chat Model class:
```python
llm = ChatAnthropic(anthropic_api_key="...")
```
</TabItem>
<TabItem value="cohere" label="Cohere">
First we'll need to import the Cohere SDK package.

View File

@@ -12,9 +12,14 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 15,
"id": "10ad9224",
"metadata": {},
"metadata": {
"ExecuteTime": {
"end_time": "2024-02-02T21:34:23.461332Z",
"start_time": "2024-02-02T21:34:23.394461Z"
}
},
"outputs": [],
"source": [
"from langchain.globals import set_llm_cache\n",
@@ -1349,6 +1354,144 @@
"print(llm(\"Is is possible that something false can be also true?\"))"
]
},
{
"cell_type": "markdown",
"source": [
"## Azure Cosmos DB Semantic Cache"
],
"metadata": {
"collapsed": false
},
"id": "40624c26e86b57a4"
},
{
"cell_type": "code",
"outputs": [],
"source": [
"from langchain.cache import AzureCosmosDBSemanticCache\n",
"from langchain_community.vectorstores.azure_cosmos_db import (\n",
" CosmosDBSimilarityType,\n",
" CosmosDBVectorSearchType,\n",
")\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"# Read more about Azure CosmosDB Mongo vCore vector search here https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/vector-search\n",
"\n",
"INDEX_NAME = \"langchain-test-index\"\n",
"NAMESPACE = \"langchain_test_db.langchain_test_collection\"\n",
"CONNECTION_STRING = (\n",
" \"Please provide your azure cosmos mongo vCore vector db connection string\"\n",
")\n",
"DB_NAME, COLLECTION_NAME = NAMESPACE.split(\".\")\n",
"\n",
"# Default value for these params\n",
"num_lists = 3\n",
"dimensions = 1536\n",
"similarity_algorithm = CosmosDBSimilarityType.COS\n",
"kind = CosmosDBVectorSearchType.VECTOR_IVF\n",
"m = 16\n",
"ef_construction = 64\n",
"ef_search = 40\n",
"score_threshold = 0.1\n",
"\n",
"set_llm_cache(\n",
" AzureCosmosDBSemanticCache(\n",
" cosmosdb_connection_string=CONNECTION_STRING,\n",
" cosmosdb_client=None,\n",
" embedding=OpenAIEmbeddings(),\n",
" database_name=DB_NAME,\n",
" collection_name=COLLECTION_NAME,\n",
" num_lists=num_lists,\n",
" similarity=similarity_algorithm,\n",
" kind=kind,\n",
" dimensions=dimensions,\n",
" m=m,\n",
" ef_construction=ef_construction,\n",
" ef_search=ef_search,\n",
" score_threshold=score_threshold,\n",
" )\n",
")"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-02-02T21:34:49.457001Z",
"start_time": "2024-02-02T21:34:49.411293Z"
}
},
"id": "4a9d592db01b11b2",
"execution_count": 16
},
{
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 43.4 ms, sys: 7.23 ms, total: 50.7 ms\n",
"Wall time: 1.61 s\n"
]
},
{
"data": {
"text/plain": "\"\\n\\nWhy couldn't the bicycle stand up by itself?\\n\\nBecause it was two-tired!\""
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"# The first time, it is not yet in cache, so it should take longer\n",
"llm(\"Tell me a joke\")"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-02-02T21:34:53.704234Z",
"start_time": "2024-02-02T21:34:52.091096Z"
}
},
"id": "8488cf9c97ec7ab",
"execution_count": 17
},
{
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 6.89 ms, sys: 2.24 ms, total: 9.13 ms\n",
"Wall time: 337 ms\n"
]
},
{
"data": {
"text/plain": "\"\\n\\nWhy couldn't the bicycle stand up by itself?\\n\\nBecause it was two-tired!\""
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"# The first time, it is not yet in cache, so it should take longer\n",
"llm(\"Tell me a joke\")"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-02-02T21:34:56.004502Z",
"start_time": "2024-02-02T21:34:55.650136Z"
}
},
"id": "bc1570a2a77b58c8",
"execution_count": 18
},
{
"cell_type": "markdown",
"id": "0c69d84d",

File diff suppressed because one or more lines are too long

View File

@@ -23,24 +23,34 @@
" "
]
},
{
"cell_type": "markdown",
"source": [],
"metadata": {
"collapsed": false
},
"id": "8c493e205ce1dda5"
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 1,
"id": "ab8e45f5bd435ade",
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-10-10T17:20:00.721985Z",
"start_time": "2023-10-10T17:19:57.996265Z"
},
"collapsed": false
"end_time": "2024-02-08T18:25:05.278480Z",
"start_time": "2024-02-08T18:24:51.560677Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: pymongo in /Users/iekpo/Langchain/langchain-python/.venv/lib/python3.10/site-packages (4.5.0)\r\n",
"Requirement already satisfied: dnspython<3.0.0,>=1.16.0 in /Users/iekpo/Langchain/langchain-python/.venv/lib/python3.10/site-packages (from pymongo) (2.4.2)\r\n"
"\r\n",
"\u001B[1m[\u001B[0m\u001B[34;49mnotice\u001B[0m\u001B[1;39;49m]\u001B[0m\u001B[39;49m A new release of pip is available: \u001B[0m\u001B[31;49m23.2.1\u001B[0m\u001B[39;49m -> \u001B[0m\u001B[32;49m23.3.2\u001B[0m\r\n",
"\u001B[1m[\u001B[0m\u001B[34;49mnotice\u001B[0m\u001B[1;39;49m]\u001B[0m\u001B[39;49m To update, run: \u001B[0m\u001B[32;49mpip install --upgrade pip\u001B[0m\r\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
@@ -50,20 +60,20 @@
},
{
"cell_type": "code",
"execution_count": 24,
"execution_count": 2,
"id": "9c7ce9e7b26efbb0",
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-10-10T17:50:03.615234Z",
"start_time": "2023-10-10T17:50:03.604289Z"
},
"collapsed": false
"end_time": "2024-02-08T18:25:56.926147Z",
"start_time": "2024-02-08T18:25:56.900087Z"
}
},
"outputs": [],
"source": [
"import os\n",
"\n",
"CONNECTION_STRING = \"AZURE COSMOS DB MONGO vCORE connection string\"\n",
"CONNECTION_STRING = \"YOUR_CONNECTION_STRING\"\n",
"INDEX_NAME = \"izzy-test-index\"\n",
"NAMESPACE = \"izzy_test_db.izzy_test_collection\"\n",
"DB_NAME, COLLECTION_NAME = NAMESPACE.split(\".\")"
@@ -81,14 +91,14 @@
},
{
"cell_type": "code",
"execution_count": 25,
"execution_count": 3,
"id": "4a052d99c6b8a2a7",
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-10-10T17:50:11.712929Z",
"start_time": "2023-10-10T17:50:11.703871Z"
},
"collapsed": false
"end_time": "2024-02-08T18:26:06.558294Z",
"start_time": "2024-02-08T18:26:06.550008Z"
}
},
"outputs": [],
"source": [
@@ -98,7 +108,7 @@
"os.environ[\n",
" \"OPENAI_API_BASE\"\n",
"] = \"YOUR_OPEN_AI_ENDPOINT\" # https://example.openai.azure.com/\n",
"os.environ[\"OPENAI_API_KEY\"] = \"YOUR_OPEN_AI_KEY\"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"YOUR_OPENAI_API_KEY\"\n",
"os.environ[\n",
" \"OPENAI_EMBEDDINGS_DEPLOYMENT\"\n",
"] = \"smart-agent-embedding-ada\" # the deployment name for the embedding model\n",
@@ -119,14 +129,14 @@
},
{
"cell_type": "code",
"execution_count": 26,
"execution_count": 4,
"id": "183741cf8f4c7c53",
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-10-10T17:50:16.732718Z",
"start_time": "2023-10-10T17:50:16.716642Z"
},
"collapsed": false
"end_time": "2024-02-08T18:27:00.782280Z",
"start_time": "2024-02-08T18:26:47.339151Z"
}
},
"outputs": [],
"source": [
@@ -134,6 +144,7 @@
"from langchain_community.vectorstores.azure_cosmos_db import (\n",
" AzureCosmosDBVectorSearch,\n",
" CosmosDBSimilarityType,\n",
" CosmosDBVectorSearchType,\n",
")\n",
"from langchain_openai import OpenAIEmbeddings\n",
"from langchain_text_splitters import CharacterTextSplitter\n",
@@ -159,21 +170,21 @@
},
{
"cell_type": "code",
"execution_count": 28,
"execution_count": 5,
"id": "39ae6058c2f7fdf1",
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-10-10T17:51:17.980698Z",
"start_time": "2023-10-10T17:51:11.786336Z"
},
"collapsed": false
"end_time": "2024-02-08T18:31:13.486173Z",
"start_time": "2024-02-08T18:30:54.175890Z"
}
},
"outputs": [
{
"data": {
"text/plain": "{'raw': {'defaultShard': {'numIndexesBefore': 2,\n 'numIndexesAfter': 3,\n 'createdCollectionAutomatically': False,\n 'ok': 1}},\n 'ok': 1}"
"text/plain": "{'raw': {'defaultShard': {'numIndexesBefore': 1,\n 'numIndexesAfter': 2,\n 'createdCollectionAutomatically': False,\n 'ok': 1}},\n 'ok': 1}"
},
"execution_count": 28,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
@@ -181,9 +192,9 @@
"source": [
"from pymongo import MongoClient\n",
"\n",
"INDEX_NAME = \"izzy-test-index-2\"\n",
"NAMESPACE = \"izzy_test_db.izzy_test_collection\"\n",
"DB_NAME, COLLECTION_NAME = NAMESPACE.split(\".\")\n",
"# INDEX_NAME = \"izzy-test-index-2\"\n",
"# NAMESPACE = \"izzy_test_db.izzy_test_collection\"\n",
"# DB_NAME, COLLECTION_NAME = NAMESPACE.split(\".\")\n",
"\n",
"client: MongoClient = MongoClient(CONNECTION_STRING)\n",
"collection = client[DB_NAME][COLLECTION_NAME]\n",
@@ -200,23 +211,31 @@
" index_name=INDEX_NAME,\n",
")\n",
"\n",
"# Read more about these variables in detail here. https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/vector-search\n",
"num_lists = 100\n",
"dimensions = 1536\n",
"similarity_algorithm = CosmosDBSimilarityType.COS\n",
"kind = CosmosDBVectorSearchType.VECTOR_IVF\n",
"m = 16\n",
"ef_construction = 64\n",
"ef_search = 40\n",
"score_threshold = 0.1\n",
"\n",
"vectorstore.create_index(num_lists, dimensions, similarity_algorithm)"
"vectorstore.create_index(\n",
" num_lists, dimensions, similarity_algorithm, kind, m, ef_construction\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 29,
"execution_count": 6,
"id": "32c68d3246adc21f",
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-10-10T17:51:44.840121Z",
"start_time": "2023-10-10T17:51:44.498639Z"
},
"collapsed": false
"end_time": "2024-02-08T18:31:47.468902Z",
"start_time": "2024-02-08T18:31:46.053602Z"
}
},
"outputs": [],
"source": [
@@ -227,14 +246,14 @@
},
{
"cell_type": "code",
"execution_count": 31,
"execution_count": 7,
"id": "8feeeb4364efb204",
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-10-10T17:52:08.049294Z",
"start_time": "2023-10-10T17:52:08.038511Z"
},
"collapsed": false
"end_time": "2024-02-08T18:31:50.982598Z",
"start_time": "2024-02-08T18:31:50.977605Z"
}
},
"outputs": [
{
@@ -267,14 +286,14 @@
},
{
"cell_type": "code",
"execution_count": 32,
"execution_count": 8,
"id": "3c218ab6f59301f7",
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-10-10T17:52:14.994861Z",
"start_time": "2023-10-10T17:52:13.986379Z"
},
"collapsed": false
"end_time": "2024-02-08T18:32:14.299599Z",
"start_time": "2024-02-08T18:32:12.923464Z"
}
},
"outputs": [
{
@@ -305,14 +324,14 @@
},
{
"cell_type": "code",
"execution_count": 33,
"execution_count": 9,
"id": "fd67e4d92c9ab32f",
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-10-10T17:53:21.145431Z",
"start_time": "2023-10-10T17:53:20.884531Z"
},
"collapsed": false
"end_time": "2024-02-08T18:32:24.021434Z",
"start_time": "2024-02-08T18:32:22.867658Z"
}
},
"outputs": [
{

Binary file not shown.

Before

Width:  |  Height:  |  Size: 766 KiB

After

Width:  |  Height:  |  Size: 865 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 815 KiB

After

Width:  |  Height:  |  Size: 839 KiB

File diff suppressed because it is too large

View File

@@ -7,7 +7,7 @@ sidebar_position: 0
The quick start will cover the basics of working with language models. It will introduce the two different types of models - LLMs and ChatModels. It will then cover how to use PromptTemplates to format the inputs to these models, and how to use Output Parsers to work with the outputs. For a deeper conceptual guide into these topics - please see [this documentation](./concepts)
## Models
For this getting started guide, we will provide two options: using OpenAI (a popular model available via API) or using a local open source model.
For this getting started guide, we will provide a few options: using an API like Anthropic or OpenAI, or using a local open source model via Ollama.
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
@@ -62,6 +62,35 @@ from langchain_community.chat_models import ChatOllama
llm = Ollama(model="llama2")
chat_model = ChatOllama()
```
</TabItem>
<TabItem value="anthropic" label="Anthropic (chat model only)">
First we'll need to import the LangChain x Anthropic package.
```shell
pip install langchain-anthropic
```
Accessing the API requires an API key, which you can get by creating an account [here](https://claude.ai/login). Once we have a key we'll want to set it as an environment variable by running:
```shell
export ANTHROPIC_API_KEY="..."
```
We can then initialize the model:
```python
from langchain_anthropic import ChatAnthropic
chat_model = ChatAnthropic(model="claude-2.1", temperature=0.2, max_tokens=1024)
```
If you'd prefer not to set an environment variable you can pass the key in directly via the `anthropic_api_key` named parameter when initiating the Anthropic Chat Model class:
```python
chat_model = ChatAnthropic(anthropic_api_key="...")
```
</TabItem>
@@ -84,7 +113,7 @@ We can then initialize the model:
```python
from langchain_community.chat_models import ChatCohere
llm = ChatCohere()
chat_model = ChatCohere()
```
If you'd prefer not to set an environment variable you can pass the key in directly via the `cohere_api_key` named parameter when initiating the Cohere LLM class:
@@ -92,7 +121,7 @@ If you'd prefer not to set an environment variable you can pass the key in direc
```python
from langchain_community.chat_models import ChatCohere
llm = ChatCohere(cohere_api_key="...")
chat_model = ChatCohere(cohere_api_key="...")
```
</TabItem>

View File

@@ -0,0 +1,2 @@
position: 2
label: 'How-To Guides'

View File

@@ -0,0 +1,190 @@
{
"cells": [
{
"cell_type": "raw",
"id": "df7d42b9-58a6-434c-a2d7-0b61142f6d3e",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 6\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Construct Filters\n",
"\n",
"We may want to do query analysis to extract filters to pass into retrievers. One way we ask the LLM to represent these filters is as a Pydantic model. There is then the issue of converting that Pydantic model into a filter that can be passed into a retriever. \n",
"\n",
"This can be done manually, but LangChain also provides some \"Translators\" that are able to translate from a common syntax into filters specific to each retriever. Here, we will cover how to use those translators."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "8ca446a0",
"metadata": {},
"outputs": [],
"source": [
"from typing import Optional\n",
"\n",
"from langchain.chains.query_constructor.ir import (\n",
" Comparator,\n",
" Comparison,\n",
" Operation,\n",
" Operator,\n",
" StructuredQuery,\n",
")\n",
"from langchain.retrievers.self_query.chroma import ChromaTranslator\n",
"from langchain.retrievers.self_query.elasticsearch import ElasticsearchTranslator\n",
"from langchain_core.pydantic_v1 import BaseModel"
]
},
{
"cell_type": "markdown",
"id": "bc1302ff",
"metadata": {},
"source": [
"In this example, `year` and `author` are both attributes to filter on."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "64055006",
"metadata": {},
"outputs": [],
"source": [
"class Search(BaseModel):\n",
" query: str\n",
" start_year: Optional[int]\n",
" author: Optional[str]"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "44eb6d98",
"metadata": {},
"outputs": [],
"source": [
"search_query = Search(query=\"RAG\", start_year=2022, author=\"LangChain\")"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "e8ba6705",
"metadata": {},
"outputs": [],
"source": [
"def construct_comparisons(query: Search):\n",
" comparisons = []\n",
" if query.start_year is not None:\n",
" comparisons.append(\n",
" Comparison(\n",
" comparator=Comparator.GT,\n",
" attribute=\"start_year\",\n",
" value=query.start_year,\n",
" )\n",
" )\n",
" if query.author is not None:\n",
" comparisons.append(\n",
" Comparison(\n",
" comparator=Comparator.EQ,\n",
" attribute=\"author\",\n",
" value=query.author,\n",
" )\n",
" )\n",
" return comparisons"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "6a79c9da",
"metadata": {},
"outputs": [],
"source": [
"comparisons = construct_comparisons(search_query)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "2d0e9689",
"metadata": {},
"outputs": [],
"source": [
"_filter = Operation(operator=Operator.AND, arguments=comparisons)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "e4c0b2ce",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'bool': {'must': [{'range': {'metadata.start_year': {'gt': 2022}}},\n",
" {'term': {'metadata.author.keyword': 'LangChain'}}]}}"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ElasticsearchTranslator().visit_operation(_filter)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "d75455ae",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'$and': [{'start_year': {'$gt': 2022}}, {'author': {'$eq': 'LangChain'}}]}"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ChromaTranslator().visit_operation(_filter)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -15,9 +15,9 @@
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Adding examples to the prompt\n",
"# Add Examples to the Prompt\n",
"\n",
"As our query analysis becomes more complex, adding examples to the prompt can meaningfully improve performance.\n",
"As our query analysis becomes more complex, the LLM may struggle to understand how exactly it should respond in certain scenarios. In order to improve performance here, we can add examples to the prompt to guide the LLM.\n",
"\n",
"Let's take a look at how we can add examples for the LangChain YouTube video query analyzer we built in the [Quickstart](/docs/use_cases/query_analysis/quickstart)."
]
@@ -377,7 +377,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.1"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,585 @@
{
"cells": [
{
"cell_type": "raw",
"id": "df7d42b9-58a6-434c-a2d7-0b61142f6d3e",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 7\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Deal with High Cardinality Categoricals\n",
"\n",
"You may want to do query analysis to create a filter on a categorical column. One of the difficulties here is that you usually need to specify the EXACT categorical value. The issue is you need to make sure the LLM generates that categorical value exactly. This can be done relatively easy with prompting when there are only a few values that are valid. When there are a high number of valid values then it becomes more difficult, as those values may not fit in the LLM context, or (if they do) there may be too many for the LLM to properly attend to.\n",
"\n",
"In this notebook we take a look at how to approach this."
]
},
{
"cell_type": "markdown",
"id": "a4079b57-4369-49c9-b2ad-c809b5408d7e",
"metadata": {},
"source": [
"## Setup\n",
"#### Install dependencies"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e168ef5c-e54e-49a6-8552-5502854a6f01",
"metadata": {},
"outputs": [],
"source": [
"# %pip install -qU langchain langchain-community langchain-openai faker"
]
},
{
"cell_type": "markdown",
"id": "79d66a45-a05c-4d22-b011-b1cdbdfc8f9c",
"metadata": {},
"source": [
"#### Set environment variables\n",
"\n",
"We'll use OpenAI in this example:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "40e2979e-a818-4b96-ac25-039336f94319",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# Optional, uncomment to trace runs with LangSmith. Sign up here: https://smith.langchain.com.\n",
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "d8d47f4b",
"metadata": {},
"source": [
"#### Set up data\n",
"\n",
"We will generate a bunch of fake names"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e5ba65c2",
"metadata": {},
"outputs": [],
"source": [
"from faker import Faker\n",
"\n",
"fake = Faker()\n",
"\n",
"names = [fake.name() for _ in range(10000)]"
]
},
{
"cell_type": "markdown",
"id": "41133694",
"metadata": {},
"source": [
"Let's look at some of the names"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "c901ea97",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Hayley Gonzalez'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"names[0]"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "b0d42ae2",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Jesse Knight'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"names[567]"
]
},
{
"cell_type": "markdown",
"id": "1725883d",
"metadata": {},
"source": [
"## Query Analysis\n",
"\n",
"We can now set up a baseline query analysis"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "0ae69afc",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.pydantic_v1 import BaseModel, Field"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "6c9485ce",
"metadata": {},
"outputs": [],
"source": [
"class Search(BaseModel):\n",
" query: str\n",
" author: str"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "aebd704a",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/harrisonchase/workplace/langchain/libs/core/langchain_core/_api/beta_decorator.py:86: LangChainBetaWarning: The function `with_structured_output` is in beta. It is actively being worked on, so the API may change.\n",
" warn_beta(\n"
]
}
],
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"system = \"\"\"Generate a relevant search query for a library system\"\"\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"structured_llm = llm.with_structured_output(Search)\n",
"query_analyzer = {\"question\": RunnablePassthrough()} | prompt | structured_llm"
]
},
{
"cell_type": "markdown",
"id": "41709a2e",
"metadata": {},
"source": [
"We can see that if we spell the name exactly correctly, it knows how to handle it"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "cc0d344b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Search(query='books about aliens', author='Jesse Knight')"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke(\"what are books about aliens by Jesse Knight\")"
]
},
{
"cell_type": "markdown",
"id": "a1b57eab",
"metadata": {},
"source": [
"The issue is that the values you want to filter on may NOT be spelled exactly correctly"
]
},
{
"cell_type": "code",
"execution_count": 34,
"id": "82b6b2ad",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Search(query='books about aliens', author='Jess Knight')"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke(\"what are books about aliens by jess knight\")"
]
},
{
"cell_type": "markdown",
"id": "0b60b7c2",
"metadata": {},
"source": [
"### Add in all values\n",
"\n",
"One way around this is to add ALL possible values to the prompt. That will generally guide the query in the right direction"
]
},
{
"cell_type": "code",
"execution_count": 35,
"id": "98788a94",
"metadata": {},
"outputs": [],
"source": [
"system = \"\"\"Generate a relevant search query for a library system.\n",
"\n",
"`author` attribute MUST be one of:\n",
"\n",
"{authors}\n",
"\n",
"Do NOT hallucinate author name!\"\"\"\n",
"base_prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"prompt = base_prompt.partial(authors=\", \".join(names))"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "e65412f5",
"metadata": {},
"outputs": [],
"source": [
"query_analyzer_all = {\"question\": RunnablePassthrough()} | prompt | structured_llm"
]
},
{
"cell_type": "markdown",
"id": "e639285a",
"metadata": {},
"source": [
"However... if the list of categoricals is long enough, it may error!"
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "696b000f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Error code: 400 - {'error': {'message': \"This model's maximum context length is 16385 tokens. However, your messages resulted in 33885 tokens (33855 in the messages, 30 in the functions). Please reduce the length of the messages or functions.\", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}\n"
]
}
],
"source": [
"try:\n",
" res = query_analyzer_all.invoke(\"what are books about aliens by jess knight\")\n",
"except Exception as e:\n",
" print(e)"
]
},
{
"cell_type": "markdown",
"id": "1d5d7891",
"metadata": {},
"source": [
"We can try to use a longer context window... but with so much information in there, it is not garunteed to pick it up reliably"
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "0f0d0757",
"metadata": {},
"outputs": [],
"source": [
"llm_long = ChatOpenAI(model=\"gpt-4-turbo-preview\", temperature=0)\n",
"structured_llm_long = llm_long.with_structured_output(Search)\n",
"query_analyzer_all = {\"question\": RunnablePassthrough()} | prompt | structured_llm_long"
]
},
{
"cell_type": "code",
"execution_count": 39,
"id": "03e5b7b2",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Search(query='aliens', author='Kevin Knight')"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer_all.invoke(\"what are books about aliens by jess knight\")"
]
},
{
"cell_type": "markdown",
"id": "73ecf52b",
"metadata": {},
"source": [
"### Find and all relevant values\n",
"\n",
"Instead, what we can do is create an index over the relevant values and then query that for the N most relevant values,"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "32b19e07",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"embeddings = OpenAIEmbeddings(model=\"text-embedding-3-small\")\n",
"vectorstore = Chroma.from_texts(names, embeddings, collection_name=\"author_names\")"
]
},
{
"cell_type": "code",
"execution_count": 51,
"id": "774cb7b0",
"metadata": {},
"outputs": [],
"source": [
"def select_names(question):\n",
" _docs = vectorstore.similarity_search(question, k=10)\n",
" _names = [d.page_content for d in _docs]\n",
" return \", \".join(_names)"
]
},
{
"cell_type": "code",
"execution_count": 52,
"id": "1173159c",
"metadata": {},
"outputs": [],
"source": [
"create_prompt = {\n",
" \"question\": RunnablePassthrough(),\n",
" \"authors\": select_names,\n",
"} | base_prompt"
]
},
{
"cell_type": "code",
"execution_count": 53,
"id": "0a892607",
"metadata": {},
"outputs": [],
"source": [
"query_analyzer_select = create_prompt | structured_llm"
]
},
{
"cell_type": "code",
"execution_count": 54,
"id": "8195d7cd",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"ChatPromptValue(messages=[SystemMessage(content='Generate a relevant search query for a library system.\\n\\n`author` attribute MUST be one of:\\n\\nJesse Knight, Kelly Knight, Scott Knight, Richard Knight, Andrew Knight, Katherine Knight, Erica Knight, Ashley Knight, Becky Knight, Kevin Knight\\n\\nDo NOT hallucinate author name!'), HumanMessage(content='what are books by jess knight')])"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"create_prompt.invoke(\"what are books by jess knight\")"
]
},
{
"cell_type": "code",
"execution_count": 55,
"id": "d3228b4e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Search(query='books about aliens', author='Jesse Knight')"
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer_select.invoke(\"what are books about aliens by jess knight\")"
]
},
{
"cell_type": "markdown",
"id": "46ef88bb",
"metadata": {},
"source": [
"### Replace after selection\n",
"\n",
"Another method is to let the LLM fill in whatever value, but then convert that value to a valid value.\n",
"This can actually be done with the Pydantic class itself!"
]
},
{
"cell_type": "code",
"execution_count": 47,
"id": "a2e8b434",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.pydantic_v1 import validator\n",
"\n",
"\n",
"class Search(BaseModel):\n",
" query: str\n",
" author: str\n",
"\n",
" @validator(\"author\")\n",
" def double(cls, v: str) -> str:\n",
" return vectorstore.similarity_search(v, k=1)[0].page_content"
]
},
{
"cell_type": "code",
"execution_count": 48,
"id": "919c0601",
"metadata": {},
"outputs": [],
"source": [
"system = \"\"\"Generate a relevant search query for a library system\"\"\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"corrective_structure_llm = llm.with_structured_output(Search)\n",
"corrective_query_analyzer = (\n",
" {\"question\": RunnablePassthrough()} | prompt | corrective_structure_llm\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 50,
"id": "6c4f3e9a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Search(query='books about aliens', author='Jesse Knight')"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"corrective_query_analyzer.invoke(\"what are books about aliens by jes knight\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a309cb11",
"metadata": {},
"outputs": [],
"source": [
"# TODO: show trigram similarity"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,329 @@
{
"cells": [
{
"cell_type": "raw",
"id": "df7d42b9-58a6-434c-a2d7-0b61142f6d3e",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 4\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Handle Multiple Queries\n",
"\n",
"Sometimes, a query analysis technique may allow for multiple queries to be generated. In these cases, we need to remember to run all queries and then to combine the results. We will show a simple example (using mock data) of how to do that."
]
},
{
"cell_type": "markdown",
"id": "a4079b57-4369-49c9-b2ad-c809b5408d7e",
"metadata": {},
"source": [
"## Setup\n",
"#### Install dependencies"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e168ef5c-e54e-49a6-8552-5502854a6f01",
"metadata": {},
"outputs": [],
"source": [
"# %pip install -qU langchain langchain-community langchain-openai chromadb"
]
},
{
"cell_type": "markdown",
"id": "79d66a45-a05c-4d22-b011-b1cdbdfc8f9c",
"metadata": {},
"source": [
"#### Set environment variables\n",
"\n",
"We'll use OpenAI in this example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "40e2979e-a818-4b96-ac25-039336f94319",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# Optional, uncomment to trace runs with LangSmith. Sign up here: https://smith.langchain.com.\n",
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "c20b48b8-16d7-4089-bc17-f2d240b3935a",
"metadata": {},
"source": [
"### Create Index\n",
"\n",
"We will create a vectorstore over fake information."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "1f621694",
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"texts = [\"Harrison worked at Kensho\", \"Ankush worked at Facebook\"]\n",
"embeddings = OpenAIEmbeddings(model=\"text-embedding-3-small\")\n",
"vectorstore = Chroma.from_texts(\n",
" texts,\n",
" embeddings,\n",
")\n",
"retriever = vectorstore.as_retriever(search_kwargs={\"k\": 1})"
]
},
{
"cell_type": "markdown",
"id": "57396e23-c192-4d97-846b-5eacea4d6b8d",
"metadata": {},
"source": [
"## Query analysis\n",
"\n",
"We will use function calling to structure the output. We will let it return multiple queries."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "0b51dd76-820d-41a4-98c8-893f6fe0d1ea",
"metadata": {},
"outputs": [],
"source": [
"from typing import List, Optional\n",
"\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"class Search(BaseModel):\n",
" \"\"\"Search over a database of job records.\"\"\"\n",
"\n",
" queries: List[str] = Field(\n",
" ...,\n",
" description=\"Distinct queries to search for\",\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "783c03c3-8c72-4f88-9cf4-5829ce6745d6",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/harrisonchase/workplace/langchain/libs/core/langchain_core/_api/beta_decorator.py:86: LangChainBetaWarning: The function `with_structured_output` is in beta. It is actively being worked on, so the API may change.\n",
" warn_beta(\n"
]
}
],
"source": [
"from langchain_core.output_parsers.openai_tools import PydanticToolsParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"output_parser = PydanticToolsParser(tools=[Search])\n",
"\n",
"system = \"\"\"You have the ability to issue search queries to get information to help answer user information.\n",
"\n",
"If you need to look up two distinct pieces of information, you are allowed to do that!\"\"\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"structured_llm = llm.with_structured_output(Search)\n",
"query_analyzer = {\"question\": RunnablePassthrough()} | prompt | structured_llm"
]
},
{
"cell_type": "markdown",
"id": "b9564078",
"metadata": {},
"source": [
"We can see that this allows for creating multiple queries"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "bc1d3863",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Search(queries=['Harrison work location'])"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke(\"where did Harrison Work\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "af62af17-4f90-4dbd-a8b4-dfff51f1db95",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Search(queries=['Harrison work place', 'Ankush work place'])"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke(\"where did Harrison and ankush Work\")"
]
},
{
"cell_type": "markdown",
"id": "c7c65b2f-7881-45fc-a47b-a4eaaf48245f",
"metadata": {},
"source": [
"## Retrieval with query analysis\n",
"\n",
"So how would we include this in a chain? One thing that will make this a lot easier is if we call our retriever asyncronously - this will let us loop over the queries and not get blocked on the response time."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "1e047d87",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.runnables import chain"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "8dac7866",
"metadata": {},
"outputs": [],
"source": [
"@chain\n",
"async def custom_chain(question):\n",
" response = await query_analyzer.ainvoke(question)\n",
" docs = []\n",
" for query in response.queries:\n",
" new_docs = await retriever.ainvoke(query)\n",
" docs.extend(new_docs)\n",
" # You probably want to think about reranking or deduplicating documents here\n",
" # But that is a separate topic\n",
" return docs"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "232ad8a7-7990-4066-9228-d35a555f7293",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Harrison worked at Kensho')]"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"await custom_chain.ainvoke(\"where did Harrison Work\")"
]
},
{
"cell_type": "code",
"execution_count": 34,
"id": "28e14ba5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Harrison worked at Kensho'),\n",
" Document(page_content='Ankush worked at Facebook')]"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"await custom_chain.ainvoke(\"where did Harrison and ankush Work\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "88de5a36",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,331 @@
{
"cells": [
{
"cell_type": "raw",
"id": "df7d42b9-58a6-434c-a2d7-0b61142f6d3e",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 5\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Handle Multiple Retrievers\n",
"\n",
"Sometimes, a query analysis technique may allow for selection of which retriever to use. To use this, you will need to add some logic to select the retriever to do. We will show a simple example (using mock data) of how to do that."
]
},
{
"cell_type": "markdown",
"id": "a4079b57-4369-49c9-b2ad-c809b5408d7e",
"metadata": {},
"source": [
"## Setup\n",
"#### Install dependencies"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e168ef5c-e54e-49a6-8552-5502854a6f01",
"metadata": {},
"outputs": [],
"source": [
"# %pip install -qU langchain langchain-community langchain-openai chromadb"
]
},
{
"cell_type": "markdown",
"id": "79d66a45-a05c-4d22-b011-b1cdbdfc8f9c",
"metadata": {},
"source": [
"#### Set environment variables\n",
"\n",
"We'll use OpenAI in this example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "40e2979e-a818-4b96-ac25-039336f94319",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# Optional, uncomment to trace runs with LangSmith. Sign up here: https://smith.langchain.com.\n",
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "c20b48b8-16d7-4089-bc17-f2d240b3935a",
"metadata": {},
"source": [
"### Create Index\n",
"\n",
"We will create a vectorstore over fake information."
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "1f621694",
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"texts = [\"Harrison worked at Kensho\"]\n",
"embeddings = OpenAIEmbeddings(model=\"text-embedding-3-small\")\n",
"vectorstore = Chroma.from_texts(texts, embeddings, collection_name=\"harrison\")\n",
"retriever_harrison = vectorstore.as_retriever(search_kwargs={\"k\": 1})\n",
"\n",
"texts = [\"Ankush worked at Facebook\"]\n",
"embeddings = OpenAIEmbeddings(model=\"text-embedding-3-small\")\n",
"vectorstore = Chroma.from_texts(texts, embeddings, collection_name=\"ankush\")\n",
"retriever_ankush = vectorstore.as_retriever(search_kwargs={\"k\": 1})"
]
},
{
"cell_type": "markdown",
"id": "57396e23-c192-4d97-846b-5eacea4d6b8d",
"metadata": {},
"source": [
"## Query analysis\n",
"\n",
"We will use function calling to structure the output. We will let it return multiple queries."
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "0b51dd76-820d-41a4-98c8-893f6fe0d1ea",
"metadata": {},
"outputs": [],
"source": [
"from typing import List, Optional\n",
"\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"class Search(BaseModel):\n",
" \"\"\"Search for information about a person.\"\"\"\n",
"\n",
" query: str = Field(\n",
" ...,\n",
" description=\"Query to look up\",\n",
" )\n",
" person: str = Field(\n",
" ...,\n",
" description=\"Person to look things up for. Should be `HARRISON` or `ANKUSH`.\",\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "783c03c3-8c72-4f88-9cf4-5829ce6745d6",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.output_parsers.openai_tools import PydanticToolsParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"output_parser = PydanticToolsParser(tools=[Search])\n",
"\n",
"system = \"\"\"You have the ability to issue search queries to get information to help answer user information.\"\"\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"structured_llm = llm.with_structured_output(Search)\n",
"query_analyzer = {\"question\": RunnablePassthrough()} | prompt | structured_llm"
]
},
{
"cell_type": "markdown",
"id": "b9564078",
"metadata": {},
"source": [
"We can see that this allows for routing between retrievers"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "bc1d3863",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Search(query='workplace', person='HARRISON')"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke(\"where did Harrison Work\")"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "af62af17-4f90-4dbd-a8b4-dfff51f1db95",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Search(query='workplace', person='ANKUSH')"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke(\"where did ankush Work\")"
]
},
{
"cell_type": "markdown",
"id": "c7c65b2f-7881-45fc-a47b-a4eaaf48245f",
"metadata": {},
"source": [
"## Retrieval with query analysis\n",
"\n",
"So how would we include this in a chain? We just need some simple logic to select the retriever and pass in the search query"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "1e047d87",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.runnables import chain"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "4ec0c7fe",
"metadata": {},
"outputs": [],
"source": [
"retrievers = {\n",
" \"HARRISON\": retriever_harrison,\n",
" \"ANKUSH\": retriever_ankush,\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "8dac7866",
"metadata": {},
"outputs": [],
"source": [
"@chain\n",
"def custom_chain(question):\n",
" response = query_analyzer.invoke(question)\n",
" retriever = retrievers[response.person]\n",
" return retriever.invoke(response.query)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "232ad8a7-7990-4066-9228-d35a555f7293",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Harrison worked at Kensho')]"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"custom_chain.invoke(\"where did Harrison Work\")"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "28e14ba5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Ankush worked at Facebook')]"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"custom_chain.invoke(\"where did ankush Work\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "33338d4f",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,328 @@
{
"cells": [
{
"cell_type": "raw",
"id": "df7d42b9-58a6-434c-a2d7-0b61142f6d3e",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 3\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Handle Cases Where No Queries are Generated\n",
"\n",
"Sometimes, a query analysis technique may allow for any number of queries to be generated - including no queries! In this case, our overall chain will need to inspect the result of the query analysis before deciding whether to call the retriever or not.\n",
"\n",
"We will use mock data for this example."
]
},
{
"cell_type": "markdown",
"id": "a4079b57-4369-49c9-b2ad-c809b5408d7e",
"metadata": {},
"source": [
"## Setup\n",
"#### Install dependencies"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e168ef5c-e54e-49a6-8552-5502854a6f01",
"metadata": {},
"outputs": [],
"source": [
"# %pip install -qU langchain langchain-community langchain-openai chromadb"
]
},
{
"cell_type": "markdown",
"id": "79d66a45-a05c-4d22-b011-b1cdbdfc8f9c",
"metadata": {},
"source": [
"#### Set environment variables\n",
"\n",
"We'll use OpenAI in this example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "40e2979e-a818-4b96-ac25-039336f94319",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# Optional, uncomment to trace runs with LangSmith. Sign up here: https://smith.langchain.com.\n",
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "c20b48b8-16d7-4089-bc17-f2d240b3935a",
"metadata": {},
"source": [
"### Create Index\n",
"\n",
"We will create a vectorstore over fake information."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "1f621694",
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"texts = [\"Harrison worked at Kensho\"]\n",
"embeddings = OpenAIEmbeddings(model=\"text-embedding-3-small\")\n",
"vectorstore = Chroma.from_texts(\n",
" texts,\n",
" embeddings,\n",
")\n",
"retriever = vectorstore.as_retriever()"
]
},
{
"cell_type": "markdown",
"id": "57396e23-c192-4d97-846b-5eacea4d6b8d",
"metadata": {},
"source": [
"## Query analysis\n",
"\n",
"We will use function calling to structure the output. However, we will configure the LLM such that is doesn't NEED to call the function representing a search query (should it decide not to). We will also then use a prompt to do query analysis that explicitly lays when it should and shouldn't make a search."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "0b51dd76-820d-41a4-98c8-893f6fe0d1ea",
"metadata": {},
"outputs": [],
"source": [
"from typing import Optional\n",
"\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"class Search(BaseModel):\n",
" \"\"\"Search over a database of job records.\"\"\"\n",
"\n",
" query: str = Field(\n",
" ...,\n",
" description=\"Similarity search query applied to job record.\",\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "783c03c3-8c72-4f88-9cf4-5829ce6745d6",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"system = \"\"\"You have the ability to issue search queries to get information to help answer user information.\n",
"\n",
"You do not NEED to look things up. If you don't need to, then just respond normally.\"\"\"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", system),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"structured_llm = llm.bind_tools([Search])\n",
"query_analyzer = {\"question\": RunnablePassthrough()} | prompt | structured_llm"
]
},
{
"cell_type": "markdown",
"id": "b9564078",
"metadata": {},
"source": [
"We can see that by invoking this we get an message that sometimes - but not always - returns a tool call."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "bc1d3863",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_ZnoVX4j9Mn8wgChaORyd1cvq', 'function': {'arguments': '{\"query\":\"Harrison\"}', 'name': 'Search'}, 'type': 'function'}]})"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke(\"where did Harrison Work\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "af62af17-4f90-4dbd-a8b4-dfff51f1db95",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='Hello! How can I assist you today?')"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_analyzer.invoke(\"hi!\")"
]
},
{
"cell_type": "markdown",
"id": "c7c65b2f-7881-45fc-a47b-a4eaaf48245f",
"metadata": {},
"source": [
"## Retrieval with query analysis\n",
"\n",
"So how would we include this in a chain? Let's look at an example below."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "1e047d87",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.output_parsers.openai_tools import PydanticToolsParser\n",
"from langchain_core.runnables import chain\n",
"\n",
"output_parser = PydanticToolsParser(tools=[Search])"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "8dac7866",
"metadata": {},
"outputs": [],
"source": [
"@chain\n",
"def custom_chain(question):\n",
" response = query_analyzer.invoke(question)\n",
" if \"tool_calls\" in response.additional_kwargs:\n",
" query = output_parser.invoke(response)\n",
" docs = retriever.invoke(query[0].query)\n",
" # Could add more logic - like another LLM call - here\n",
" return docs\n",
" else:\n",
" return response"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "232ad8a7-7990-4066-9228-d35a555f7293",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Number of requested results 4 is greater than number of elements in index 1, updating n_results = 1\n"
]
},
{
"data": {
"text/plain": [
"[Document(page_content='Harrison worked at Kensho')]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"custom_chain.invoke(\"where did Harrison Work\")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "28e14ba5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='Hello! How can I assist you today?')"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"custom_chain.invoke(\"hi!\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "33338d4f",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -17,29 +17,26 @@
"source": [
"# Query analysis\n",
"\n",
"In any question answering application we need to retrieve information based on a user question. The simplest way to do this involves passing the user question directly to a retriever. However, in many cases it can improve performance by \"optimizing\" the query in some way. This is typically done by an LLM. Specifically, this involves passing the raw question (or list of messages) into an LLM and returning one or more optimized queries, which typically contain a string and optionally other structured information.\n",
"\"Search\" powers many use cases - including the \"retrieval\" part of Retrieval Augmented Generation. The simplest way to do this involves passing the user question directly to a retriever. In order to improve performance, you can also \"optimize\" the query in some way using *query analysis*. This is traditionally done by rule-based techniques, but with the rise of LLMs it is becoming more popular and more feasible to use an LLM for this. Specifically, this involves passing the raw question (or list of messages) into an LLM and returning one or more optimized queries, which typically contain a string and optionally other structured information.\n",
"\n",
"![Query Analysis](../../../static/img/query_analysis.png)\n",
"\n",
"## Background Information\n",
"\n",
"This guide assumes familiarity with the basic building blocks of a simple RAG application outlined in the [Q&A with RAG Quickstart](/docs/use_cases/question_answering/quickstart). Please read and understand that before diving in here.\n",
"\n",
"## Problems Solved\n",
"\n",
"Query analysis helps solves problems where the user question is not optimal to pass into the retriever. This can be the case when:\n",
"Query analysis helps to optimize the search query to send to the retriever. This can be the case when:\n",
"\n",
"* The retriever supports searches and filters against specific fields of the data, and user input could be referring to any of these fields,\n",
"* The user input contains multiple distinct questions in it,\n",
"* To get the relevant information multiple queries are needed,\n",
"* To retrieve relevant information multiple queries are needed,\n",
"* Search quality is sensitive to phrasing,\n",
"* There are multiple retrievers that could be searched over, and the user input could be reffering to any of them.\n",
"\n",
"Note that different problems will require different solutions. In order to determine what query analysis technique you should use, you will want to understand exactly what the problem with your current retrieval system is. This is best done by looking at failure data points of your current application and identifying common themes. Only once you know what your problems are can you begin to solve them.\n",
"Note that different problems will require different solutions. In order to determine what query analysis technique you should use, you will want to understand exactly what is the problem with your current retrieval system. This is best done by looking at failure data points of your current application and identifying common themes. Only once you know what your problems are can you begin to solve them.\n",
"\n",
"## Quickstart\n",
"\n",
"Head to the [quickstart](/docs/use_cases/query_analysis/quickstart) to see how to use query analysis in a basic end-to-end example. This will cover creating a simple index, showing a failure mode that occur when passing a raw user question to that index, and then an example of how query analysis can help address that issue. There are MANY different query analysis techniques (see below) and this end-to-end example will not show all of them.\n",
"Head to the [quickstart](/docs/use_cases/query_analysis/quickstart) to see how to use query analysis in a basic end-to-end example. This will cover creating a search engine over the content of LangChain YouTube videos, showing a failure mode that occurs when passing a raw user question to that index, and then an example of how query analysis can help address that issue. The quickstart focuses on **query structuring**. Below are additional query analysis techniques that may be relevant based on your data and use case\n",
"\n",
"\n",
"## Techniques\n",
@@ -55,15 +52,28 @@
"\n",
"## How to\n",
"\n",
"* [Add examples to prompt](/docs/use_cases/query_analysis/few_shot): As our query analysis becomes more complex, adding examples to the prompt can meaningfully improve performance."
"* [Add examples to prompt](/docs/use_cases/query_analysis/few_shot): As our query analysis becomes more complex, adding examples to the prompt can meaningfully improve performance.\n",
"* [Deal with High Cardinality Categoricals](/docs/use_cases/query_analysis/high_cardinality): Many structured queries you will create will involve categorical variables. When there are a lot of potential values there, it can be difficult to do this correctly.\n",
"* [Construct Filters](/docs/use_cases/query_analysis/constructing-filters): This guide covers how to go from a Pydantic model to a filters in the query language specific to the vectorstore you are working with\n",
"* [Handle Multiple Queries](/docs/use_cases/query_analysis/multiple_queries): Some query analysis techniques generate multiple queries. This guide handles how to pass them all to the retriever.\n",
"* [Handle No Queries](/docs/use_cases/query_analysis/no_queries): Some query analysis techniques may not generate a query at all. This guide handles how to gracefully handle those situations\n",
"* [Handle Multiple Retrievers](/docs/use_cases/query_analysis/multiple_retrievers): Some query analysis techniques involve routing between multiple retrievers. This guide covers how to handle that gracefully"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4581f8b1",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "poetry-venv-2",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "poetry-venv-2"
"name": "python3"
},
"language_info": {
"codemirror_mode": {
@@ -75,7 +85,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.1"
}
},
"nbformat": 4,

View File

@@ -17,7 +17,7 @@
"source": [
"# Quickstart\n",
"\n",
"This example will show how to use query analysis in a basic end-to-end example. This will cover creating a simple index, showing a failure mode that occur when passing a raw user question to that index, and then an example of how query analysis can help address that issue. There are MANY different query analysis techniques and this end-to-end example will not show all of them.\n",
"This page will show how to use query analysis in a basic end-to-end example. This will cover creating a simple search engine, showing a failure mode that occurs when passing a raw user question to that search, and then an example of how query analysis can help address that issue. There are MANY different query analysis techniques and this end-to-end example will not show all of them.\n",
"\n",
"For the purpose of this example, we will do retrieval over the LangChain YouTube videos."
]
@@ -38,7 +38,7 @@
"metadata": {},
"outputs": [],
"source": [
"# %pip install -qU langchain langchain-community langchain-openai youtube-transcript-api pytube faiss-cpu"
"# %pip install -qU langchain langchain-community langchain-openai youtube-transcript-api pytube chromadb"
]
},
{
@@ -337,7 +337,7 @@
"id": "4790e2db-3c6e-440b-b6e8-ebdd6600fda5",
"metadata": {},
"source": [
"Our first result is from 2024, and not very relevant to the input. Since we're just searching against document contents, there's no way for the results to be filtered on any document attributes.\n",
"Our first result is from 2024 (despite us asking for videos from 2023), and not very relevant to the input. Since we're just searching against document contents, there's no way for the results to be filtered on any document attributes.\n",
"\n",
"This is just one failure mode that can arise. Let's now take a look at how a basic form of query analysis can fix it!"
]
@@ -349,7 +349,7 @@
"source": [
"## Query analysis\n",
"\n",
"To handle these failure modes we'll do some query structuring. This will involve defining a **query schema** that contains some date filters and use a function-calling model to convert a user question into a structured queries. \n",
"We can use query analysis to improve the results of retrieval. This will involve defining a **query schema** that contains some date filters and use a function-calling model to convert a user question into a structured queries. \n",
"\n",
"### Query schema\n",
"In this case we'll have explicit min and max attributes for publication date so that it can be filtered on."
@@ -384,7 +384,7 @@
"source": [
"### Query generation\n",
"\n",
"To convert user questions to structured queries we'll make use of OpenAI's function-calling API. Specifically we'll use the new [ChatModel.with_structured_output()](/docs/guides/structured_output) constructor to handle passing the schema to the model and parsing the output."
"To convert user questions to structured queries we'll make use of OpenAI's tool-calling API. Specifically we'll use the new [ChatModel.with_structured_output()](/docs/guides/structured_output) constructor to handle passing the schema to the model and parsing the output."
]
},
{
@@ -482,7 +482,7 @@
"\n",
"Our query analysis looks pretty good; now let's try using our generated queries to actually perform retrieval. \n",
"\n",
"**Note:** in our example, we specified `tool_choice=\"Search\"`. This will force the LLM to call one - and only one - function, meaning that we will always have one optimized query to look up. Note that this is not always the case - see other guides for how to deal with situations when no - or multiple - optmized queries are returned."
"**Note:** in our example, we specified `tool_choice=\"Search\"`. This will force the LLM to call one - and only one - tool, meaning that we will always have one optimized query to look up. Note that this is not always the case - see other guides for how to deal with situations when no - or multiple - optmized queries are returned."
]
},
{
@@ -583,7 +583,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.1"
}
},
"nbformat": 4,

View File

@@ -15,7 +15,7 @@
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# HyDE\n",
"# Hypothetical Document Embeddings\n",
"\n",
"If we're working with a similarity search-based index, like a vector store, then searching on raw questions may not work well because their embeddings may not be very similar to those of the relevant documents. Instead it might help to have the model generate a hypothetical relevant document, and then use that to perform similarity search. This is the key idea behind [Hypothetical Document Embedding, or HyDE](https://arxiv.org/pdf/2212.10496.pdf).\n",
"\n",
@@ -252,9 +252,9 @@
],
"metadata": {
"kernelspec": {
"display_name": "poetry-venv-2",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "poetry-venv-2"
"name": "python3"
},
"language_info": {
"codemirror_mode": {
@@ -266,7 +266,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.1"
}
},
"nbformat": 4,

View File

@@ -15,7 +15,7 @@
"id": "f2195672-0cab-4967-ba8a-c6544635547d",
"metadata": {},
"source": [
"# Step back prompting\n",
"# Step Back Prompting\n",
"\n",
"Sometimes search quality and model generations can be tripped up by the specifics of a question. One way to handle this is to first generate a more abstract, \"step back\" question and to query based on both the original and step back question.\n",
"\n",
@@ -222,9 +222,9 @@
],
"metadata": {
"kernelspec": {
"display_name": "poetry-venv-2",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "poetry-venv-2"
"name": "python3"
},
"language_info": {
"codemirror_mode": {
@@ -236,7 +236,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.1"
}
},
"nbformat": 4,

View File

@@ -1,874 +0,0 @@
{
"cells": [
{
"cell_type": "raw",
"id": "34814bdb-d05b-4dd3-adf1-ca5779882d7e",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 0\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "86fc5bb2-017f-434e-8cd6-53ab214a5604",
"metadata": {},
"source": [
"# Quickstart"
]
},
{
"cell_type": "markdown",
"id": "de913d6d-c57f-4927-82fe-18902a636861",
"metadata": {},
"source": [
"[![](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs/use_cases/question_answering/quickstart.ipynb)"
]
},
{
"cell_type": "markdown",
"id": "5151afed",
"metadata": {},
"source": [
"LangChain has a number of components designed to help build question-answering applications, and RAG applications more generally. To familiarize ourselves with these, we'll build a simple Q&A application over a text data source. Along the way we'll go over a typical Q&A architecture, discuss the relevant LangChain components, and highlight additional resources for more advanced Q&A techniques. We'll also see how LangSmith can help us trace and understand our application. LangSmith will become increasingly helpful as our application grows in complexity."
]
},
{
"cell_type": "markdown",
"id": "2f25cbbd-0938-4e3d-87e4-17a204a03ffb",
"metadata": {},
"source": [
"## Architecture\n",
"We'll create a typical RAG application as outlined in the [Q&A introduction](/docs/use_cases/question_answering/), which has two main components:\n",
"\n",
"**Indexing**: a pipeline for ingesting data from a source and indexing it. *This usually happens offline.*\n",
"\n",
"**Retrieval and generation**: the actual RAG chain, which takes the user query at run time and retrieves the relevant data from the index, then passes that to the model.\n",
"\n",
"The full sequence from raw data to answer will look like:\n",
"\n",
"#### Indexing\n",
"1. **Load**: First we need to load our data. We'll use [DocumentLoaders](/docs/modules/data_connection/document_loaders/) for this.\n",
"2. **Split**: [Text splitters](/docs/modules/data_connection/document_transformers/) break large `Documents` into smaller chunks. This is useful both for indexing data and for passing it in to a model, since large chunks are harder to search over and won't fit in a model's finite context window.\n",
"3. **Store**: We need somewhere to store and index our splits, so that they can later be searched over. This is often done using a [VectorStore](/docs/modules/data_connection/vectorstores/) and [Embeddings](/docs/modules/data_connection/text_embedding/) model.\n",
"\n",
"#### Retrieval and generation\n",
"4. **Retrieve**: Given a user input, relevant splits are retrieved from storage using a [Retriever](/docs/modules/data_connection/retrievers/).\n",
"5. **Generate**: A [ChatModel](/docs/modules/model_io/chat/) / [LLM](/docs/modules/model_io/llms/) produces an answer using a prompt that includes the question and the retrieved data"
]
},
{
"cell_type": "markdown",
"id": "487d8d79-5ee9-4aa4-9fdf-cd5f4303e099",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"### Dependencies\n",
"\n",
"We'll use an OpenAI chat model and embeddings and a Chroma vector store in this walkthrough, but everything shown here works with any [ChatModel](/docs/modules/model_io/chat/) or [LLM](/docs/modules/model_io/llms/), [Embeddings](/docs/modules/data_connection/text_embedding/), and [VectorStore](/docs/modules/data_connection/vectorstores/) or [Retriever](/docs/modules/data_connection/retrievers/). \n",
"\n",
"We'll use the following packages:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "28d272cd-4e31-40aa-bbb4-0be0a1f49a14",
"metadata": {},
"outputs": [],
"source": [
"%pip install --upgrade --quiet langchain langchain-community langchainhub langchain-openai chromadb bs4"
]
},
{
"cell_type": "markdown",
"id": "51ef48de-70b6-4f43-8e0b-ab9b84c9c02a",
"metadata": {},
"source": [
"We need to set environment variable `OPENAI_API_KEY`, which can be done directly or loaded from a `.env` file like so:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "143787ca-d8e6-4dc9-8281-4374f4d71720",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# import dotenv\n",
"\n",
"# dotenv.load_dotenv()"
]
},
{
"cell_type": "markdown",
"id": "1665e740-ce01-4f09-b9ed-516db0bd326f",
"metadata": {},
"source": [
"### LangSmith\n",
"\n",
"Many of the applications you build with LangChain will contain multiple steps with multiple invocations of LLM calls. As these applications get more and more complex, it becomes crucial to be able to inspect what exactly is going on inside your chain or agent. The best way to do this is with [LangSmith](https://smith.langchain.com).\n",
"\n",
"Note that LangSmith is not needed, but it is helpful. If you do want to use LangSmith, after you sign up at the link above, make sure to set your environment variables to start logging traces:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "07411adb-3722-4f65-ab7f-8f6f57663d11",
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "fa6ba684-26cf-4860-904e-a4d51380c134",
"metadata": {},
"source": [
"## Preview\n",
"\n",
"In this guide we'll build a QA app over the [LLM Powered Autonomous Agents](https://lilianweng.github.io/posts/2023-06-23-agent/) blog post by Lilian Weng, which allows us to ask questions about the contents of the post. \n",
"\n",
"We can create a simple indexing pipeline and RAG chain to do this in ~20 lines of code:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "d8a913b1-0eea-442a-8a64-ec73333f104b",
"metadata": {},
"outputs": [],
"source": [
"import bs4\n",
"from langchain import hub\n",
"from langchain_community.document_loaders import WebBaseLoader\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"from langchain_openai import ChatOpenAI, OpenAIEmbeddings\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "820244ae-74b4-4593-b392-822979dd91b8",
"metadata": {},
"outputs": [],
"source": [
"# Load, chunk and index the contents of the blog.\n",
"loader = WebBaseLoader(\n",
" web_paths=(\"https://lilianweng.github.io/posts/2023-06-23-agent/\",),\n",
" bs_kwargs=dict(\n",
" parse_only=bs4.SoupStrainer(\n",
" class_=(\"post-content\", \"post-title\", \"post-header\")\n",
" )\n",
" ),\n",
")\n",
"docs = loader.load()\n",
"\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)\n",
"splits = text_splitter.split_documents(docs)\n",
"vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())\n",
"\n",
"# Retrieve and generate using the relevant snippets of the blog.\n",
"retriever = vectorstore.as_retriever()\n",
"prompt = hub.pull(\"rlm/rag-prompt\")\n",
"llm = ChatOpenAI(model_name=\"gpt-3.5-turbo\", temperature=0)\n",
"\n",
"\n",
"def format_docs(docs):\n",
" return \"\\n\\n\".join(doc.page_content for doc in docs)\n",
"\n",
"\n",
"rag_chain = (\n",
" {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
" | prompt\n",
" | llm\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "0d3b0f36-7b56-49c0-8e40-a1aa9ebcbf24",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It can be done through prompting techniques like Chain of Thought or Tree of Thoughts, or by using task-specific instructions or human inputs. Task decomposition helps agents plan ahead and manage complicated tasks more effectively.'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rag_chain.invoke(\"What is Task Decomposition?\")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "7cb344e0-c423-400c-a079-964c08e07e32",
"metadata": {},
"outputs": [],
"source": [
"# cleanup\n",
"vectorstore.delete_collection()"
]
},
{
"cell_type": "markdown",
"id": "639dc31a-7f16-40f6-ba2a-20e7c2ecfe60",
"metadata": {},
"source": [
":::tip\n",
"\n",
"Check out the [LangSmith trace](https://smith.langchain.com/public/1c6ca97e-445b-4d00-84b4-c7befcbc59fe/r) \n",
"\n",
":::"
]
},
{
"cell_type": "markdown",
"id": "842cf72d-abbc-468e-a2eb-022470347727",
"metadata": {},
"source": [
"## Detailed walkthrough\n",
"\n",
"Let's go through the above code step-by-step to really understand what's going on."
]
},
{
"cell_type": "markdown",
"id": "ba5daed6",
"metadata": {},
"source": [
"## 1. Indexing: Load\n",
"\n",
"We need to first load the blog post contents. We can use [DocumentLoaders](/docs/modules/data_connection/document_loaders/) for this, which are objects that load in data from a source and return a list of [Documents](https://api.python.langchain.com/en/latest/documents/langchain_core.documents.base.Document.html). A `Document` is an object with some `page_content` (str) and `metadata` (dict).\n",
"\n",
"In this case we'll use the [WebBaseLoader](/docs/integrations/document_loaders/web_base), which uses `urllib` to load HTML from web URLs and `BeautifulSoup` to parse it to text. We can customize the HTML -> text parsing by passing in parameters to the `BeautifulSoup` parser via `bs_kwargs` (see [BeautifulSoup docs](https://beautiful-soup-4.readthedocs.io/en/latest/#beautifulsoup)). In this case only HTML tags with class \"post-content\", \"post-title\", or \"post-header\" are relevant, so we'll remove all others."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "cf4d5c72",
"metadata": {},
"outputs": [],
"source": [
"import bs4\n",
"from langchain_community.document_loaders import WebBaseLoader\n",
"\n",
"# Only keep post title, headers, and content from the full HTML.\n",
"bs4_strainer = bs4.SoupStrainer(class_=(\"post-title\", \"post-header\", \"post-content\"))\n",
"loader = WebBaseLoader(\n",
" web_paths=(\"https://lilianweng.github.io/posts/2023-06-23-agent/\",),\n",
" bs_kwargs={\"parse_only\": bs4_strainer},\n",
")\n",
"docs = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "207f87a3-effa-4457-b013-6d233bc7a088",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"42824"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(docs[0].page_content)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "52469796-5ce4-4c12-bd2a-a903872dac33",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
" LLM Powered Autonomous Agents\n",
" \n",
"Date: June 23, 2023 | Estimated Reading Time: 31 min | Author: Lilian Weng\n",
"\n",
"\n",
"Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\n",
"Agent System Overview#\n",
"In\n"
]
}
],
"source": [
"print(docs[0].page_content[:500])"
]
},
{
"cell_type": "markdown",
"id": "ee5c6556-56be-4067-adbc-98b5aa19ef6e",
"metadata": {},
"source": [
"### Go deeper\n",
"`DocumentLoader`: Object that loads data from a source as list of `Documents`.\n",
"- [Docs](/docs/modules/data_connection/document_loaders/): Detailed documentation on how to use `DocumentLoaders`.\n",
"- [Integrations](/docs/integrations/document_loaders/): 160+ integrations to choose from.\n",
"- [Interface](https://api.python.langchain.com/en/latest/document_loaders/langchain_community.document_loaders.base.BaseLoader.html): API reference  for the base interface."
]
},
{
"cell_type": "markdown",
"id": "fd2cc9a7",
"metadata": {},
"source": [
"## 2. Indexing: Split\n",
"\n",
"Our loaded document is over 42k characters long. This is too long to fit in the context window of many models. Even for those models that could fit the full post in their context window, models can struggle to find information in very long inputs. \n",
"\n",
"To handle this we'll split the `Document` into chunks for embedding and vector storage. This should help us retrieve only the most relevant bits of the blog post at run time.\n",
"\n",
"In this case we'll split our documents into chunks of 1000 characters with 200 characters of overlap between chunks. The overlap helps mitigate the possibility of separating a statement from important context related to it. We use the [RecursiveCharacterTextSplitter](/docs/modules/data_connection/document_transformers/recursive_text_splitter), which will recursively split the document using common separators like new lines until each chunk is the appropriate size. This is the recommended text splitter for generic text use cases.\n",
"\n",
"We set `add_start_index=True` so that the character index at which each split Document starts within the initial Document is preserved as metadata attribute \"start_index\"."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "4b11c01d",
"metadata": {},
"outputs": [],
"source": [
"from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
"\n",
"text_splitter = RecursiveCharacterTextSplitter(\n",
" chunk_size=1000, chunk_overlap=200, add_start_index=True\n",
")\n",
"all_splits = text_splitter.split_documents(docs)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "3741eb67-9caf-40f2-a001-62f49349bff5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"66"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(all_splits)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "f868d0e5-5670-4d54-b562-f50265e907f4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"969"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(all_splits[0].page_content)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "5c9e5f27-c8e3-4ca7-8a8e-45c5de2901cc",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/',\n",
" 'start_index': 7056}"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"all_splits[10].metadata"
]
},
{
"cell_type": "markdown",
"id": "0a33bd4d",
"metadata": {},
"source": [
"### Go deeper\n",
"\n",
"`TextSplitter`: Object that splits a list of `Document`s into smaller chunks. Subclass of `DocumentTransformer`s.\n",
"- Explore `Context-aware splitters`, which keep the location (\"context\") of each split in the original `Document`:\n",
" - [Markdown files](/docs/modules/data_connection/document_transformers/markdown_header_metadata)\n",
" - [Code (py or js)](/docs/integrations/document_loaders/source_code)\n",
" - [Scientific papers](/docs/integrations/document_loaders/grobid)\n",
"- [Interface](https://api.python.langchain.com/en/latest/text_splitter/langchain_text_splitters.TextSplitter.html): API reference for the base interface.\n",
"\n",
"`DocumentTransformer`: Object that performs a transformation on a list of `Document`s.\n",
"- [Docs](/docs/modules/data_connection/document_transformers/): Detailed documentation on how to use `DocumentTransformers`\n",
"- [Integrations](/docs/integrations/document_transformers/)\n",
"- [Interface](https://api.python.langchain.com/en/latest/documents/langchain_core.documents.transformers.BaseDocumentTransformer.html): API reference for the base interface.\n"
]
},
{
"cell_type": "markdown",
"id": "46547031-2352-4321-9970-d6ea27285c2e",
"metadata": {},
"source": [
"## 3. Indexing: Store\n",
"\n",
"Now we need to index our 66 text chunks so that we can search over them at runtime. The most common way to do this is to embed the contents of each document split and insert these embeddings into a vector database (or vector store). When we want to search over our splits, we take a text search query, embed it, and perform some sort of \"similarity\" search to identify the stored splits with the most similar embeddings to our query embedding. The simplest similarity measure is cosine similarity — we measure the cosine of the angle between each pair of embeddings (which are high dimensional vectors).\n",
"\n",
"We can embed and store all of our document splits in a single command using the [Chroma](/docs/integrations/vectorstores/chroma) vector store and [OpenAIEmbeddings](/docs/integrations/text_embedding/openai) model."
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "e9c302c8",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())"
]
},
{
"cell_type": "markdown",
"id": "dc6f22b0",
"metadata": {},
"source": [
"### Go deeper\n",
"`Embeddings`: Wrapper around a text embedding model, used for converting text to embeddings.\n",
"- [Docs](/docs/modules/data_connection/text_embedding): Detailed documentation on how to use embeddings.\n",
"- [Integrations](/docs/integrations/text_embedding/): 30+ integrations to choose from.\n",
"- [Interface](https://api.python.langchain.com/en/latest/embeddings/langchain_core.embeddings.Embeddings.html): API reference for the base interface.\n",
"\n",
"`VectorStore`: Wrapper around a vector database, used for storing and querying embeddings.\n",
"- [Docs](/docs/modules/data_connection/vectorstores/): Detailed documentation on how to use vector stores.\n",
"- [Integrations](/docs/integrations/vectorstores/): 40+ integrations to choose from.\n",
"- [Interface](https://api.python.langchain.com/en/latest/vectorstores/langchain_core.vectorstores.VectorStore.html): API reference for the base interface.\n",
"\n",
"This completes the **Indexing** portion of the pipeline. At this point we have a query-able vector store containing the chunked contents of our blog post. Given a user question, we should ideally be able to return the snippets of the blog post that answer the question."
]
},
{
"cell_type": "markdown",
"id": "70d64d40-e475-43d9-b64c-925922bb5ef7",
"metadata": {},
"source": [
"## 4. Retrieval and Generation: Retrieve\n",
"\n",
"Now let's write the actual application logic. We want to create a simple application that takes a user question, searches for documents relevant to that question, passes the retrieved documents and initial question to a model, and returns an answer.\n",
"\n",
"First we need to define our logic for searching over documents. LangChain defines a [Retriever](/docs/modules/data_connection/retrievers/) interface which wraps an index that can return relevant `Documents` given a string query.\n",
"\n",
"The most common type of `Retriever` is the [VectorStoreRetriever](/docs/modules/data_connection/retrievers/vectorstore), which uses the similarity search capabilities of a vector store to facilitate retrieval. Any `VectorStore` can easily be turned into a `Retriever` with `VectorStore.as_retriever()`:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "4414df0d-5d43-46d0-85a9-5f47be0dd099",
"metadata": {},
"outputs": [],
"source": [
"retriever = vectorstore.as_retriever(search_type=\"similarity\", search_kwargs={\"k\": 6})"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "e2c26b7d",
"metadata": {},
"outputs": [],
"source": [
"retrieved_docs = retriever.invoke(\"What are the approaches to Task Decomposition?\")"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "8684291d-0f5e-453a-8d3e-ff9feea765d0",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"6"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(retrieved_docs)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "9a5dc074-816d-409a-b005-ab4eddfd76af",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.\n",
"Task decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\n1.\", \"What are the subgoals for achieving XYZ?\", (2) by using task-specific instructions; e.g. \"Write a story outline.\" for writing a novel, or (3) with human inputs.\n"
]
}
],
"source": [
"print(retrieved_docs[0].page_content)"
]
},
{
"cell_type": "markdown",
"id": "5d5a113b",
"metadata": {},
"source": [
"### Go deeper\n",
"Vector stores are commonly used for retrieval, but there are other ways to do retrieval, too.\n",
"\n",
"`Retriever`: An object that returns `Document`s given a text query\n",
"\n",
"- [Docs](/docs/modules/data_connection/retrievers/): Further documentation on the interface and built-in retrieval techniques. Some of which include:\n",
" - `MultiQueryRetriever` [generates variants of the input question](/docs/modules/data_connection/retrievers/MultiQueryRetriever) to improve retrieval hit rate.\n",
" - `MultiVectorRetriever` (diagram below) instead generates [variants of the embeddings](/docs/modules/data_connection/retrievers/multi_vector), also in order to improve retrieval hit rate.\n",
" - `Max marginal relevance` selects for [relevance and diversity](https://www.cs.cmu.edu/~jgc/publication/The_Use_MMR_Diversity_Based_LTMIR_1998.pdf) among the retrieved documents to avoid passing in duplicate context.\n",
" - Documents can be filtered during vector store retrieval using metadata filters, such as with a [Self Query Retriever](/docs/modules/data_connection/retrievers/self_query).\n",
"- [Integrations](/docs/integrations/retrievers/): Integrations with retrieval services.\n",
"- [Interface](https://api.python.langchain.com/en/latest/retrievers/langchain_core.retrievers.BaseRetriever.html): API reference for the base interface."
]
},
{
"cell_type": "markdown",
"id": "415d6824",
"metadata": {},
"source": [
"## 5. Retrieval and Generation: Generate\n",
"\n",
"Let's put it all together into a chain that takes a question, retrieves relevant documents, constructs a prompt, passes that to a model, and parses the output.\n",
"\n",
"We'll use the gpt-3.5-turbo OpenAI chat model, but any LangChain `LLM` or `ChatModel` could be substituted in."
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "d34d998c-9abf-4e01-a4ad-06dadfcf131c",
"metadata": {},
"outputs": [],
"source": [
"from langchain_openai import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(model_name=\"gpt-3.5-turbo\", temperature=0)"
]
},
{
"cell_type": "markdown",
"id": "bc826723-36fc-45d1-a3ef-df8c2c8471a8",
"metadata": {},
"source": [
"We'll use a prompt for RAG that is checked into the LangChain prompt hub ([here](https://smith.langchain.com/hub/rlm/rag-prompt))."
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "bede955b-9aeb-4fd3-964d-8e43f214ce70",
"metadata": {},
"outputs": [],
"source": [
"from langchain import hub\n",
"\n",
"prompt = hub.pull(\"rlm/rag-prompt\")"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "11c35354-f275-47ec-9f72-ebd5c23731eb",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[HumanMessage(content=\"You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\\nQuestion: filler question \\nContext: filler context \\nAnswer:\")]"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"example_messages = prompt.invoke(\n",
" {\"context\": \"filler context\", \"question\": \"filler question\"}\n",
").to_messages()\n",
"example_messages"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "2ccc50fa-5fa2-4f80-8685-58ec2255523a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\n",
"Question: filler question \n",
"Context: filler context \n",
"Answer:\n"
]
}
],
"source": [
"print(example_messages[0].content)"
]
},
{
"cell_type": "markdown",
"id": "51f9a210-1eee-4054-99d7-9d9ddf7e3593",
"metadata": {},
"source": [
"We'll use the [LCEL Runnable](/docs/expression_language/) protocol to define the chain, allowing us to \n",
"- pipe together components and functions in a transparent way\n",
"- automatically trace our chain in LangSmith\n",
"- get streaming, async, and batched calling out of the box"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "99fa1aec",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"\n",
"\n",
"def format_docs(docs):\n",
" return \"\\n\\n\".join(doc.page_content for doc in docs)\n",
"\n",
"\n",
"rag_chain = (\n",
" {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
" | prompt\n",
" | llm\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "8655a152-d7cf-466f-b1bc-fbff9ae2b889",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It involves transforming big tasks into multiple manageable tasks, allowing for easier interpretation and execution by autonomous agents or models. Task decomposition can be done through various methods, such as using prompting techniques, task-specific instructions, or human inputs."
]
}
],
"source": [
"for chunk in rag_chain.stream(\"What is Task Decomposition?\"):\n",
" print(chunk, end=\"\", flush=True)"
]
},
{
"cell_type": "markdown",
"id": "2c000e5f-2b7f-4eb9-8876-9f4b186b4a08",
"metadata": {},
"source": [
":::tip\n",
"\n",
"Check out the [LangSmith trace](https://smith.langchain.com/public/1799e8db-8a6d-4eb2-84d5-46e8d7d5a99b/r) \n",
"\n",
":::"
]
},
{
"cell_type": "markdown",
"id": "f7d52c84",
"metadata": {},
"source": [
"### Go deeper\n",
"\n",
"#### Choosing a model\n",
"`ChatModel`: An LLM-backed chat model. Takes in a sequence of messages and returns a message.\n",
"- [Docs](/docs/modules/model_io/chat/): Detailed documentation on \n",
"- [Integrations](/docs/integrations/chat/): 25+ integrations to choose from.\n",
"- [Interface](https://api.python.langchain.com/en/latest/language_models/langchain_core.language_models.chat_models.BaseChatModel.html): API reference for the base interface.\n",
"\n",
"`LLM`: A text-in-text-out LLM. Takes in a string and returns a string.\n",
"- [Docs](/docs/modules/model_io/llms)\n",
"- [Integrations](/docs/integrations/llms): 75+ integrations to choose from.\n",
"- [Interface](https://api.python.langchain.com/en/latest/language_models/langchain_core.language_models.llms.BaseLLM.html): API reference for the base interface.\n",
"\n",
"See a guide on RAG with locally-running models [here](/docs/use_cases/question_answering/local_retrieval_qa)."
]
},
{
"cell_type": "markdown",
"id": "fa82f437",
"metadata": {},
"source": [
"#### Customizing the prompt\n",
"\n",
"As shown above, we can load prompts (e.g., [this RAG prompt](https://smith.langchain.com/hub/rlm/rag-prompt)) from the prompt hub. The prompt can also be easily customized:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "e4fee704",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It involves transforming big tasks into multiple manageable tasks, allowing for a more systematic and organized approach to problem-solving. Thanks for asking!'"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_core.prompts import PromptTemplate\n",
"\n",
"template = \"\"\"Use the following pieces of context to answer the question at the end.\n",
"If you don't know the answer, just say that you don't know, don't try to make up an answer.\n",
"Use three sentences maximum and keep the answer as concise as possible.\n",
"Always say \"thanks for asking!\" at the end of the answer.\n",
"\n",
"{context}\n",
"\n",
"Question: {question}\n",
"\n",
"Helpful Answer:\"\"\"\n",
"custom_rag_prompt = PromptTemplate.from_template(template)\n",
"\n",
"rag_chain = (\n",
" {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
" | custom_rag_prompt\n",
" | llm\n",
" | StrOutputParser()\n",
")\n",
"\n",
"rag_chain.invoke(\"What is Task Decomposition?\")"
]
},
{
"cell_type": "markdown",
"id": "94b952e6-dc4b-415b-9cf3-1ad333e48366",
"metadata": {},
"source": [
":::tip\n",
"\n",
"Check out the [LangSmith trace](https://smith.langchain.com/public/da23c4d8-3b33-47fd-84df-a3a582eedf84/r) \n",
"\n",
":::"
]
},
{
"cell_type": "markdown",
"id": "580e18de-132d-4009-ba67-4aaf2c7717a2",
"metadata": {},
"source": [
"## Next steps\n",
"\n",
"That's a lot of content we've covered in a short amount of time. There's plenty of features, integrations, and extensions to explore in each of the above sections. Along from the **Go deeper** sources mentioned above, good next steps include:\n",
"\n",
"- [Return sources](/docs/use_cases/question_answering/sources): Learn how to return source documents\n",
"- [Streaming](/docs/use_cases/question_answering/streaming): Learn how to stream outputs and intermediate steps\n",
"- [Add chat history](/docs/use_cases/question_answering/chat_history): Learn how to add chat history to your app"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,644 @@
---
sidebar_position: 0
title: Quickstart
---
# Quickstart
[![](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs/use_cases/question_answering/quickstart.ipynb)
LangChain has a number of components designed to help build
question-answering applications, and RAG applications more generally. To
familiarize ourselves with these, well build a simple Q&A application
over a text data source. Along the way well go over a typical Q&A
architecture, discuss the relevant LangChain components, and highlight
additional resources for more advanced Q&A techniques. Well also see
how LangSmith can help us trace and understand our application.
LangSmith will become increasingly helpful as our application grows in
complexity.
## Architecture
We'll create a typical RAG application as outlined in the [Q&A
introduction](../../../docs/use_cases/question_answering/), which has
two main components:
**Indexing**: a pipeline for ingesting data from a source and indexing
it. *This usually happens offline.*
**Retrieval and generation**: the actual RAG chain, which takes the user
query at run time and retrieves the relevant data from the index, then
passes that to the model.
The full sequence from raw data to answer will look like:
#### Indexing
1. **Load**: First we need to load our data. We'll use
[DocumentLoaders](../../../docs/modules/data_connection/document_loaders/)
for this.
2. **Split**: [Text
splitters](../../../docs/modules/data_connection/document_transformers/)
break large `Documents` into smaller chunks. This is useful both for
indexing data and for passing it in to a model, since large chunks
are harder to search over and won't fit in a model's finite context
window.
3. **Store**: We need somewhere to store and index our splits, so that
they can later be searched over. This is often done using a
[VectorStore](../../../docs/modules/data_connection/vectorstores/)
and
[Embeddings](../../../docs/modules/data_connection/text_embedding/)
model.
#### Retrieval and generation
1. **Retrieve**: Given a user input, relevant splits are retrieved from
storage using a
[Retriever](../../../docs/modules/data_connection/retrievers/).
2. **Generate**: A [ChatModel](../../../docs/modules/model_io/chat/) /
[LLM](../../../docs/modules/model_io/llms/) produces an answer using
a prompt that includes the question and the retrieved data
## Setup
### Dependencies
We'll use an OpenAI chat model and embeddings and a Chroma vector store
in this walkthrough, but everything shown here works with any
[ChatModel](../../../docs/modules/model_io/chat/) or
[LLM](../../../docs/modules/model_io/llms/),
[Embeddings](../../../docs/modules/data_connection/text_embedding/), and
[VectorStore](../../../docs/modules/data_connection/vectorstores/) or
[Retriever](../../../docs/modules/data_connection/retrievers/).
We'll use the following packages:
```python
%pip install --upgrade --quiet langchain langchain-community langchainhub langchain-openai chromadb bs4
```
We need to set environment variable `OPENAI_API_KEY`, which can be done
directly or loaded from a `.env` file like so:
```python
import getpass
import os
os.environ["OPENAI_API_KEY"] = getpass.getpass()
# import dotenv
# dotenv.load_dotenv()
```
### LangSmith
Many of the applications you build with LangChain will contain multiple
steps with multiple invocations of LLM calls. As these applications get
more and more complex, it becomes crucial to be able to inspect what
exactly is going on inside your chain or agent. The best way to do this
is with [LangSmith](https://smith.langchain.com).
Note that LangSmith is not needed, but it is helpful. If you do want to
use LangSmith, after you sign up at the link above, make sure to set
your environment variables to start logging traces:
```python
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()
```
## Preview
In this guide we'll build a QA app over the [LLM Powered Autonomous
Agents](https://lilianweng.github.io/posts/2023-06-23-agent/) blog post
by Lilian Weng, which allows us to ask questions about the contents of
the post.
We can create a simple indexing pipeline and RAG chain to do this in ~20
lines of code:
```python
import bs4
from langchain import hub
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
```
```python
# Load, chunk and index the contents of the blog.
loader = WebBaseLoader(
web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
bs_kwargs=dict(
parse_only=bs4.SoupStrainer(
class_=("post-content", "post-title", "post-header")
)
),
)
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
# Retrieve and generate using the relevant snippets of the blog.
retriever = vectorstore.as_retriever()
prompt = hub.pull("rlm/rag-prompt")
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
def format_docs(docs):
return "\n\n".join(doc.page_content for doc in docs)
rag_chain = (
{"context": retriever | format_docs, "question": RunnablePassthrough()}
| prompt
| llm
| StrOutputParser()
)
```
```python
rag_chain.invoke("What is Task Decomposition?")
```
``` text
'Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It can be done through prompting techniques like Chain of Thought or Tree of Thoughts, or by using task-specific instructions or human inputs. Task decomposition helps agents plan ahead and manage complicated tasks more effectively.'
```
```python
# cleanup
vectorstore.delete_collection()
```
Check out the [LangSmith
trace](https://smith.langchain.com/public/1c6ca97e-445b-4d00-84b4-c7befcbc59fe/r)
## Detailed walkthrough
Let's go through the above code step-by-step to really understand what's
going on.
## 1. Indexing: Load {#indexing-load}
We need to first load the blog post contents. We can use
[DocumentLoaders](../../../docs/modules/data_connection/document_loaders/)
for this, which are objects that load in data from a source and return a
list of
[Documents](https://api.python.langchain.com/en/latest/documents/langchain_core.documents.base.Document.html).
A `Document` is an object with some `page_content` (str) and `metadata`
(dict).
In this case we'll use the
[WebBaseLoader](../../../docs/integrations/document_loaders/web_base),
which uses `urllib` to load HTML from web URLs and `BeautifulSoup` to
parse it to text. We can customize the HTML -> text parsing by passing
in parameters to the `BeautifulSoup` parser via `bs_kwargs` (see
[BeautifulSoup
docs](https://beautiful-soup-4.readthedocs.io/en/latest/#beautifulsoup)).
In this case only HTML tags with class “post-content”, “post-title”, or
“post-header” are relevant, so we'll remove all others.
```python
import bs4
from langchain_community.document_loaders import WebBaseLoader
# Only keep post title, headers, and content from the full HTML.
bs4_strainer = bs4.SoupStrainer(class_=("post-title", "post-header", "post-content"))
loader = WebBaseLoader(
web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
bs_kwargs={"parse_only": bs4_strainer},
)
docs = loader.load()
```
```python
len(docs[0].page_content)
```
``` text
42824
```
```python
print(docs[0].page_content[:500])
```
``` text
LLM Powered Autonomous Agents
Date: June 23, 2023 | Estimated Reading Time: 31 min | Author: Lilian Weng
Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.
Agent System Overview#
In
```
### Go deeper
`DocumentLoader`: Object that loads data from a source as a list of `Documents`.
- [Docs](../../../docs/modules/data_connection/document_loaders/): Detailed documentation on how to use `DocumentLoaders`.
- [Integrations](../../../docs/integrations/document_loaders/): 160+ integrations to choose from.
- [Interface](https://api.python.langchain.com/en/latest/document_loaders/langchain_community.document_loaders.base.BaseLoader.html): API reference for the base interface.
## 2. Indexing: Split {#indexing-split}
Our loaded document is over 42k characters long. This is too long to fit
in the context window of many models. Even for those models that could
fit the full post in their context window, models can struggle to find
information in very long inputs.
To handle this we'll split the `Document` into chunks for embedding and
vector storage. This should help us retrieve only the most relevant bits
of the blog post at run time.
In this case we'll split our documents into chunks of 1000 characters
with 200 characters of overlap between chunks. The overlap helps
mitigate the possibility of separating a statement from important
context related to it. We use the
[RecursiveCharacterTextSplitter](../../../docs/modules/data_connection/document_transformers/recursive_text_splitter),
which will recursively split the document using common separators like
new lines until each chunk is the appropriate size. This is the
recommended text splitter for generic text use cases.
We set `add_start_index=True` so that the character index at which each
split Document starts within the initial Document is preserved as
metadata attribute “start_index”.
```python
from langchain_text_splitters import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(
chunk_size=1000, chunk_overlap=200, add_start_index=True
)
all_splits = text_splitter.split_documents(docs)
```
```python
len(all_splits)
```
``` text
66
```
```python
len(all_splits[0].page_content)
```
``` text
969
```
```python
all_splits[10].metadata
```
``` text
{'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/',
'start_index': 7056}
```
### Go deeper
`TextSplitter`: Object that splits a list of `Document`s into smaller
chunks. Subclass of `DocumentTransformer`s.
- Explore `Context-aware splitters`, which keep the location (“context”) of each split in the original `Document`:
  - [Markdown files](../../../docs/modules/data_connection/document_transformers/markdown_header_metadata)
  - [Code (py or js)](../../../docs/integrations/document_loaders/source_code)
  - [Scientific papers](../../../docs/integrations/document_loaders/grobid)
- [Interface](https://api.python.langchain.com/en/latest/text_splitter/langchain_text_splitters.TextSplitter.html): API reference for the base interface.

`DocumentTransformer`: Object that performs a transformation on a list
of `Document`s.
- [Docs](../../../docs/modules/data_connection/document_transformers/): Detailed documentation on how to use `DocumentTransformers`.
- [Integrations](../../../docs/integrations/document_transformers/)
- [Interface](https://api.python.langchain.com/en/latest/documents/langchain_core.documents.transformers.BaseDocumentTransformer.html): API reference for the base interface.
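As an illustration of the context-aware splitters mentioned above, here is a minimal sketch using `MarkdownHeaderTextSplitter` on a made-up markdown string; each split records the headers it appeared under as metadata:
```python
from langchain_text_splitters import MarkdownHeaderTextSplitter

md = "# Title\n\nIntro text.\n\n## Section A\n\nDetails about section A."
md_splitter = MarkdownHeaderTextSplitter(
    headers_to_split_on=[("#", "Header 1"), ("##", "Header 2")]
)
for split in md_splitter.split_text(md):
    # e.g. {'Header 1': 'Title', 'Header 2': 'Section A'}
    print(split.metadata, "->", split.page_content)
```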
## 3. Indexing: Store {#indexing-store}
Now we need to index our 66 text chunks so that we can search over them
at runtime. The most common way to do this is to embed the contents of
each document split and insert these embeddings into a vector database
(or vector store). When we want to search over our splits, we take a
text search query, embed it, and perform some sort of “similarity”
search to identify the stored splits with the most similar embeddings to
our query embedding. The simplest similarity measure is cosine
similarity — we measure the cosine of the angle between each pair of
embeddings (which are high dimensional vectors).
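To make the similarity measure concrete, here is a small sketch that computes cosine similarity for toy 3-dimensional vectors standing in for real embeddings:
```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors standing in for a query embedding and two chunk embeddings.
query = np.array([0.9, 0.1, 0.0])
chunk_a = np.array([0.8, 0.2, 0.1])  # similar direction -> high score
chunk_b = np.array([0.0, 0.1, 0.9])  # different direction -> low score

print(cosine_similarity(query, chunk_a))  # close to 1
print(cosine_similarity(query, chunk_b))  # much smaller
```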
We can embed and store all of our document splits in a single command
using the [Chroma](../../../docs/integrations/vectorstores/chroma)
vector store and
[OpenAIEmbeddings](../../../docs/integrations/text_embedding/openai)
model.
```python
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())
```
### Go deeper
`Embeddings`: Wrapper around a text embedding model, used for converting
text to embeddings.
- [Docs](../../../docs/modules/data_connection/text_embedding): Detailed documentation on how to use embeddings.
- [Integrations](../../../docs/integrations/text_embedding/): 30+ integrations to choose from.
- [Interface](https://api.python.langchain.com/en/latest/embeddings/langchain_core.embeddings.Embeddings.html): API reference for the base interface.

`VectorStore`: Wrapper around a vector database, used for storing and
querying embeddings.
- [Docs](../../../docs/modules/data_connection/vectorstores/): Detailed documentation on how to use vector stores.
- [Integrations](../../../docs/integrations/vectorstores/): 40+ integrations to choose from.
- [Interface](https://api.python.langchain.com/en/latest/vectorstores/langchain_core.vectorstores.VectorStore.html): API reference for the base interface.
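For a feel of the two interfaces on their own, here is a minimal sketch that reuses the `vectorstore` created above; the query string is arbitrary and the printed dimension assumes OpenAI's `text-embedding-ada-002`:
```python
from langchain_openai import OpenAIEmbeddings

embeddings_model = OpenAIEmbeddings()

# Embeddings: text in, vector of floats out.
query_vector = embeddings_model.embed_query("What is Task Decomposition?")
print(len(query_vector))  # 1536 for text-embedding-ada-002

# VectorStore: search by raw text, or by a precomputed embedding vector.
docs = vectorstore.similarity_search("What is Task Decomposition?", k=2)
docs_by_vector = vectorstore.similarity_search_by_vector(query_vector, k=2)
```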
This completes the **Indexing** portion of the pipeline. At this point
we have a query-able vector store containing the chunked contents of our
blog post. Given a user question, we should ideally be able to return
the snippets of the blog post that answer the question.
## 4. Retrieval and Generation: Retrieve {#retrieval-and-generation-retrieve}
Now let's write the actual application logic. We want to create a simple
application that takes a user question, searches for documents relevant
to that question, passes the retrieved documents and initial question to
a model, and returns an answer.
First we need to define our logic for searching over documents.
LangChain defines a
[Retriever](../../../docs/modules/data_connection/retrievers/) interface
which wraps an index that can return relevant `Documents` given a string
query.
The most common type of `Retriever` is the
[VectorStoreRetriever](../../../docs/modules/data_connection/retrievers/vectorstore),
which uses the similarity search capabilities of a vector store to
facilitate retrieval. Any `VectorStore` can easily be turned into a
`Retriever` with `VectorStore.as_retriever()`:
```python
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 6})
```
```python
retrieved_docs = retriever.invoke("What are the approaches to Task Decomposition?")
```
```python
len(retrieved_docs)
```
``` text
6
```
```python
print(retrieved_docs[0].page_content)
```
``` text
Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.
```
### Go deeper
Vector stores are commonly used for retrieval, but there are other ways
to do retrieval, too.
`Retriever`: An object that returns `Document`s given a text query.
- [Docs](../../../docs/modules/data_connection/retrievers/): Further
documentation on the interface and built-in retrieval techniques,
some of which include:
- `MultiQueryRetriever` [generates variants of the input
question](../../../docs/modules/data_connection/retrievers/MultiQueryRetriever)
to improve retrieval hit rate.
- `MultiVectorRetriever` (diagram below) instead generates
[variants of the
embeddings](../../../docs/modules/data_connection/retrievers/multi_vector),
also in order to improve retrieval hit rate.
- `Max marginal relevance` selects for [relevance and
diversity](https://www.cs.cmu.edu/~jgc/publication/The_Use_MMR_Diversity_Based_LTMIR_1998.pdf)
among the retrieved documents to avoid passing in duplicate
context (a minimal sketch follows this list).
- Documents can be filtered during vector store retrieval using
metadata filters, such as with a [Self Query
Retriever](../../../docs/modules/data_connection/retrievers/self_query).
- [Integrations](../../../docs/integrations/retrievers/): Integrations
with retrieval services.
- [Interface](https://api.python.langchain.com/en/latest/retrievers/langchain_core.retrievers.BaseRetriever.html):
API reference for the base interface.
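Here is a minimal sketch of the max marginal relevance option from the list above, reusing the `vectorstore` built earlier; it illustrates the retriever API rather than changing the chain in this guide:
```python
# Ask the vector store retriever to use MMR instead of plain similarity.
mmr_retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 6, "fetch_k": 20, "lambda_mult": 0.5},
)
mmr_docs = mmr_retriever.invoke("What are the approaches to Task Decomposition?")
len(mmr_docs)
```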
## 5. Retrieval and Generation: Generate {#retrieval-and-generation-generate}
Let's put it all together into a chain that takes a question, retrieves
relevant documents, constructs a prompt, passes that to a model, and
parses the output.
We'll use the `gpt-3.5-turbo` OpenAI chat model, but any LangChain `LLM`
or `ChatModel` could be substituted in.
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
<Tabs>
<TabItem value="openai" label="OpenAI" default>
```python
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
```
</TabItem>
<TabItem value="anthropic" label="Anthropic">
```python
%pip install -qU langchain-anthropic
```
```python
from langchain_anthropic import ChatAnthropic
llm = ChatAnthropic(model="claude-2.1", temperature=0, max_tokens=1024)
```
</TabItem>
</Tabs>
We'll use a prompt for RAG that is checked into the LangChain prompt hub
([here](https://smith.langchain.com/hub/rlm/rag-prompt)).
```python
from langchain import hub
prompt = hub.pull("rlm/rag-prompt")
```
```python
example_messages = prompt.invoke(
{"context": "filler context", "question": "filler question"}
).to_messages()
example_messages
```
``` text
[HumanMessage(content="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: filler question \nContext: filler context \nAnswer:")]
```
```python
print(example_messages[0].content)
```
``` text
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: filler question
Context: filler context
Answer:
```
We'll use the [LCEL Runnable](../../../docs/expression_language/)
protocol to define the chain, allowing us to:
- pipe together components and functions in a transparent way
- automatically trace our chain in LangSmith
- get streaming, async, and batched calling out of the box
```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
def format_docs(docs):
return "\n\n".join(doc.page_content for doc in docs)
rag_chain = (
{"context": retriever | format_docs, "question": RunnablePassthrough()}
| prompt
| llm
| StrOutputParser()
)
```
```python
for chunk in rag_chain.stream("What is Task Decomposition?"):
print(chunk, end="", flush=True)
```
``` text
Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It involves transforming big tasks into multiple manageable tasks, allowing for easier interpretation and execution by autonomous agents or models. Task decomposition can be done through various methods, such as using prompting techniques, task-specific instructions, or human inputs.
```
Check out the [LangSmith
trace](https://smith.langchain.com/public/1799e8db-8a6d-4eb2-84d5-46e8d7d5a99b/r)
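Since the chain is an LCEL `Runnable`, the batched and async calling mentioned above also come for free; a quick sketch (the `await` line assumes an async context such as a notebook cell):
```python
# Batched calling: a list of questions in, a list of answers out.
answers = rag_chain.batch(
    ["What is Task Decomposition?", "What is Chain of Thought prompting?"]
)

# Async calling; run inside an async function or a notebook cell.
answer = await rag_chain.ainvoke("What is Task Decomposition?")
```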
### Go deeper
#### Choosing a model
`ChatModel`: An LLM-backed chat model. Takes in a sequence of messages
and returns a message.
- [Docs](../../../docs/modules/model_io/chat/): Detailed documentation on how to use `ChatModel`s.
- [Integrations](../../../docs/integrations/chat/): 25+ integrations to choose from.
- [Interface](https://api.python.langchain.com/en/latest/language_models/langchain_core.language_models.chat_models.BaseChatModel.html): API reference for the base interface.

`LLM`: A text-in-text-out LLM. Takes in a string and returns a string.
- [Docs](../../../docs/modules/model_io/llms)
- [Integrations](../../../docs/integrations/llms): 75+ integrations to choose from.
- [Interface](https://api.python.langchain.com/en/latest/language_models/langchain_core.language_models.llms.BaseLLM.html): API reference for the base interface.
See a guide on RAG with locally-running models
[here](../../../docs/use_cases/question_answering/local_retrieval_qa).
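If you want to experiment with a locally-running model, a minimal sketch swapping in Ollama might look like the following; it assumes an Ollama server is running locally with the `llama2` model pulled:
```python
from langchain_community.llms import Ollama

# Assumes a local Ollama server with the "llama2" model available.
llm = Ollama(model="llama2")
print(llm.invoke("Say hello in one short sentence."))
```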
#### Customizing the prompt
As shown above, we can load prompts (e.g., [this RAG
prompt](https://smith.langchain.com/hub/rlm/rag-prompt)) from the prompt
hub. The prompt can also be easily customized:
```python
from langchain_core.prompts import PromptTemplate
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
Always say "thanks for asking!" at the end of the answer.
{context}
Question: {question}
Helpful Answer:"""
custom_rag_prompt = PromptTemplate.from_template(template)
rag_chain = (
{"context": retriever | format_docs, "question": RunnablePassthrough()}
| custom_rag_prompt
| llm
| StrOutputParser()
)
rag_chain.invoke("What is Task Decomposition?")
```
``` text
'Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It involves transforming big tasks into multiple manageable tasks, allowing for a more systematic and organized approach to problem-solving. Thanks for asking!'
```
Check out the [LangSmith
trace](https://smith.langchain.com/public/da23c4d8-3b33-47fd-84df-a3a582eedf84/r)
## Next steps
That's a lot of content we've covered in a short amount of time. There's
plenty of features, integrations, and extensions to explore in each of
the above sections. Apart from the **Go deeper** sources mentioned
above, good next steps include:
- [Return
sources](../../../docs/use_cases/question_answering/sources): Learn
how to return source documents
- [Streaming](../../../docs/use_cases/question_answering/streaming):
Learn how to stream outputs and intermediate steps
- [Add chat
history](../../../docs/use_cases/question_answering/chat_history):
Learn how to add chat history to your app
View File
@@ -29,6 +29,7 @@ import uuid
import warnings
from abc import ABC
from datetime import timedelta
from enum import Enum
from functools import lru_cache, wraps
from typing import (
TYPE_CHECKING,
@@ -51,6 +52,11 @@ from sqlalchemy.engine import Row
from sqlalchemy.engine.base import Engine
from sqlalchemy.orm import Session
from langchain_community.vectorstores.azure_cosmos_db import (
CosmosDBSimilarityType,
CosmosDBVectorSearchType,
)
try:
from sqlalchemy.orm import declarative_base
except ImportError:
@@ -68,6 +74,7 @@ from langchain_community.utilities.astradb import (
SetupMode,
_AstraDBCollectionEnvironment,
)
from langchain_community.vectorstores import AzureCosmosDBVectorSearch
from langchain_community.vectorstores.redis import Redis as RedisVectorstore
logger = logging.getLogger(__file__)
@@ -1837,3 +1844,194 @@ class AstraDBSemanticCache(BaseCache):
async def aclear(self, **kwargs: Any) -> None:
await self.astra_env.aensure_db_setup()
await self.async_collection.clear()
class AzureCosmosDBSemanticCache(BaseCache):
"""Cache that uses Cosmos DB Mongo vCore vector-store backend"""
DEFAULT_DATABASE_NAME = "CosmosMongoVCoreCacheDB"
DEFAULT_COLLECTION_NAME = "CosmosMongoVCoreCacheColl"
def __init__(
self,
cosmosdb_connection_string: str,
database_name: str,
collection_name: str,
embedding: Embeddings,
*,
cosmosdb_client: Optional[Any] = None,
num_lists: int = 100,
similarity: CosmosDBSimilarityType = CosmosDBSimilarityType.COS,
kind: CosmosDBVectorSearchType = CosmosDBVectorSearchType.VECTOR_IVF,
dimensions: int = 1536,
m: int = 16,
ef_construction: int = 64,
ef_search: int = 40,
score_threshold: Optional[float] = None,
):
"""
Args:
cosmosdb_connection_string: Cosmos DB Mongo vCore connection string
cosmosdb_client: Cosmos DB Mongo vCore client
embedding (Embedding): Embedding provider for semantic encoding and search.
database_name: Database name for the CosmosDBMongoVCoreSemanticCache
collection_name: Collection name for the CosmosDBMongoVCoreSemanticCache
num_lists: This integer is the number of clusters that the
inverted file (IVF) index uses to group the vector data.
We recommend that numLists is set to documentCount/1000
for up to 1 million documents and to sqrt(documentCount)
for more than 1 million documents.
Using a numLists value of 1 is akin to performing
brute-force search, which has limited performance
dimensions: Number of dimensions for vector similarity.
The maximum number of supported dimensions is 2000
similarity: Similarity metric to use with the IVF index.
Possible options are:
- CosmosDBSimilarityType.COS (cosine distance),
- CosmosDBSimilarityType.L2 (Euclidean distance), and
- CosmosDBSimilarityType.IP (inner product).
kind: Type of vector index to create.
Possible options are:
- vector-ivf
- vector-hnsw: available as a preview feature only,
to enable visit https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/preview-features
m: The max number of connections per layer (16 by default, minimum
value is 2, maximum value is 100). Higher m is suitable for datasets
with high dimensionality and/or high accuracy requirements.
ef_construction: the size of the dynamic candidate list for constructing
the graph (64 by default, minimum value is 4, maximum
value is 1000). Higher ef_construction will result in
better index quality and higher accuracy, but it will
also increase the time required to build the index.
ef_construction has to be at least 2 * m
ef_search: The size of the dynamic candidate list for search
(40 by default). A higher value provides better
recall at the cost of speed.
score_threshold: Maximum score used to filter the vector search documents.
"""
self._validate_enum_value(similarity, CosmosDBSimilarityType)
self._validate_enum_value(kind, CosmosDBVectorSearchType)
if not cosmosdb_connection_string:
raise ValueError("CosmosDB connection string cannot be empty.")
self.cosmosdb_connection_string = cosmosdb_connection_string
self.cosmosdb_client = cosmosdb_client
self.embedding = embedding
self.database_name = database_name or self.DEFAULT_DATABASE_NAME
self.collection_name = collection_name or self.DEFAULT_COLLECTION_NAME
self.num_lists = num_lists
self.dimensions = dimensions
self.similarity = similarity
self.kind = kind
self.m = m
self.ef_construction = ef_construction
self.ef_search = ef_search
self.score_threshold = score_threshold
self._cache_dict: Dict[str, AzureCosmosDBVectorSearch] = {}
def _index_name(self, llm_string: str) -> str:
hashed_index = _hash(llm_string)
return f"cache:{hashed_index}"
def _get_llm_cache(self, llm_string: str) -> AzureCosmosDBVectorSearch:
index_name = self._index_name(llm_string)
namespace = self.database_name + "." + self.collection_name
# return vectorstore client for the specific llm string
if index_name in self._cache_dict:
return self._cache_dict[index_name]
# create new vectorstore client for the specific llm string
if self.cosmosdb_client:
collection = self.cosmosdb_client[self.database_name][self.collection_name]
self._cache_dict[index_name] = AzureCosmosDBVectorSearch(
collection=collection,
embedding=self.embedding,
index_name=index_name,
)
else:
self._cache_dict[
index_name
] = AzureCosmosDBVectorSearch.from_connection_string(
connection_string=self.cosmosdb_connection_string,
namespace=namespace,
embedding=self.embedding,
index_name=index_name,
)
# create index for the vectorstore
vectorstore = self._cache_dict[index_name]
if not vectorstore.index_exists():
vectorstore.create_index(
self.num_lists,
self.dimensions,
self.similarity,
self.kind,
self.m,
self.ef_construction,
)
return vectorstore
def lookup(self, prompt: str, llm_string: str) -> Optional[RETURN_VAL_TYPE]:
"""Look up based on prompt and llm_string."""
llm_cache = self._get_llm_cache(llm_string)
generations: List = []
# Read from a Hash
results = llm_cache.similarity_search(
query=prompt,
k=1,
kind=self.kind,
ef_search=self.ef_search,
score_threshold=self.score_threshold,
)
if results:
for document in results:
try:
generations.extend(loads(document.metadata["return_val"]))
except Exception:
logger.warning(
"Retrieving a cache value that could not be deserialized "
"properly. This is likely due to the cache being in an "
"older format. Please recreate your cache to avoid this "
"error."
)
# In a previous life we stored the raw text directly
# in the table, so assume it's in that format.
generations.extend(
_load_generations_from_json(document.metadata["return_val"])
)
return generations if generations else None
def update(self, prompt: str, llm_string: str, return_val: RETURN_VAL_TYPE) -> None:
"""Update cache based on prompt and llm_string."""
for gen in return_val:
if not isinstance(gen, Generation):
raise ValueError(
"CosmosDBMongoVCoreSemanticCache only supports caching of "
f"normal LLM generations, got {type(gen)}"
)
llm_cache = self._get_llm_cache(llm_string)
metadata = {
"llm_string": llm_string,
"prompt": prompt,
"return_val": dumps([g for g in return_val]),
}
llm_cache.add_texts(texts=[prompt], metadatas=[metadata])
def clear(self, **kwargs: Any) -> None:
"""Clear semantic cache for a given llm_string."""
index_name = self._index_name(kwargs["llm_string"])
if index_name in self._cache_dict:
self._cache_dict[index_name].get_collection().delete_many({})
# self._cache_dict[index_name].clear_collection()
@staticmethod
def _validate_enum_value(value: Any, enum_type: Type[Enum]) -> None:
if not isinstance(value, enum_type):
raise ValueError(f"Invalid enum value: {value}. Expected {enum_type}.")
View File
@@ -230,9 +230,9 @@ class ChatPerplexity(BaseChatModel):
)
default_chunk_class = chunk.__class__
chunk = ChatGenerationChunk(message=chunk, generation_info=generation_info)
yield chunk
if run_manager:
run_manager.on_llm_new_token(chunk.text, chunk=chunk)
yield chunk
def _generate(
self,
View File
@@ -309,9 +309,9 @@ class Anthropic(LLM, _AnthropicCommon):
prompt=self._wrap_prompt(prompt), stop_sequences=stop, stream=True, **params
):
chunk = GenerationChunk(text=token.completion)
yield chunk
if run_manager:
run_manager.on_llm_new_token(chunk.text, chunk=chunk)
yield chunk
async def _astream(
self,
@@ -345,9 +345,9 @@ class Anthropic(LLM, _AnthropicCommon):
**params,
):
chunk = GenerationChunk(text=token.completion)
yield chunk
if run_manager:
await run_manager.on_llm_new_token(chunk.text, chunk=chunk)
yield chunk
def get_num_tokens(self, text: str) -> int:
"""Calculate number of tokens."""
View File
@@ -213,9 +213,9 @@ class QianfanLLMEndpoint(LLM):
for res in self.client.do(**params):
if res:
chunk = GenerationChunk(text=res["result"])
yield chunk
if run_manager:
run_manager.on_llm_new_token(chunk.text)
yield chunk
async def _astream(
self,
@@ -228,7 +228,6 @@ class QianfanLLMEndpoint(LLM):
async for res in await self.client.ado(**params):
if res:
chunk = GenerationChunk(text=res["result"])
yield chunk
if run_manager:
await run_manager.on_llm_new_token(chunk.text)
yield chunk
View File
@@ -285,13 +285,13 @@ class Tongyi(BaseLLM):
)
for stream_resp in stream_generate_with_retry(self, prompt=prompt, **params):
chunk = GenerationChunk(**self._generation_from_qwen_resp(stream_resp))
yield chunk
if run_manager:
run_manager.on_llm_new_token(
chunk.text,
chunk=chunk,
verbose=self.verbose,
)
yield chunk
async def _astream(
self,
@@ -307,13 +307,13 @@ class Tongyi(BaseLLM):
self, prompt=prompt, **params
):
chunk = GenerationChunk(**self._generation_from_qwen_resp(stream_resp))
yield chunk
if run_manager:
await run_manager.on_llm_new_token(
chunk.text,
chunk=chunk,
verbose=self.verbose,
)
yield chunk
def _invocation_params(self, stop: Any, **kwargs: Any) -> Dict[str, Any]:
params = {
View File
@@ -382,13 +382,13 @@ class VertexAI(_VertexAICommon, BaseLLM):
**params,
):
chunk = self._response_to_generation(stream_resp)
yield chunk
if run_manager:
run_manager.on_llm_new_token(
chunk.text,
chunk=chunk,
verbose=self.verbose,
)
yield chunk
@deprecated(
View File
@@ -1,6 +1,9 @@
from typing import Any, List
from langchain_core.callbacks import CallbackManagerForRetrieverRun
from langchain_core.callbacks import (
AsyncCallbackManagerForRetrieverRun,
CallbackManagerForRetrieverRun,
)
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever
@@ -21,3 +24,15 @@ class YouRetriever(BaseRetriever, YouSearchAPIWrapper):
**kwargs: Any,
) -> List[Document]:
return self.results(query, run_manager=run_manager.get_child(), **kwargs)
async def _aget_relevant_documents(
self,
query: str,
*,
run_manager: AsyncCallbackManagerForRetrieverRun,
**kwargs: Any,
) -> List[Document]:
results = await self.results_async(
query, run_manager=run_manager.get_child(), **kwargs
)
return results
View File
@@ -782,6 +782,12 @@ def _import_yahoo_finance_news() -> Any:
return YahooFinanceNewsTool
def _import_you_tool() -> Any:
from langchain_community.tools.you.tool import YouSearchTool
return YouSearchTool
def _import_youtube_search() -> Any:
from langchain_community.tools.youtube.search import YouTubeSearchTool
@@ -1055,6 +1061,8 @@ def __getattr__(name: str) -> Any:
return _import_wolfram_alpha_tool()
elif name == "YahooFinanceNewsTool":
return _import_yahoo_finance_news()
elif name == "YouSearchTool":
return _import_you_tool()
elif name == "YouTubeSearchTool":
return _import_youtube_search()
elif name == "ZapierNLAListActions":
@@ -1192,6 +1200,7 @@ __all__ = [
"WolframAlphaQueryRun",
"WriteFileTool",
"YahooFinanceNewsTool",
"YouSearchTool",
"YouTubeSearchTool",
"ZapierNLAListActions",
"ZapierNLARunAction",
View File
@@ -0,0 +1,8 @@
"""You.com API toolkit."""
from langchain_community.tools.you.tool import YouSearchTool
__all__ = [
"YouSearchTool",
]
View File
@@ -0,0 +1,43 @@
from typing import List, Optional, Type
from langchain_core.callbacks import (
AsyncCallbackManagerForToolRun,
CallbackManagerForToolRun,
)
from langchain_core.documents import Document
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.tools import BaseTool
from langchain_community.utilities.you import YouSearchAPIWrapper
class YouInput(BaseModel):
query: str = Field(description="should be a search query")
class YouSearchTool(BaseTool):
"""Tool that searches the you.com API"""
name = "you_search"
description = (
"The YOU APIs make LLMs and search experiences more factual and"
"up to date with realtime web data."
)
args_schema: Type[BaseModel] = YouInput
api_wrapper: YouSearchAPIWrapper = Field(default_factory=YouSearchAPIWrapper)
def _run(
self,
query: str,
run_manager: Optional[CallbackManagerForToolRun] = None,
) -> List[Document]:
"""Use the you.com tool."""
return self.api_wrapper.results(query)
async def _arun(
self,
query: str,
run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
) -> List[Document]:
"""Use the you.com tool asynchronously."""
return await self.api_wrapper.results_async(query)
View File
@@ -2,7 +2,6 @@
In order to set this up, follow instructions at:
"""
import json
from typing import Any, Dict, List, Literal, Optional
import aiohttp
@@ -113,16 +112,16 @@ class YouSearchAPIWrapper(BaseModel):
docs = []
for hit in raw_search_results["hits"]:
n_snippets_per_hit = self.n_snippets_per_hit or len(hit["snippets"])
for snippet in hit["snippets"][:n_snippets_per_hit]:
n_snippets_per_hit = self.n_snippets_per_hit or len(hit.get("snippets"))
for snippet in hit.get("snippets")[:n_snippets_per_hit]:
docs.append(
Document(
page_content=snippet,
metadata={
"url": hit["url"],
"thumbnail_url": hit["thumbnail_url"],
"title": hit["title"],
"description": hit["description"],
"url": hit.get("url"),
"thumbnail_url": hit.get("thumbnail_url"),
"title": hit.get("title"),
"description": hit.get("description"),
},
)
)
@@ -188,43 +187,47 @@ class YouSearchAPIWrapper(BaseModel):
async def raw_results_async(
self,
query: str,
num_web_results: Optional[int] = 5,
safesearch: Optional[str] = "moderate",
country: Optional[str] = "US",
**kwargs: Any,
) -> Dict:
"""Get results from the you.com Search API asynchronously."""
# Function to perform the API call
async def fetch() -> str:
params = {
"query": query,
"num_web_results": num_web_results,
"safesearch": safesearch,
"country": country,
}
async with aiohttp.ClientSession() as session:
async with session.post(f"{YOU_API_URL}/search", json=params) as res:
if res.status == 200:
data = await res.text()
return data
else:
raise Exception(f"Error {res.status}: {res.reason}")
headers = {"X-API-Key": self.ydc_api_key or ""}
params = {
"query": query,
"num_web_results": self.num_web_results,
"safesearch": self.safesearch,
"country": self.country,
**kwargs,
}
params = {k: v for k, v in params.items() if v is not None}
# news endpoint expects `q` instead of `query`
if self.endpoint_type == "news":
params["q"] = params["query"]
del params["query"]
results_json_str = await fetch()
return json.loads(results_json_str)
# @todo deprecate `snippet`, not part of API
if self.endpoint_type == "snippet":
self.endpoint_type = "search"
async with aiohttp.ClientSession() as session:
async with session.get(
url=f"{YOU_API_URL}/{self.endpoint_type}",
params=params,
headers=headers,
) as res:
if res.status == 200:
results = await res.json()
return results
else:
raise Exception(f"Error {res.status}: {res.reason}")
async def results_async(
self,
query: str,
num_web_results: Optional[int] = 5,
safesearch: Optional[str] = "moderate",
country: Optional[str] = "US",
**kwargs: Any,
) -> List[Document]:
results_json = await self.raw_results_async(
query=query,
num_web_results=num_web_results,
safesearch=safesearch,
country=country,
raw_search_results_async = await self.raw_results_async(
query,
**{key: value for key, value in kwargs.items() if value is not None},
)
return self._parse_results(results_json["results"])
return self._parse_results(raw_search_results_async)
View File
@@ -38,6 +38,15 @@ class CosmosDBSimilarityType(str, Enum):
"""Euclidean distance"""
class CosmosDBVectorSearchType(str, Enum):
"""Cosmos DB Vector Search Type as enumerator."""
VECTOR_IVF = "vector-ivf"
"""IVF vector index"""
VECTOR_HNSW = "vector-hnsw"
"""HNSW vector index"""
CosmosDBDocumentType = TypeVar("CosmosDBDocumentType", bound=Dict[str, Any])
logger = logging.getLogger(__name__)
@@ -166,6 +175,9 @@ class AzureCosmosDBVectorSearch(VectorStore):
num_lists: int = 100,
dimensions: int = 1536,
similarity: CosmosDBSimilarityType = CosmosDBSimilarityType.COS,
kind: str = "vector-ivf",
m: int = 16,
ef_construction: int = 64,
) -> dict[str, Any]:
"""Creates an index using the index name specified at
instance construction
@@ -195,6 +207,11 @@ class AzureCosmosDBVectorSearch(VectorStore):
the numLists parameter using the above guidance.
Args:
kind: Type of vector index to create.
Possible options are:
- vector-ivf
- vector-hnsw: available as a preview feature only,
to enable visit https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/preview-features
num_lists: This integer is the number of clusters that the
inverted file (IVF) index uses to group the vector data.
We recommend that numLists is set to documentCount/1000
@@ -210,27 +227,30 @@ class AzureCosmosDBVectorSearch(VectorStore):
- CosmosDBSimilarityType.COS (cosine distance),
- CosmosDBSimilarityType.L2 (Euclidean distance), and
- CosmosDBSimilarityType.IP (inner product).
m: The max number of connections per layer (16 by default, minimum
value is 2, maximum value is 100). Higher m is suitable for datasets
with high dimensionality and/or high accuracy requirements.
ef_construction: the size of the dynamic candidate list for constructing
the graph (64 by default, minimum value is 4, maximum
value is 1000). Higher ef_construction will result in
better index quality and higher accuracy, but it will
also increase the time required to build the index.
ef_construction has to be at least 2 * m
Returns:
An object describing the created index
"""
# prepare the command
create_index_commands = {
"createIndexes": self._collection.name,
"indexes": [
{
"name": self._index_name,
"key": {self._embedding_key: "cosmosSearch"},
"cosmosSearchOptions": {
"kind": "vector-ivf",
"numLists": num_lists,
"similarity": similarity,
"dimensions": dimensions,
},
}
],
}
# check the kind of vector search to be performed
# prepare the command accordingly
create_index_commands = {}
if kind == CosmosDBVectorSearchType.VECTOR_IVF:
create_index_commands = self._get_vector_index_ivf(
kind, num_lists, similarity, dimensions
)
elif kind == CosmosDBVectorSearchType.VECTOR_HNSW:
create_index_commands = self._get_vector_index_hnsw(
kind, m, ef_construction, similarity, dimensions
)
# retrieve the database object
current_database = self._collection.database
@@ -242,6 +262,47 @@ class AzureCosmosDBVectorSearch(VectorStore):
return create_index_responses
def _get_vector_index_ivf(
self, kind: str, num_lists: int, similarity: str, dimensions: int
) -> Dict[str, Any]:
command = {
"createIndexes": self._collection.name,
"indexes": [
{
"name": self._index_name,
"key": {self._embedding_key: "cosmosSearch"},
"cosmosSearchOptions": {
"kind": kind,
"numLists": num_lists,
"similarity": similarity,
"dimensions": dimensions,
},
}
],
}
return command
def _get_vector_index_hnsw(
self, kind: str, m: int, ef_construction: int, similarity: str, dimensions: int
) -> Dict[str, Any]:
command = {
"createIndexes": self._collection.name,
"indexes": [
{
"name": self._index_name,
"key": {self._embedding_key: "cosmosSearch"},
"cosmosSearchOptions": {
"kind": kind,
"m": m,
"efConstruction": ef_construction,
"similarity": similarity,
"dimensions": dimensions,
},
}
],
}
return command
def add_texts(
self,
texts: Iterable[str],
@@ -329,17 +390,60 @@ class AzureCosmosDBVectorSearch(VectorStore):
self._collection.delete_one({"_id": ObjectId(document_id)})
def _similarity_search_with_score(
self, embeddings: List[float], k: int = 4
self,
embeddings: List[float],
k: int = 4,
kind: CosmosDBVectorSearchType = CosmosDBVectorSearchType.VECTOR_IVF,
ef_search: int = 40,
score_threshold: float = 0.0,
) -> List[Tuple[Document, float]]:
"""Returns a list of documents with their scores
Args:
embeddings: The query vector
k: the number of documents to return
kind: Type of vector index to create.
Possible options are:
- vector-ivf
- vector-hnsw: available as a preview feature only,
to enable visit https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/preview-features
ef_search: The size of the dynamic candidate list for search
(40 by default). A higher value provides better
recall at the cost of speed.
score_threshold: (Optional[float], optional): Maximum vector distance
between selected documents and the query vector. Defaults to None.
Only vector-ivf search supports this for now.
Returns:
A list of documents closest to the query vector
"""
pipeline: List[dict[str, Any]] = []
if kind == CosmosDBVectorSearchType.VECTOR_IVF:
pipeline = self._get_pipeline_vector_ivf(embeddings, k)
elif kind == CosmosDBVectorSearchType.VECTOR_HNSW:
pipeline = self._get_pipeline_vector_hnsw(embeddings, k, ef_search)
cursor = self._collection.aggregate(pipeline)
docs = []
for res in cursor:
score = res.pop("similarityScore")
if score < score_threshold:
continue
document_object_field = (
res.pop("document")
if kind == CosmosDBVectorSearchType.VECTOR_IVF
else res
)
text = document_object_field.pop(self._text_key)
docs.append(
(Document(page_content=text, metadata=document_object_field), score)
)
return docs
def _get_pipeline_vector_ivf(
self, embeddings: List[float], k: int = 4
) -> List[dict[str, Any]]:
pipeline: List[dict[str, Any]] = [
{
"$search": {
@@ -358,32 +462,65 @@ class AzureCosmosDBVectorSearch(VectorStore):
}
},
]
return pipeline
cursor = self._collection.aggregate(pipeline)
docs = []
for res in cursor:
score = res.pop("similarityScore")
document_object_field = res.pop("document")
text = document_object_field.pop(self._text_key)
docs.append(
(Document(page_content=text, metadata=document_object_field), score)
)
return docs
def _get_pipeline_vector_hnsw(
self, embeddings: List[float], k: int = 4, ef_search: int = 40
) -> List[dict[str, Any]]:
pipeline: List[dict[str, Any]] = [
{
"$search": {
"cosmosSearch": {
"vector": embeddings,
"path": self._embedding_key,
"k": k,
"efSearch": ef_search,
},
}
},
{
"$project": {
"similarityScore": {"$meta": "searchScore"},
"document": "$$ROOT",
}
},
]
return pipeline
def similarity_search_with_score(
self, query: str, k: int = 4
self,
query: str,
k: int = 4,
kind: CosmosDBVectorSearchType = CosmosDBVectorSearchType.VECTOR_IVF,
ef_search: int = 40,
score_threshold: float = 0.0,
) -> List[Tuple[Document, float]]:
embeddings = self._embedding.embed_query(query)
docs = self._similarity_search_with_score(embeddings=embeddings, k=k)
docs = self._similarity_search_with_score(
embeddings=embeddings,
k=k,
kind=kind,
ef_search=ef_search,
score_threshold=score_threshold,
)
return docs
def similarity_search(
self, query: str, k: int = 4, **kwargs: Any
self,
query: str,
k: int = 4,
kind: CosmosDBVectorSearchType = CosmosDBVectorSearchType.VECTOR_IVF,
ef_search: int = 40,
score_threshold: float = 0.0,
**kwargs: Any,
) -> List[Document]:
docs_and_scores = self.similarity_search_with_score(query, k=k)
docs_and_scores = self.similarity_search_with_score(
query,
k=k,
kind=kind,
ef_search=ef_search,
score_threshold=score_threshold,
)
return [doc for doc, _ in docs_and_scores]
def max_marginal_relevance_search_by_vector(
@@ -392,11 +529,20 @@ class AzureCosmosDBVectorSearch(VectorStore):
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
kind: CosmosDBVectorSearchType = CosmosDBVectorSearchType.VECTOR_IVF,
ef_search: int = 40,
score_threshold: float = 0.0,
**kwargs: Any,
) -> List[Document]:
# Retrieves the docs with similarity scores
# sorted by similarity scores in DESC order
docs = self._similarity_search_with_score(embedding, k=fetch_k)
docs = self._similarity_search_with_score(
embedding,
k=fetch_k,
kind=kind,
ef_search=ef_search,
score_threshold=score_threshold,
)
# Re-ranks the docs using MMR
mmr_doc_indexes = maximal_marginal_relevance(
@@ -414,12 +560,24 @@ class AzureCosmosDBVectorSearch(VectorStore):
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
kind: CosmosDBVectorSearchType = CosmosDBVectorSearchType.VECTOR_IVF,
ef_search: int = 40,
score_threshold: float = 0.0,
**kwargs: Any,
) -> List[Document]:
# compute the embeddings vector from the query string
embeddings = self._embedding.embed_query(query)
docs = self.max_marginal_relevance_search_by_vector(
embeddings, k=k, fetch_k=fetch_k, lambda_mult=lambda_mult
embeddings,
k=k,
fetch_k=fetch_k,
lambda_mult=lambda_mult,
kind=kind,
ef_search=ef_search,
score_threshold=score_threshold,
)
return docs
def get_collection(self) -> Collection[CosmosDBDocumentType]:
return self._collection
View File
@@ -9181,4 +9181,4 @@ extended-testing = ["aiosqlite", "aleph-alpha-client", "anthropic", "arxiv", "as
[metadata]
lock-version = "2.0"
python-versions = ">=3.8.1,<4.0"
content-hash = "7af07f4d9c43d4bc23fe11776bc1afd9874f6c3696bffb063b6453e9862dc4df"
content-hash = "d64381a1891a09e6215818c25ba7ca7b14a8708351695feab9ae53f4485f3b3e"
View File
@@ -1,6 +1,6 @@
[tool.poetry]
name = "langchain-community"
version = "0.0.24"
version = "0.0.25"
description = "Community contributed LangChain integrations."
authors = []
license = "MIT"
@@ -9,7 +9,7 @@ repository = "https://github.com/langchain-ai/langchain"
[tool.poetry.dependencies]
python = ">=3.8.1,<4.0"
langchain-core = ">=0.1.26,<0.2"
langchain-core = "^0.1.28"
SQLAlchemy = ">=1.4,<3"
requests = "^2"
PyYAML = ">=5.3"
View File
@@ -11,6 +11,7 @@ from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores.azure_cosmos_db import (
AzureCosmosDBVectorSearch,
CosmosDBSimilarityType,
CosmosDBVectorSearchType,
)
logging.basicConfig(level=logging.DEBUG)
@@ -21,6 +22,7 @@ model_deployment = os.getenv(
model_name = os.getenv("OPENAI_EMBEDDINGS_MODEL_NAME", "text-embedding-ada-002")
INDEX_NAME = "langchain-test-index"
INDEX_NAME_VECTOR_HNSW = "langchain-test-index-hnsw"
NAMESPACE = "langchain_test_db.langchain_test_collection"
CONNECTION_STRING: str = os.environ.get("MONGODB_VCORE_URI", "")
DB_NAME, COLLECTION_NAME = NAMESPACE.split(".")
@@ -28,6 +30,11 @@ DB_NAME, COLLECTION_NAME = NAMESPACE.split(".")
num_lists = 3
dimensions = 1536
similarity_algorithm = CosmosDBSimilarityType.COS
kind = CosmosDBVectorSearchType.VECTOR_IVF
m = 16
ef_construction = 64
ef_search = 40
score_threshold = 0.1
def prepare_collection() -> Any:
@@ -82,7 +89,7 @@ class TestAzureCosmosDBVectorSearch:
@pytest.fixture(scope="class", autouse=True)
def cosmos_db_url(self) -> Union[str, Generator[str, None, None]]:
"""Return the elasticsearch url."""
"""Return the cosmos db url."""
return "805.555.1212"
def test_from_documents_cosine_distance(
@@ -105,14 +112,23 @@ class TestAzureCosmosDBVectorSearch:
sleep(1) # waits for Cosmos DB to save contents to the collection
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(num_lists, dimensions, similarity_algorithm)
vectorstore.create_index(
num_lists, dimensions, similarity_algorithm, kind, m, ef_construction
)
sleep(2) # waits for the index to be set up
output = vectorstore.similarity_search("Sandwich", k=1)
output = vectorstore.similarity_search(
"Sandwich",
k=1,
kind=kind,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output
assert output[0].page_content == "What is a sandwich?"
assert output[0].metadata["c"] == 1
vectorstore.delete_index()
def test_from_documents_inner_product(
@@ -135,14 +151,23 @@ class TestAzureCosmosDBVectorSearch:
sleep(1) # waits for Cosmos DB to save contents to the collection
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(num_lists, dimensions, CosmosDBSimilarityType.IP)
vectorstore.create_index(
num_lists, dimensions, CosmosDBSimilarityType.IP, kind, m, ef_construction
)
sleep(2) # waits for the index to be set up
output = vectorstore.similarity_search("Sandwich", k=1)
output = vectorstore.similarity_search(
"Sandwich",
k=1,
kind=kind,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output
assert output[0].page_content == "What is a sandwich?"
assert output[0].metadata["c"] == 1
vectorstore.delete_index()
def test_from_texts_cosine_distance(
@@ -162,12 +187,21 @@ class TestAzureCosmosDBVectorSearch:
)
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(num_lists, dimensions, similarity_algorithm)
vectorstore.create_index(
num_lists, dimensions, CosmosDBSimilarityType.IP, kind, m, ef_construction
)
sleep(2) # waits for the index to be set up
output = vectorstore.similarity_search("Sandwich", k=1)
output = vectorstore.similarity_search(
"Sandwich",
k=1,
kind=kind,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output[0].page_content == "What is a sandwich?"
vectorstore.delete_index()
def test_from_texts_with_metadatas_cosine_distance(
@@ -189,10 +223,18 @@ class TestAzureCosmosDBVectorSearch:
)
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(num_lists, dimensions, similarity_algorithm)
vectorstore.create_index(
num_lists, dimensions, similarity_algorithm, kind, m, ef_construction
)
sleep(2) # waits for the index to be set up
output = vectorstore.similarity_search("Sandwich", k=1)
output = vectorstore.similarity_search(
"Sandwich",
k=1,
kind=kind,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output
assert output[0].page_content == "What is a sandwich?"
@@ -219,10 +261,18 @@ class TestAzureCosmosDBVectorSearch:
)
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(num_lists, dimensions, similarity_algorithm)
vectorstore.create_index(
num_lists, dimensions, similarity_algorithm, kind, m, ef_construction
)
sleep(2) # waits for the index to be set up
output = vectorstore.similarity_search("Sandwich", k=1)
output = vectorstore.similarity_search(
"Sandwich",
k=1,
kind=kind,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output
assert output[0].page_content == "What is a sandwich?"
@@ -234,7 +284,13 @@ class TestAzureCosmosDBVectorSearch:
vectorstore.delete_document_by_id(first_document_id)
sleep(2) # waits for the index to be updated
output2 = vectorstore.similarity_search("Sandwich", k=1)
output2 = vectorstore.similarity_search(
"Sandwich",
k=1,
kind=kind,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output2
assert output2[0].page_content != "What is a sandwich?"
@@ -259,25 +315,36 @@ class TestAzureCosmosDBVectorSearch:
)
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(num_lists, dimensions, similarity_algorithm)
vectorstore.create_index(
num_lists, dimensions, similarity_algorithm, kind, m, ef_construction
)
sleep(2) # waits for the index to be set up
output = vectorstore.similarity_search("Sandwich", k=5)
output = vectorstore.similarity_search(
"Sandwich",
k=5,
kind=kind,
ef_search=ef_search,
score_threshold=score_threshold,
)
first_document_id_object = output[0].metadata["_id"]
first_document_id = str(first_document_id_object)
first_document_id = str(output[0].metadata["_id"])
output[1].metadata["_id"]
second_document_id = output[1].metadata["_id"]
second_document_id = str(output[1].metadata["_id"])
output[2].metadata["_id"]
third_document_id = output[2].metadata["_id"]
third_document_id = str(output[2].metadata["_id"])
document_ids = [first_document_id, second_document_id, third_document_id]
vectorstore.delete(document_ids)
sleep(2) # waits for the index to be updated
output_2 = vectorstore.similarity_search("Sandwich", k=5)
output_2 = vectorstore.similarity_search(
"Sandwich",
k=5,
kind=kind,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output
assert output_2
@@ -307,14 +374,23 @@ class TestAzureCosmosDBVectorSearch:
)
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(num_lists, dimensions, CosmosDBSimilarityType.IP)
vectorstore.create_index(
num_lists, dimensions, CosmosDBSimilarityType.IP, kind, m, ef_construction
)
sleep(2) # waits for the index to be set up
output = vectorstore.similarity_search("Sandwich", k=1)
output = vectorstore.similarity_search(
"Sandwich",
k=1,
kind=kind,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output
assert output[0].page_content == "What is a sandwich?"
assert output[0].metadata["c"] == 1
vectorstore.delete_index()
def test_from_texts_with_metadatas_euclidean_distance(
@@ -336,14 +412,23 @@ class TestAzureCosmosDBVectorSearch:
)
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(num_lists, dimensions, CosmosDBSimilarityType.L2)
vectorstore.create_index(
num_lists, dimensions, CosmosDBSimilarityType.L2, kind, m, ef_construction
)
sleep(2) # waits for the index to be set up
output = vectorstore.similarity_search("Sandwich", k=1)
output = vectorstore.similarity_search(
"Sandwich",
k=1,
kind=kind,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output
assert output[0].page_content == "What is a sandwich?"
assert output[0].metadata["c"] == 1
vectorstore.delete_index()
def test_max_marginal_relevance_cosine_distance(
@@ -358,15 +443,20 @@ class TestAzureCosmosDBVectorSearch:
)
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(num_lists, dimensions, CosmosDBSimilarityType.COS)
vectorstore.create_index(
num_lists, dimensions, similarity_algorithm, kind, m, ef_construction
)
sleep(2) # waits for the index to be set up
query = "foo"
output = vectorstore.max_marginal_relevance_search(query, k=10, lambda_mult=0.1)
output = vectorstore.max_marginal_relevance_search(
query, k=10, kind=kind, lambda_mult=0.1, score_threshold=score_threshold
)
assert len(output) == len(texts)
assert output[0].page_content == "foo"
assert output[1].page_content != "foo"
vectorstore.delete_index()
def test_max_marginal_relevance_inner_product(
@@ -381,19 +471,439 @@ class TestAzureCosmosDBVectorSearch:
)
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(num_lists, dimensions, CosmosDBSimilarityType.IP)
vectorstore.create_index(
num_lists, dimensions, CosmosDBSimilarityType.IP, kind, m, ef_construction
)
sleep(2) # waits for the index to be set up
query = "foo"
output = vectorstore.max_marginal_relevance_search(query, k=10, lambda_mult=0.1)
output = vectorstore.max_marginal_relevance_search(
query, k=10, kind=kind, lambda_mult=0.1, score_threshold=score_threshold
)
assert len(output) == len(texts)
assert output[0].page_content == "foo"
assert output[1].page_content != "foo"
vectorstore.delete_index()
def invoke_delete_with_no_args(
"""
Test cases for the similarity algorithm using vector-hnsw
"""
def test_from_documents_cosine_distance_vector_hnsw(
self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
) -> None:
"""Test end to end construction and search."""
documents = [
Document(page_content="Dogs are tough.", metadata={"a": 1}),
Document(page_content="Cats have fluff.", metadata={"b": 1}),
Document(page_content="What is a sandwich?", metadata={"c": 1}),
Document(page_content="That fence is purple.", metadata={"d": 1, "e": 2}),
]
vectorstore = AzureCosmosDBVectorSearch.from_documents(
documents,
azure_openai_embeddings,
collection=collection,
index_name=INDEX_NAME_VECTOR_HNSW,
)
sleep(1) # waits for Cosmos DB to save contents to the collection
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(
num_lists,
dimensions,
similarity_algorithm,
CosmosDBVectorSearchType.VECTOR_HNSW,
m,
ef_construction,
)
sleep(2) # waits for the index to be set up
output = vectorstore.similarity_search(
"Sandwich",
k=1,
kind=CosmosDBVectorSearchType.VECTOR_HNSW,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output
assert output[0].page_content == "What is a sandwich?"
assert output[0].metadata["c"] == 1
vectorstore.delete_index()
def test_from_documents_inner_product_vector_hnsw(
self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
) -> None:
"""Test end to end construction and search."""
documents = [
Document(page_content="Dogs are tough.", metadata={"a": 1}),
Document(page_content="Cats have fluff.", metadata={"b": 1}),
Document(page_content="What is a sandwich?", metadata={"c": 1}),
Document(page_content="That fence is purple.", metadata={"d": 1, "e": 2}),
]
vectorstore = AzureCosmosDBVectorSearch.from_documents(
documents,
azure_openai_embeddings,
collection=collection,
index_name=INDEX_NAME_VECTOR_HNSW,
)
sleep(1) # waits for Cosmos DB to save contents to the collection
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(
num_lists,
dimensions,
similarity_algorithm,
CosmosDBVectorSearchType.VECTOR_HNSW,
m,
ef_construction,
)
sleep(2) # waits for the index to be set up
output = vectorstore.similarity_search(
"Sandwich",
k=1,
kind=CosmosDBVectorSearchType.VECTOR_HNSW,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output
assert output[0].page_content == "What is a sandwich?"
assert output[0].metadata["c"] == 1
vectorstore.delete_index()
def test_from_texts_cosine_distance_vector_hnsw(
self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
) -> None:
texts = [
"Dogs are tough.",
"Cats have fluff.",
"What is a sandwich?",
"That fence is purple.",
]
vectorstore = AzureCosmosDBVectorSearch.from_texts(
texts,
azure_openai_embeddings,
collection=collection,
index_name=INDEX_NAME_VECTOR_HNSW,
)
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(
num_lists,
dimensions,
similarity_algorithm,
CosmosDBVectorSearchType.VECTOR_HNSW,
m,
ef_construction,
)
sleep(2) # waits for the index to be set up
output = vectorstore.similarity_search(
"Sandwich",
k=1,
kind=CosmosDBVectorSearchType.VECTOR_HNSW,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output[0].page_content == "What is a sandwich?"
vectorstore.delete_index()
def test_from_texts_with_metadatas_cosine_distance_vector_hnsw(
self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
) -> None:
texts = [
"Dogs are tough.",
"Cats have fluff.",
"What is a sandwich?",
"The fence is purple.",
]
metadatas = [{"a": 1}, {"b": 1}, {"c": 1}, {"d": 1, "e": 2}]
vectorstore = AzureCosmosDBVectorSearch.from_texts(
texts,
azure_openai_embeddings,
metadatas=metadatas,
collection=collection,
index_name=INDEX_NAME_VECTOR_HNSW,
)
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(
num_lists,
dimensions,
similarity_algorithm,
CosmosDBVectorSearchType.VECTOR_HNSW,
m,
ef_construction,
)
sleep(2) # waits for the index to be set up
output = vectorstore.similarity_search(
"Sandwich",
k=1,
kind=CosmosDBVectorSearchType.VECTOR_HNSW,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output
assert output[0].page_content == "What is a sandwich?"
assert output[0].metadata["c"] == 1
vectorstore.delete_index()
def test_from_texts_with_metadatas_delete_one_vector_hnsw(
self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
) -> None:
texts = [
"Dogs are tough.",
"Cats have fluff.",
"What is a sandwich?",
"The fence is purple.",
]
metadatas = [{"a": 1}, {"b": 1}, {"c": 1}, {"d": 1, "e": 2}]
vectorstore = AzureCosmosDBVectorSearch.from_texts(
texts,
azure_openai_embeddings,
metadatas=metadatas,
collection=collection,
index_name=INDEX_NAME_VECTOR_HNSW,
)
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(
num_lists,
dimensions,
similarity_algorithm,
CosmosDBVectorSearchType.VECTOR_HNSW,
m,
ef_construction,
)
sleep(2) # waits for the index to be set up
output = vectorstore.similarity_search(
"Sandwich",
k=1,
kind=CosmosDBVectorSearchType.VECTOR_HNSW,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output
assert output[0].page_content == "What is a sandwich?"
assert output[0].metadata["c"] == 1
first_document_id_object = output[0].metadata["_id"]
first_document_id = str(first_document_id_object)
vectorstore.delete_document_by_id(first_document_id)
sleep(2) # waits for the index to be updated
output2 = vectorstore.similarity_search(
"Sandwich",
k=1,
kind=CosmosDBVectorSearchType.VECTOR_HNSW,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output2
assert output2[0].page_content != "What is a sandwich?"
vectorstore.delete_index()
def test_from_texts_with_metadatas_delete_multiple_vector_hnsw(
self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
) -> None:
texts = [
"Dogs are tough.",
"Cats have fluff.",
"What is a sandwich?",
"The fence is purple.",
]
metadatas = [{"a": 1}, {"b": 1}, {"c": 1}, {"d": 1, "e": 2}]
vectorstore = AzureCosmosDBVectorSearch.from_texts(
texts,
azure_openai_embeddings,
metadatas=metadatas,
collection=collection,
index_name=INDEX_NAME_VECTOR_HNSW,
)
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(
num_lists,
dimensions,
similarity_algorithm,
CosmosDBVectorSearchType.VECTOR_HNSW,
m,
ef_construction,
)
sleep(2) # waits for the index to be set up
output = vectorstore.similarity_search(
"Sandwich",
k=5,
kind=CosmosDBVectorSearchType.VECTOR_HNSW,
ef_search=ef_search,
score_threshold=score_threshold,
)
first_document_id = str(output[0].metadata["_id"])
second_document_id = str(output[1].metadata["_id"])
third_document_id = str(output[2].metadata["_id"])
document_ids = [first_document_id, second_document_id, third_document_id]
vectorstore.delete(document_ids)
sleep(2) # waits for the index to be updated
output_2 = vectorstore.similarity_search(
"Sandwich",
k=5,
kind=CosmosDBVectorSearchType.VECTOR_HNSW,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output
assert output_2
assert len(output) == 4 # we should see all the four documents
assert (
len(output_2) == 1
) # we should see only one document left after three have been deleted
vectorstore.delete_index()
def test_from_texts_with_metadatas_inner_product_vector_hnsw(
self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
) -> None:
texts = [
"Dogs are tough.",
"Cats have fluff.",
"What is a sandwich?",
"The fence is purple.",
]
metadatas = [{"a": 1}, {"b": 1}, {"c": 1}, {"d": 1, "e": 2}]
vectorstore = AzureCosmosDBVectorSearch.from_texts(
texts,
azure_openai_embeddings,
metadatas=metadatas,
collection=collection,
index_name=INDEX_NAME_VECTOR_HNSW,
)
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(
num_lists,
dimensions,
similarity_algorithm,
CosmosDBVectorSearchType.VECTOR_HNSW,
m,
ef_construction,
)
sleep(2) # waits for the index to be set up
output = vectorstore.similarity_search(
"Sandwich",
k=1,
kind=CosmosDBVectorSearchType.VECTOR_HNSW,
ef_search=ef_search,
score_threshold=score_threshold,
)
assert output
assert output[0].page_content == "What is a sandwich?"
assert output[0].metadata["c"] == 1
vectorstore.delete_index()
def test_max_marginal_relevance_cosine_distance_vector_hnsw(
self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
) -> None:
texts = ["foo", "foo", "fou", "foy"]
vectorstore = AzureCosmosDBVectorSearch.from_texts(
texts,
azure_openai_embeddings,
collection=collection,
index_name=INDEX_NAME_VECTOR_HNSW,
)
# Create the IVF index that will be leveraged later for vector search
vectorstore.create_index(
num_lists,
dimensions,
similarity_algorithm,
CosmosDBVectorSearchType.VECTOR_HNSW,
m,
ef_construction,
)
sleep(2) # waits for the index to be set up
query = "foo"
output = vectorstore.max_marginal_relevance_search(
query,
k=10,
kind=CosmosDBVectorSearchType.VECTOR_HNSW,
lambda_mult=0.1,
score_threshold=score_threshold,
)
assert len(output) == len(texts)
assert output[0].page_content == "foo"
assert output[1].page_content != "foo"
vectorstore.delete_index()
def test_max_marginal_relevance_inner_product_vector_hnsw(
self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
) -> None:
texts = ["foo", "foo", "fou", "foy"]
vectorstore = AzureCosmosDBVectorSearch.from_texts(
texts,
azure_openai_embeddings,
collection=collection,
index_name=INDEX_NAME_VECTOR_HNSW,
)
# Create the HNSW index that will be leveraged later for vector search
vectorstore.create_index(
num_lists,
dimensions,
similarity_algorithm,
CosmosDBVectorSearchType.VECTOR_HNSW,
m,
ef_construction,
)
sleep(2) # waits for the index to be set up
query = "foo"
output = vectorstore.max_marginal_relevance_search(
query,
k=10,
kind=CosmosDBVectorSearchType.VECTOR_HNSW,
lambda_mult=0.1,
score_threshold=score_threshold,
)
assert len(output) == len(texts)
assert output[0].page_content == "foo"
assert output[1].page_content != "foo"
vectorstore.delete_index()
@staticmethod
def invoke_delete_with_no_args(
azure_openai_embeddings: OpenAIEmbeddings, collection: Any
) -> Optional[bool]:
vectorstore: AzureCosmosDBVectorSearch = (
AzureCosmosDBVectorSearch.from_connection_string(
@@ -406,8 +916,9 @@ class TestAzureCosmosDBVectorSearch:
return vectorstore.delete()
@staticmethod
def invoke_delete_by_id_with_no_args(
self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
azure_openai_embeddings: OpenAIEmbeddings, collection: Any
) -> None:
vectorstore: AzureCosmosDBVectorSearch = (
AzureCosmosDBVectorSearch.from_connection_string(
@@ -431,5 +942,7 @@ class TestAzureCosmosDBVectorSearch:
self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
) -> None:
with pytest.raises(Exception) as exception_info:
self.invoke_delete_by_id_with_no_args(azure_openai_embeddings, collection)
self.invoke_delete_by_id_with_no_args(
azure_openai_embeddings=azure_openai_embeddings, collection=collection
)
assert str(exception_info.value) == "No document id provided to delete."
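The tests above exercise the new HNSW index creation and the `delete` / `delete_document_by_id` paths end to end. As a rough sketch of the same flow outside pytest (assuming an existing pymongo `collection`, an `embeddings` instance, and the `num_lists`, `dimensions`, `similarity_algorithm`, `m`, `ef_construction`, `ef_search`, and `score_threshold` values used in these tests):

```python
# Minimal sketch only; `collection`, `embeddings`, and the index parameters are
# assumed to be defined as in the tests above.
from langchain_community.vectorstores.azure_cosmos_db import (
    AzureCosmosDBVectorSearch,
    CosmosDBVectorSearchType,
)

store = AzureCosmosDBVectorSearch.from_texts(
    ["What is a sandwich?", "The fence is purple."],
    embeddings,
    collection=collection,
    index_name="vector-hnsw-index",
)
# Build the HNSW index before querying.
store.create_index(
    num_lists, dimensions, similarity_algorithm,
    CosmosDBVectorSearchType.VECTOR_HNSW, m, ef_construction,
)

hits = store.similarity_search(
    "Sandwich", k=1,
    kind=CosmosDBVectorSearchType.VECTOR_HNSW,
    ef_search=ef_search, score_threshold=score_threshold,
)
# Documents can be removed by their Mongo `_id`, singly or in bulk.
store.delete([str(hits[0].metadata["_id"])])
store.delete_index()
```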

View File

@@ -1,3 +1,6 @@
from unittest.mock import AsyncMock, patch
import pytest
import responses
from langchain_community.retrievers.you import YouRetriever
@@ -70,3 +73,39 @@ class TestYouRetriever:
results = you_wrapper.results(query)
expected_result = NEWS_RESPONSE_PARSED
assert results == expected_result
@pytest.mark.asyncio
async def test_aget_relevant_documents(self) -> None:
instance = YouRetriever(ydc_api_key="test_api_key")
# Mock response object to simulate aiohttp response
mock_response = AsyncMock()
mock_response.__aenter__.return_value = (
mock_response # Make the context manager return itself
)
mock_response.__aexit__.return_value = None # No value needed for exit
mock_response.status = 200
mock_response.json = AsyncMock(return_value=MOCK_RESPONSE_RAW)
# Patch the aiohttp.ClientSession object
with patch("aiohttp.ClientSession.get", return_value=mock_response):
results = await instance.aget_relevant_documents("test query")
assert results == MOCK_PARSED_OUTPUT
@pytest.mark.asyncio
async def test_ainvoke(self) -> None:
instance = YouRetriever(ydc_api_key="test_api_key")
# Mock response object to simulate aiohttp response
mock_response = AsyncMock()
mock_response.__aenter__.return_value = (
mock_response # Make the context manager return itself
)
mock_response.__aexit__.return_value = None # No value needed for exit
mock_response.status = 200
mock_response.json = AsyncMock(return_value=MOCK_RESPONSE_RAW)
# Patch the aiohttp.ClientSession object
with patch("aiohttp.ClientSession.get", return_value=mock_response):
results = await instance.ainvoke("test query")
assert results == MOCK_PARSED_OUTPUT

View File

@@ -122,6 +122,7 @@ EXPECTED_ALL = [
"WolframAlphaQueryRun",
"WriteFileTool",
"YahooFinanceNewsTool",
"YouSearchTool",
"YouTubeSearchTool",
"ZapierNLAListActions",
"ZapierNLARunAction",

View File

@@ -124,6 +124,7 @@ _EXPECTED = [
"WolframAlphaQueryRun",
"WriteFileTool",
"YahooFinanceNewsTool",
"YouSearchTool",
"YouTubeSearchTool",
"ZapierNLAListActions",
"ZapierNLARunAction",

View File

@@ -0,0 +1,87 @@
from unittest.mock import AsyncMock, patch
import pytest
import responses
from langchain_community.tools.you import YouSearchTool
from langchain_community.utilities.you import YouSearchAPIWrapper
from ..utilities.test_you import (
LIMITED_PARSED_OUTPUT,
MOCK_PARSED_OUTPUT,
MOCK_RESPONSE_RAW,
NEWS_RESPONSE_PARSED,
NEWS_RESPONSE_RAW,
TEST_ENDPOINT,
)
class TestYouSearchTool:
@responses.activate
def test_invoke(self) -> None:
responses.add(
responses.GET, f"{TEST_ENDPOINT}/search", json=MOCK_RESPONSE_RAW, status=200
)
query = "Test query text"
you_tool = YouSearchTool(api_wrapper=YouSearchAPIWrapper(ydc_api_key="test"))
results = you_tool.invoke(query)
expected_result = MOCK_PARSED_OUTPUT
assert results == expected_result
@responses.activate
def test_invoke_max_docs(self) -> None:
responses.add(
responses.GET, f"{TEST_ENDPOINT}/search", json=MOCK_RESPONSE_RAW, status=200
)
query = "Test query text"
you_tool = YouSearchTool(
api_wrapper=YouSearchAPIWrapper(ydc_api_key="test", k=2)
)
results = you_tool.invoke(query)
expected_result = [MOCK_PARSED_OUTPUT[0], MOCK_PARSED_OUTPUT[1]]
assert results == expected_result
@responses.activate
def test_invoke_limit_snippets(self) -> None:
responses.add(
responses.GET, f"{TEST_ENDPOINT}/search", json=MOCK_RESPONSE_RAW, status=200
)
query = "Test query text"
you_tool = YouSearchTool(
api_wrapper=YouSearchAPIWrapper(ydc_api_key="test", n_snippets_per_hit=1)
)
results = you_tool.invoke(query)
expected_result = LIMITED_PARSED_OUTPUT
assert results == expected_result
@responses.activate
def test_invoke_news(self) -> None:
responses.add(
responses.GET, f"{TEST_ENDPOINT}/news", json=NEWS_RESPONSE_RAW, status=200
)
query = "Test news text"
you_tool = YouSearchTool(
api_wrapper=YouSearchAPIWrapper(ydc_api_key="test", endpoint_type="news")
)
results = you_tool.invoke(query)
expected_result = NEWS_RESPONSE_PARSED
assert results == expected_result
@pytest.mark.asyncio
async def test_ainvoke(self) -> None:
you_tool = YouSearchTool(api_wrapper=YouSearchAPIWrapper(ydc_api_key="test"))
# Mock response object to simulate aiohttp response
mock_response = AsyncMock()
mock_response.__aenter__.return_value = (
mock_response # Make the context manager return itself
)
mock_response.__aexit__.return_value = None # No value needed for exit
mock_response.status = 200
mock_response.json = AsyncMock(return_value=MOCK_RESPONSE_RAW)
# Patch the aiohttp.ClientSession object
with patch("aiohttp.ClientSession.get", return_value=mock_response):
results = await you_tool.ainvoke("test query")
assert results == MOCK_PARSED_OUTPUT

View File

@@ -1,5 +1,7 @@
from typing import Any, Dict, List, Optional, Union
from unittest.mock import AsyncMock, patch
import pytest
import responses
from langchain_core.documents import Document
@@ -187,4 +189,58 @@ def test_results_news() -> None:
assert raw_results == expected_result
# @todo test async methods
@pytest.mark.asyncio
async def test_raw_results_async() -> None:
instance = YouSearchAPIWrapper(ydc_api_key="test_api_key")
# Mock response object to simulate aiohttp response
mock_response = AsyncMock()
mock_response.__aenter__.return_value = (
mock_response # Make the context manager return itself
)
mock_response.__aexit__.return_value = None # No value needed for exit
mock_response.status = 200
mock_response.json = AsyncMock(return_value=MOCK_RESPONSE_RAW)
# Patch the aiohttp.ClientSession object
with patch("aiohttp.ClientSession.get", return_value=mock_response):
results = await instance.raw_results_async("test query")
assert results == MOCK_RESPONSE_RAW
@pytest.mark.asyncio
async def test_results_async() -> None:
instance = YouSearchAPIWrapper(ydc_api_key="test_api_key")
# Mock response object to simulate aiohttp response
mock_response = AsyncMock()
mock_response.__aenter__.return_value = (
mock_response # Make the context manager return itself
)
mock_response.__aexit__.return_value = None # No value needed for exit
mock_response.status = 200
mock_response.json = AsyncMock(return_value=MOCK_RESPONSE_RAW)
# Patch the aiohttp.ClientSession object
with patch("aiohttp.ClientSession.get", return_value=mock_response):
results = await instance.results_async("test query")
assert results == MOCK_PARSED_OUTPUT
@pytest.mark.asyncio
async def test_results_news_async() -> None:
instance = YouSearchAPIWrapper(endpoint_type="news", ydc_api_key="test_api_key")
# Mock response object to simulate aiohttp response
mock_response = AsyncMock()
mock_response.__aenter__.return_value = (
mock_response # Make the context manager return itself
)
mock_response.__aexit__.return_value = None # No value needed for exit
mock_response.status = 200
mock_response.json = AsyncMock(return_value=NEWS_RESPONSE_RAW)
# Patch the aiohttp.ClientSession object
with patch("aiohttp.ClientSession.get", return_value=mock_response):
results = await instance.results_async("test query")
assert results == NEWS_RESPONSE_PARSED

View File

@@ -8,7 +8,7 @@ from langchain_core.tools import BaseTool
from langchain.agents.format_scratchpad import format_log_to_messages
from langchain.agents.json_chat.prompt import TEMPLATE_TOOL_RESPONSE
from langchain.agents.output_parsers import JSONAgentOutputParser
from langchain.tools.render import render_text_description
from langchain.tools.render import ToolsRenderer, render_text_description
def create_json_chat_agent(
@@ -16,6 +16,7 @@ def create_json_chat_agent(
tools: Sequence[BaseTool],
prompt: ChatPromptTemplate,
stop_sequence: bool = True,
tools_renderer: ToolsRenderer = render_text_description,
) -> Runnable:
"""Create an agent that uses JSON to format its logic, build for Chat Models.
@@ -26,6 +27,9 @@ def create_json_chat_agent(
stop_sequence: Adds a stop token of "Observation:" to avoid hallucination.
Default is True. You may want to set this to False if the LLM you are using
does not support stop sequences.
tools_renderer: This controls how the tools are converted into a string and
then passed into the LLM. Default is `render_text_description`.
Returns:
A Runnable sequence representing an agent. It takes as input all the same input
variables as the prompt passed in does. It returns as output either an
@@ -150,7 +154,7 @@ def create_json_chat_agent(
raise ValueError(f"Prompt missing required variables: {missing_vars}")
prompt = prompt.partial(
tools=render_text_description(list(tools)),
tools=tools_renderer(list(tools)),
tool_names=", ".join([t.name for t in tools]),
)
if stop_sequence:

View File

@@ -17,6 +17,7 @@ from langchain.agents.mrkl.prompt import FORMAT_INSTRUCTIONS, PREFIX, SUFFIX
from langchain.agents.tools import Tool
from langchain.agents.utils import validate_tools_single_input
from langchain.chains import LLMChain
from langchain.tools.render import render_text_description
class ChainConfig(NamedTuple):
@@ -79,7 +80,7 @@ class ZeroShotAgent(Agent):
Returns:
A PromptTemplate with the template assembled from the pieces here.
"""
tool_strings = "\n".join([f"{tool.name}: {tool.description}" for tool in tools])
tool_strings = render_text_description(list(tools))
tool_names = ", ".join([tool.name for tool in tools])
format_instructions = format_instructions.format(tool_names=tool_names)
template = "\n\n".join([prefix, tool_strings, format_instructions, suffix])

View File

@@ -10,7 +10,7 @@ from langchain_core.tools import BaseTool
from langchain.agents import AgentOutputParser
from langchain.agents.format_scratchpad import format_log_to_str
from langchain.agents.output_parsers import ReActSingleInputOutputParser
from langchain.tools.render import render_text_description
from langchain.tools.render import ToolsRenderer, render_text_description
def create_react_agent(
@@ -18,6 +18,7 @@ def create_react_agent(
tools: Sequence[BaseTool],
prompt: BasePromptTemplate,
output_parser: Optional[AgentOutputParser] = None,
tools_renderer: ToolsRenderer = render_text_description,
) -> Runnable:
"""Create an agent that uses ReAct prompting.
@@ -26,6 +27,8 @@ def create_react_agent(
tools: Tools this agent has access to.
prompt: The prompt to use. See Prompt section below for more.
output_parser: AgentOutputParser for parsing the LLM output.
tools_renderer: This controls how the tools are converted into a string and
then passed into the LLM. Default is `render_text_description`.
Returns:
A Runnable sequence representing an agent. It takes as input all the same input
@@ -102,7 +105,7 @@ def create_react_agent(
raise ValueError(f"Prompt missing required variables: {missing_vars}")
prompt = prompt.partial(
tools=render_text_description(list(tools)),
tools=tools_renderer(list(tools)),
tool_names=", ".join([t.name for t in tools]),
)
llm_with_stop = llm.bind(stop=["\nObservation"])
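
The hunks above only change the `create_react_agent` signature and the prompt partial; as a sketch of the intended usage (assuming an existing `llm`, `tools` list, and ReAct `prompt`), any callable matching `ToolsRenderer` can now be plugged in:

```python
# Hypothetical custom renderer; `llm`, `tools`, and `prompt` are assumed to exist.
from typing import List

from langchain_core.tools import BaseTool
from langchain.agents import create_react_agent
from langchain.tools.render import ToolsRenderer


def render_with_args(tools: List[BaseTool]) -> str:
    # Render each tool as "NAME (args): description" instead of the default
    # "name: description" produced by render_text_description.
    return "\n".join(
        f"{tool.name.upper()} ({list(tool.args)}): {tool.description}" for tool in tools
    )


renderer: ToolsRenderer = render_with_args
agent = create_react_agent(llm=llm, tools=tools, prompt=prompt, tools_renderer=renderer)
```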

View File

@@ -23,7 +23,7 @@ from langchain.agents.structured_chat.output_parser import (
)
from langchain.agents.structured_chat.prompt import FORMAT_INSTRUCTIONS, PREFIX, SUFFIX
from langchain.chains.llm import LLMChain
from langchain.tools.render import render_text_description_and_args
from langchain.tools.render import ToolsRenderer, render_text_description_and_args
HUMAN_MESSAGE_TEMPLATE = "{input}\n\n{agent_scratchpad}"
@@ -151,7 +151,10 @@ class StructuredChatAgent(Agent):
def create_structured_chat_agent(
llm: BaseLanguageModel, tools: Sequence[BaseTool], prompt: ChatPromptTemplate
llm: BaseLanguageModel,
tools: Sequence[BaseTool],
prompt: ChatPromptTemplate,
tools_renderer: ToolsRenderer = render_text_description_and_args,
) -> Runnable:
"""Create an agent aimed at supporting tools with multiple inputs.
@@ -159,6 +162,8 @@ def create_structured_chat_agent(
llm: LLM to use as the agent.
tools: Tools this agent has access to.
prompt: The prompt to use. See Prompt section below for more.
tools_renderer: This controls how the tools are converted into a string and
then passed into the LLM. Default is `render_text_description_and_args`.
Returns:
A Runnable sequence representing an agent. It takes as input all the same input
@@ -265,7 +270,7 @@ def create_structured_chat_agent(
raise ValueError(f"Prompt missing required variables: {missing_vars}")
prompt = prompt.partial(
tools=render_text_description_and_args(list(tools)),
tools=tools_renderer(list(tools)),
tool_names=", ".join([t.name for t in tools]),
)
llm_with_stop = llm.bind(stop=["Observation"])

View File

@@ -14,7 +14,7 @@ from langchain.agents.format_scratchpad import format_xml
from langchain.agents.output_parsers import XMLAgentOutputParser
from langchain.agents.xml.prompt import agent_instructions
from langchain.chains.llm import LLMChain
from langchain.tools.render import render_text_description
from langchain.tools.render import ToolsRenderer, render_text_description
@deprecated("0.1.0", alternative="create_xml_agent", removal="0.2.0")
@@ -108,7 +108,10 @@ class XMLAgent(BaseSingleActionAgent):
def create_xml_agent(
llm: BaseLanguageModel, tools: Sequence[BaseTool], prompt: BasePromptTemplate
llm: BaseLanguageModel,
tools: Sequence[BaseTool],
prompt: BasePromptTemplate,
tools_renderer: ToolsRenderer = render_text_description,
) -> Runnable:
"""Create an agent that uses XML to format its logic.
@@ -118,6 +121,8 @@ def create_xml_agent(
prompt: The prompt to use, must have input keys
`tools`: contains descriptions for each tool.
`agent_scratchpad`: contains previous agent actions and tool outputs.
tools_renderer: This controls how the tools are converted into a string and
then passed into the LLM. Default is `render_text_description`.
Returns:
A Runnable sequence representing an agent. It takes as input all the same input
@@ -194,7 +199,7 @@ def create_xml_agent(
raise ValueError(f"Prompt missing required variables: {missing_vars}")
prompt = prompt.partial(
tools=render_text_description(list(tools)),
tools=tools_renderer(list(tools)),
)
llm_with_stop = llm.bind(stop=["</tool_input>"])

View File

@@ -1,12 +1,14 @@
"""Configuration for run evaluators."""
from typing import Any, Dict, List, Optional, Union
from typing import Any, Callable, Dict, List, Optional, Sequence, Union
from langchain_core.embeddings import Embeddings
from langchain_core.language_models import BaseLanguageModel
from langchain_core.prompts import BasePromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field
from langsmith import RunEvaluator
from langsmith.evaluation.evaluator import EvaluationResult, EvaluationResults
from langsmith.schemas import Example, Run
from langchain.evaluation.criteria.eval_chain import CRITERIA_TYPE
from langchain.evaluation.embedding_distance.base import (
@@ -17,6 +19,14 @@ from langchain.evaluation.string_distance.base import (
StringDistance as StringDistanceEnum,
)
RUN_EVALUATOR_LIKE = Callable[
[Run, Optional[Example]], Union[EvaluationResult, EvaluationResults, dict]
]
BATCH_EVALUATOR_LIKE = Callable[
[Sequence[Run], Optional[Sequence[Example]]],
Union[EvaluationResult, EvaluationResults, dict],
]
class EvalConfig(BaseModel):
"""Configuration for a given run evaluator.
@@ -76,12 +86,16 @@ class SingleKeyEvalConfig(EvalConfig):
return kwargs
CUSTOM_EVALUATOR_TYPE = Union[RUN_EVALUATOR_LIKE, RunEvaluator, StringEvaluator]
SINGLE_EVAL_CONFIG_TYPE = Union[EvaluatorType, str, EvalConfig]
class RunEvalConfig(BaseModel):
"""Configuration for a run evaluation.
Parameters
----------
evaluators : List[Union[EvaluatorType, EvalConfig]]
evaluators : List[Union[EvaluatorType, EvalConfig, RunEvaluator, Callable]]
Configurations for which evaluators to apply to the dataset run.
Each can be the string of an :class:`EvaluatorType <langchain.evaluation.schema.EvaluatorType>`, such
as EvaluatorType.QA, the evaluator type string ("qa"), or a configuration for a
@@ -107,9 +121,12 @@ class RunEvalConfig(BaseModel):
The language model to pass to any evaluators that use a language model.
""" # noqa: E501
evaluators: List[Union[EvaluatorType, str, EvalConfig]] = Field(
default_factory=list
)
evaluators: List[
Union[
SINGLE_EVAL_CONFIG_TYPE,
CUSTOM_EVALUATOR_TYPE,
]
] = Field(default_factory=list)
"""Configurations for which evaluators to apply to the dataset run.
Each can be the string of an
:class:`EvaluatorType <langchain.evaluation.schema.EvaluatorType>`, such
@@ -117,8 +134,15 @@ class RunEvalConfig(BaseModel):
given evaluator
(e.g.,
:class:`RunEvalConfig.QA <langchain.smith.evaluation.config.RunEvalConfig.QA>`).""" # noqa: E501
custom_evaluators: Optional[List[Union[RunEvaluator, StringEvaluator]]] = None
custom_evaluators: Optional[List[CUSTOM_EVALUATOR_TYPE]] = None
"""Custom evaluators to apply to the dataset run."""
batch_evaluators: Optional[List[BATCH_EVALUATOR_LIKE]] = None
"""Evaluators that run on an aggregate/batch level.
These generate 1 or more metrics that are assigned to the full test run.
As a result, they are not associated with individual traces.
"""
reference_key: Optional[str] = None
"""The key in the dataset run to use as the reference string.
If not provided, we will attempt to infer automatically."""
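
Concretely, the widened `evaluators` field and the new `batch_evaluators` field mean plain callables can be passed without wrapping them in `RunEvaluator` classes. A sketch against the new `RUN_EVALUATOR_LIKE` / `BATCH_EVALUATOR_LIKE` signatures (the evaluator names and feedback keys below are made up for illustration):

```python
# Hypothetical evaluators; feedback dicts follow the EvaluationResult shape.
from typing import Optional, Sequence

from langsmith.schemas import Example, Run

from langchain.smith import RunEvalConfig


def exact_match(run: Run, example: Optional[Example]) -> dict:
    # Row-level evaluator: compare the run output against the reference output.
    prediction = (run.outputs or {}).get("output")
    reference = (example.outputs or {}).get("output") if example else None
    return {"key": "exact_match", "score": int(prediction == reference)}


def pass_rate(runs: Sequence[Run], examples: Optional[Sequence[Example]]) -> dict:
    # Batch-level evaluator: a single aggregate metric for the whole test run.
    successes = sum(1 for run in runs if not run.error)
    return {"key": "pass_rate", "score": successes / max(len(runs), 1)}


eval_config = RunEvalConfig(
    evaluators=["qa", exact_match],  # strings, EvalConfigs, or callables
    batch_evaluators=[pass_rate],
)
```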

View File

@@ -2,6 +2,7 @@
from __future__ import annotations
import concurrent.futures
import dataclasses
import functools
import inspect
@@ -35,9 +36,15 @@ from langchain_core.tracers.evaluation import (
from langchain_core.tracers.langchain import LangChainTracer
from langsmith.client import Client
from langsmith.env import get_git_info, get_langchain_env_var_metadata
from langsmith.evaluation import EvaluationResult, RunEvaluator
from langsmith.evaluation import (
EvaluationResult,
RunEvaluator,
)
from langsmith.evaluation import (
run_evaluator as run_evaluator_dec,
)
from langsmith.run_helpers import as_runnable, is_traceable_function
from langsmith.schemas import Dataset, DataType, Example, TracerSession
from langsmith.schemas import Dataset, DataType, Example, Run, TracerSession
from langsmith.utils import LangSmithError
from requests import HTTPError
from typing_extensions import TypedDict
@@ -513,7 +520,10 @@ def _determine_reference_key(
def _construct_run_evaluator(
eval_config: Union[EvaluatorType, str, smith_eval_config.EvalConfig],
eval_config: Union[
smith_eval_config.SINGLE_EVAL_CONFIG_TYPE,
smith_eval_config.CUSTOM_EVALUATOR_TYPE,
],
eval_llm: Optional[BaseLanguageModel],
run_type: str,
data_type: DataType,
@@ -522,12 +532,14 @@ def _construct_run_evaluator(
input_key: Optional[str],
prediction_key: Optional[str],
) -> RunEvaluator:
if isinstance(eval_config, RunEvaluator):
return eval_config
if isinstance(eval_config, (EvaluatorType, str)):
if not isinstance(eval_config, EvaluatorType):
eval_config = EvaluatorType(eval_config)
evaluator_ = load_evaluator(eval_config, llm=eval_llm)
eval_type_tag = eval_config.value
else:
elif isinstance(eval_config, smith_eval_config.EvalConfig):
kwargs = {"llm": eval_llm, **eval_config.get_kwargs()}
evaluator_ = load_evaluator(eval_config.evaluator_type, **kwargs)
eval_type_tag = eval_config.evaluator_type.value
@@ -536,6 +548,11 @@ def _construct_run_evaluator(
input_key = eval_config.input_key or input_key
prediction_key = eval_config.prediction_key or prediction_key
reference_key = eval_config.reference_key or reference_key
elif callable(eval_config):
# Assume we can decorate
return run_evaluator_dec(eval_config)
else:
raise ValueError(f"Unknown evaluator type: {type(eval_config)}")
if isinstance(evaluator_, StringEvaluator):
if evaluator_.requires_reference and reference_key is None:
@@ -600,13 +617,9 @@ def _load_run_evaluators(
"""
run_evaluators = []
input_key, prediction_key, reference_key = None, None, None
if (
config.evaluators
or any([isinstance(e, EvaluatorType) for e in config.evaluators])
or (
config.custom_evaluators
and any([isinstance(e, StringEvaluator) for e in config.custom_evaluators])
)
if config.evaluators or (
config.custom_evaluators
and any([isinstance(e, StringEvaluator) for e in config.custom_evaluators])
):
input_key, prediction_key, reference_key = _get_keys(
config, run_inputs, run_outputs, example_outputs
@@ -638,6 +651,8 @@ def _load_run_evaluators(
reference_key=reference_key,
)
)
elif callable(custom_evaluator):
run_evaluators.append(run_evaluator_dec(custom_evaluator))
else:
raise ValueError(
f"Unsupported custom evaluator: {custom_evaluator}."
@@ -953,9 +968,6 @@ def _run_llm_or_chain(
return result
## Public API
def _prepare_eval_run(
client: Client,
dataset_name: str,
@@ -963,10 +975,17 @@ def _prepare_eval_run(
project_name: str,
project_metadata: Optional[Dict[str, Any]] = None,
tags: Optional[List[str]] = None,
dataset_version: Optional[Union[str, datetime]] = None,
) -> Tuple[MCF, TracerSession, Dataset, List[Example]]:
wrapped_model = _wrap_in_chain_factory(llm_or_chain_factory, dataset_name)
dataset = client.read_dataset(dataset_name=dataset_name)
examples = list(client.list_examples(dataset_id=dataset.id))
as_of = dataset_version if isinstance(dataset_version, datetime) else None
if isinstance(dataset_version, str):
raise NotImplementedError(
"Selecting dataset_version by tag is not yet supported."
" Please use a datetime object."
)
examples = list(client.list_examples(dataset_id=dataset.id, as_of=as_of))
if not examples:
raise ValueError(f"Dataset {dataset_name} has no example rows.")
modified_at = [ex.modified_at for ex in examples if ex.modified_at]
@@ -1032,6 +1051,7 @@ class _DatasetRunContainer:
wrapped_model: MCF
examples: List[Example]
configs: List[RunnableConfig]
batch_evaluators: Optional[List[smith_eval_config.BATCH_EVALUATOR_LIKE]] = None
def _merge_test_outputs(
self,
@@ -1055,8 +1075,34 @@ class _DatasetRunContainer:
results[str(example.id)]["reference"] = example.outputs
return results
def _collect_metrics(self) -> Dict[str, _RowResult]:
def _run_batch_evaluators(self, runs: Dict[str, Run]) -> List[dict]:
evaluators = self.batch_evaluators
if not evaluators:
return []
runs_list = [runs[str(example.id)] for example in self.examples]
aggregate_feedback = []
with concurrent.futures.ThreadPoolExecutor() as executor:
for evaluator in evaluators:
try:
result = evaluator(runs_list, self.examples)
if isinstance(result, EvaluationResult):
result = result.dict()
aggregate_feedback.append(cast(dict, result))
executor.submit(
self.client.create_feedback,
**result,
run_id=None,
project_id=self.project.id,
)
except Exception as e:
logger.error(
f"Error running batch evaluator {repr(evaluator)}: {e}"
)
return aggregate_feedback
def _collect_metrics(self) -> Tuple[Dict[str, _RowResult], Dict[str, Run]]:
all_eval_results: dict = {}
all_runs: dict = {}
for c in self.configs:
for callback in cast(list, c["callbacks"]):
if isinstance(callback, EvaluatorCallbackHandler):
@@ -1077,20 +1123,28 @@ class _DatasetRunContainer:
{
"execution_time": execution_time,
"run_id": run_id,
"run": run,
}
)
return cast(Dict[str, _RowResult], all_eval_results)
all_runs[str(callback.example_id)] = run
return cast(Dict[str, _RowResult], all_eval_results), all_runs
def _collect_test_results(
self,
batch_results: List[Union[dict, str, LLMResult, ChatResult]],
) -> TestResult:
logger.info("Waiting for evaluators to complete.")
wait_for_all_evaluators()
all_eval_results = self._collect_metrics()
all_eval_results, all_runs = self._collect_metrics()
aggregate_feedback = None
if self.batch_evaluators:
logger.info("Running session evaluators.")
aggregate_feedback = self._run_batch_evaluators(all_runs)
results = self._merge_test_outputs(batch_results, all_eval_results)
return TestResult(
project_name=self.project.name,
results=results,
aggregate_metrics=aggregate_feedback,
)
def finish(self, batch_results: list, verbose: bool = False) -> TestResult:
@@ -1123,6 +1177,7 @@ class _DatasetRunContainer:
concurrency_level: int = 5,
project_metadata: Optional[Dict[str, Any]] = None,
revision_id: Optional[str] = None,
dataset_version: Optional[Union[datetime, str]] = None,
) -> _DatasetRunContainer:
project_name = project_name or name_generation.random_name()
if revision_id:
@@ -1136,10 +1191,14 @@ class _DatasetRunContainer:
project_name,
project_metadata=project_metadata,
tags=tags,
dataset_version=dataset_version,
)
tags = tags or []
for k, v in (project.metadata.get("git") or {}).items():
tags.append(f"git:{k}={v}")
run_metadata = {"dataset_version": project.metadata["dataset_version"]}
if revision_id:
run_metadata["revision_id"] = revision_id
wrapped_model = _wrap_in_chain_factory(llm_or_chain_factory)
run_evaluators = _setup_evaluation(
wrapped_model, examples, evaluation, dataset.data_type or DataType.kv
@@ -1164,7 +1223,7 @@ class _DatasetRunContainer:
],
tags=tags,
max_concurrency=concurrency_level,
metadata={"revision_id": revision_id} if revision_id else {},
metadata=run_metadata,
)
for example in examples
]
@@ -1174,6 +1233,7 @@ class _DatasetRunContainer:
wrapped_model=wrapped_model,
examples=examples,
configs=configs,
batch_evaluators=evaluation.batch_evaluators if evaluation else None,
)
@@ -1215,6 +1275,8 @@ _INPUT_MAPPER_DEP_WARNING = (
"langchain.schema.runnable.base.RunnableLambda.html)"
)
## Public API
async def arun_on_dataset(
client: Optional[Client],
@@ -1222,11 +1284,11 @@ async def arun_on_dataset(
llm_or_chain_factory: MODEL_OR_CHAIN_FACTORY,
*,
evaluation: Optional[smith_eval.RunEvalConfig] = None,
dataset_version: Optional[Union[datetime, str]] = None,
concurrency_level: int = 5,
project_name: Optional[str] = None,
project_metadata: Optional[Dict[str, Any]] = None,
verbose: bool = False,
tags: Optional[List[str]] = None,
revision_id: Optional[str] = None,
**kwargs: Any,
) -> Dict[str, Any]:
@@ -1235,6 +1297,13 @@ async def arun_on_dataset(
warn_deprecated("0.0.305", message=_INPUT_MAPPER_DEP_WARNING, pending=True)
if revision_id is None:
revision_id = get_langchain_env_var_metadata().get("revision_id")
tags = kwargs.pop("tags", None)
if tags:
warn_deprecated(
"0.1.9",
message="The tags argument is deprecated and will be"
" removed in a future release. Please specify project_metadata instead.",
)
if kwargs:
warn_deprecated(
@@ -1256,6 +1325,7 @@ async def arun_on_dataset(
concurrency_level,
project_metadata=project_metadata,
revision_id=revision_id,
dataset_version=dataset_version,
)
batch_results = await runnable_utils.gather_with_concurrency(
container.configs[0].get("max_concurrency"),
@@ -1278,17 +1348,24 @@ def run_on_dataset(
llm_or_chain_factory: MODEL_OR_CHAIN_FACTORY,
*,
evaluation: Optional[smith_eval.RunEvalConfig] = None,
dataset_version: Optional[Union[datetime, str]] = None,
concurrency_level: int = 5,
project_name: Optional[str] = None,
project_metadata: Optional[Dict[str, Any]] = None,
verbose: bool = False,
tags: Optional[List[str]] = None,
revision_id: Optional[str] = None,
**kwargs: Any,
) -> Dict[str, Any]:
input_mapper = kwargs.pop("input_mapper", None)
if input_mapper:
warn_deprecated("0.0.305", message=_INPUT_MAPPER_DEP_WARNING, pending=True)
tags = kwargs.pop("tags", None)
if tags:
warn_deprecated(
"0.1.9",
message="The tags argument is deprecated and will be"
" removed in a future release. Please specify project_metadata instead.",
)
if revision_id is None:
revision_id = get_langchain_env_var_metadata().get("revision_id")
@@ -1312,6 +1389,7 @@ def run_on_dataset(
concurrency_level,
project_metadata=project_metadata,
revision_id=revision_id,
dataset_version=dataset_version,
)
if concurrency_level == 0:
batch_results = [
@@ -1404,8 +1482,8 @@ Examples
client = Client()
run_on_dataset(
client,
"<my_dataset_name>",
construct_chain,
dataset_name="<my_dataset_name>",
llm_or_chain_factory=construct_chain,
evaluation=evaluation_config,
)
@@ -1442,8 +1520,8 @@ or LangSmith's `RunEvaluator` classes.
run_on_dataset(
client,
"<my_dataset_name>",
construct_chain,
dataset_name="<my_dataset_name>",
llm_or_chain_factory=construct_chain,
evaluation=evaluation_config,
)
""" # noqa: E501

View File

@@ -4,7 +4,7 @@ Depending on the LLM you are using and the prompting strategy you are using,
you may want Tools to be rendered in a different way.
This module contains various ways to render tools.
"""
from typing import List
from typing import Callable, List
# For backwards compatibility
from langchain_core.tools import BaseTool
@@ -14,6 +14,7 @@ from langchain_core.utils.function_calling import (
)
__all__ = [
"ToolsRenderer",
"render_text_description",
"render_text_description_and_args",
"format_tool_to_openai_tool",
@@ -21,6 +22,9 @@ __all__ = [
]
ToolsRenderer = Callable[[List[BaseTool]], str]
def render_text_description(tools: List[BaseTool]) -> str:
"""Render the tool name and description in plain text.

View File

@@ -1,4 +1,4 @@
# This file is automatically @generated by Poetry 1.7.1 and should not be changed by hand.
# This file is automatically @generated by Poetry 1.6.1 and should not be changed by hand.
[[package]]
name = "aiodns"
@@ -544,17 +544,6 @@ azure-core = ">=1.24.0,<2.0.0"
isodate = ">=0.6.1,<1.0.0"
typing-extensions = ">=4.0.1"
[[package]]
name = "azure-ai-vision"
version = "0.11.1b1"
description = "Microsoft Azure AI Vision SDK for Python"
optional = true
python-versions = ">=3.7"
files = [
{file = "azure_ai_vision-0.11.1b1-py3-none-manylinux1_x86_64.whl", hash = "sha256:6f8563ae26689da6cdee9b2de009a53546ae2fd86c6c180236ce5da5b45f41d3"},
{file = "azure_ai_vision-0.11.1b1-py3-none-win_amd64.whl", hash = "sha256:f5df03b9156feaa1d8c776631967b1455028d30dfd4cd1c732aa0f9c03d01517"},
]
[[package]]
name = "azure-cognitiveservices-speech"
version = "1.32.1"
@@ -1768,13 +1757,13 @@ test = ["pytest (>=6)"]
[[package]]
name = "executing"
version = "2.0.0"
version = "2.0.1"
description = "Get the currently executing AST node of a frame, and other information"
optional = false
python-versions = "*"
python-versions = ">=3.5"
files = [
{file = "executing-2.0.0-py2.py3-none-any.whl", hash = "sha256:06df6183df67389625f4e763921c6cf978944721abf3e714000200aab95b0657"},
{file = "executing-2.0.0.tar.gz", hash = "sha256:0ff053696fdeef426cda5bd18eacd94f82c91f49823a2e9090124212ceea9b08"},
{file = "executing-2.0.1-py2.py3-none-any.whl", hash = "sha256:eac49ca94516ccc753f9fb5ce82603156e590b27525a8bc32cce8ae302eb61bc"},
{file = "executing-2.0.1.tar.gz", hash = "sha256:35afe2ce3affba8ee97f2d69927fa823b08b472b7b994e36a52a964b93d16147"},
]
[package.extras]
@@ -3049,6 +3038,7 @@ files = [
{file = "jq-1.6.0-cp37-cp37m-musllinux_1_1_i686.whl", hash = "sha256:227b178b22a7f91ae88525810441791b1ca1fc71c86f03190911793be15cec3d"},
{file = "jq-1.6.0-cp37-cp37m-musllinux_1_1_x86_64.whl", hash = "sha256:780eb6383fbae12afa819ef676fc93e1548ae4b076c004a393af26a04b460742"},
{file = "jq-1.6.0-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:08ded6467f4ef89fec35b2bf310f210f8cd13fbd9d80e521500889edf8d22441"},
{file = "jq-1.6.0-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:49e44ed677713f4115bd5bf2dbae23baa4cd503be350e12a1c1f506b0687848f"},
{file = "jq-1.6.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:984f33862af285ad3e41e23179ac4795f1701822473e1a26bf87ff023e5a89ea"},
{file = "jq-1.6.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f42264fafc6166efb5611b5d4cb01058887d050a6c19334f6a3f8a13bb369df5"},
{file = "jq-1.6.0-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a67154f150aaf76cc1294032ed588436eb002097dd4fd1e283824bf753a05080"},
@@ -3132,7 +3122,6 @@ optional = false
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*, !=3.6.*"
files = [
{file = "jsonpointer-2.4-py2.py3-none-any.whl", hash = "sha256:15d51bba20eea3165644553647711d150376234112651b4f1811022aecad7d7a"},
{file = "jsonpointer-2.4.tar.gz", hash = "sha256:585cee82b70211fa9e6043b7bb89db6e1aa49524340dde8ad6b63206ea689d88"},
]
[[package]]
@@ -3446,7 +3435,7 @@ files = [
[[package]]
name = "langchain-community"
version = "0.0.24"
version = "0.0.25"
description = "Community contributed LangChain integrations."
optional = false
python-versions = ">=3.8.1,<4.0"
@@ -3456,7 +3445,7 @@ develop = true
[package.dependencies]
aiohttp = "^3.8.3"
dataclasses-json = ">= 0.5.7, < 0.7"
langchain-core = ">=0.1.26,<0.2"
langchain-core = "^0.1.28"
langsmith = "^0.1.0"
numpy = "^1"
PyYAML = ">=5.3"
@@ -3536,16 +3525,17 @@ url = "../text-splitters"
[[package]]
name = "langsmith"
version = "0.1.1"
version = "0.1.14"
description = "Client library to connect to the LangSmith LLM Tracing and Evaluation Platform."
optional = false
python-versions = ">=3.8.1,<4.0"
files = [
{file = "langsmith-0.1.1-py3-none-any.whl", hash = "sha256:10ff2b977a41e3f6351d1a4239d9bd57af0547aa909e839d2791e16cc197a6f9"},
{file = "langsmith-0.1.1.tar.gz", hash = "sha256:09df0c2ca9085105f97a4e4f281b083e312c99d162f3fe2b2d5eefd5c3692e60"},
{file = "langsmith-0.1.14-py3-none-any.whl", hash = "sha256:ecb243057d2a43c2da0524fe395585bc3421bb5d24f1cdd53eb06fbe63e43a69"},
{file = "langsmith-0.1.14.tar.gz", hash = "sha256:b95f267d25681f4c9862bb68236fba8a57a60ec7921ecfdaa125936807e51bde"},
]
[package.dependencies]
orjson = ">=3.9.14,<4.0.0"
pydantic = ">=1,<3"
requests = ">=2,<3"
@@ -3762,16 +3752,6 @@ files = [
{file = "MarkupSafe-2.1.3-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:5bbe06f8eeafd38e5d0a4894ffec89378b6c6a625ff57e3028921f8ff59318ac"},
{file = "MarkupSafe-2.1.3-cp311-cp311-win32.whl", hash = "sha256:dd15ff04ffd7e05ffcb7fe79f1b98041b8ea30ae9234aed2a9168b5797c3effb"},
{file = "MarkupSafe-2.1.3-cp311-cp311-win_amd64.whl", hash = "sha256:134da1eca9ec0ae528110ccc9e48041e0828d79f24121a1a146161103c76e686"},
{file = "MarkupSafe-2.1.3-cp312-cp312-macosx_10_9_universal2.whl", hash = "sha256:f698de3fd0c4e6972b92290a45bd9b1536bffe8c6759c62471efaa8acb4c37bc"},
{file = "MarkupSafe-2.1.3-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:aa57bd9cf8ae831a362185ee444e15a93ecb2e344c8e52e4d721ea3ab6ef1823"},
{file = "MarkupSafe-2.1.3-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ffcc3f7c66b5f5b7931a5aa68fc9cecc51e685ef90282f4a82f0f5e9b704ad11"},
{file = "MarkupSafe-2.1.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:47d4f1c5f80fc62fdd7777d0d40a2e9dda0a05883ab11374334f6c4de38adffd"},
{file = "MarkupSafe-2.1.3-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:1f67c7038d560d92149c060157d623c542173016c4babc0c1913cca0564b9939"},
{file = "MarkupSafe-2.1.3-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:9aad3c1755095ce347e26488214ef77e0485a3c34a50c5a5e2471dff60b9dd9c"},
{file = "MarkupSafe-2.1.3-cp312-cp312-musllinux_1_1_i686.whl", hash = "sha256:14ff806850827afd6b07a5f32bd917fb7f45b046ba40c57abdb636674a8b559c"},
{file = "MarkupSafe-2.1.3-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:8f9293864fe09b8149f0cc42ce56e3f0e54de883a9de90cd427f191c346eb2e1"},
{file = "MarkupSafe-2.1.3-cp312-cp312-win32.whl", hash = "sha256:715d3562f79d540f251b99ebd6d8baa547118974341db04f5ad06d5ea3eb8007"},
{file = "MarkupSafe-2.1.3-cp312-cp312-win_amd64.whl", hash = "sha256:1b8dd8c3fd14349433c79fa8abeb573a55fc0fdd769133baac1f5e07abf54aeb"},
{file = "MarkupSafe-2.1.3-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:8e254ae696c88d98da6555f5ace2279cf7cd5b3f52be2b5cf97feafe883b58d2"},
{file = "MarkupSafe-2.1.3-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:cb0932dc158471523c9637e807d9bfb93e06a95cbf010f1a38b98623b929ef2b"},
{file = "MarkupSafe-2.1.3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9402b03f1a1b4dc4c19845e5c749e3ab82d5078d16a2a4c2cd2df62d57bb0707"},
@@ -4734,61 +4714,61 @@ requests = ">=2,<3"
[[package]]
name = "orjson"
version = "3.9.10"
version = "3.9.15"
description = "Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy"
optional = true
optional = false
python-versions = ">=3.8"
files = [
{file = "orjson-3.9.10-cp310-cp310-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:c18a4da2f50050a03d1da5317388ef84a16013302a5281d6f64e4a3f406aabc4"},
{file = "orjson-3.9.10-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5148bab4d71f58948c7c39d12b14a9005b6ab35a0bdf317a8ade9a9e4d9d0bd5"},
{file = "orjson-3.9.10-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:4cf7837c3b11a2dfb589f8530b3cff2bd0307ace4c301e8997e95c7468c1378e"},
{file = "orjson-3.9.10-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:c62b6fa2961a1dcc51ebe88771be5319a93fd89bd247c9ddf732bc250507bc2b"},
{file = "orjson-3.9.10-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:deeb3922a7a804755bbe6b5be9b312e746137a03600f488290318936c1a2d4dc"},
{file = "orjson-3.9.10-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:1234dc92d011d3554d929b6cf058ac4a24d188d97be5e04355f1b9223e98bbe9"},
{file = "orjson-3.9.10-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:06ad5543217e0e46fd7ab7ea45d506c76f878b87b1b4e369006bdb01acc05a83"},
{file = "orjson-3.9.10-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:4fd72fab7bddce46c6826994ce1e7de145ae1e9e106ebb8eb9ce1393ca01444d"},
{file = "orjson-3.9.10-cp310-none-win32.whl", hash = "sha256:b5b7d4a44cc0e6ff98da5d56cde794385bdd212a86563ac321ca64d7f80c80d1"},
{file = "orjson-3.9.10-cp310-none-win_amd64.whl", hash = "sha256:61804231099214e2f84998316f3238c4c2c4aaec302df12b21a64d72e2a135c7"},
{file = "orjson-3.9.10-cp311-cp311-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:cff7570d492bcf4b64cc862a6e2fb77edd5e5748ad715f487628f102815165e9"},
{file = "orjson-3.9.10-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ed8bc367f725dfc5cabeed1ae079d00369900231fbb5a5280cf0736c30e2adf7"},
{file = "orjson-3.9.10-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:c812312847867b6335cfb264772f2a7e85b3b502d3a6b0586aa35e1858528ab1"},
{file = "orjson-3.9.10-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:9edd2856611e5050004f4722922b7b1cd6268da34102667bd49d2a2b18bafb81"},
{file = "orjson-3.9.10-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:674eb520f02422546c40401f4efaf8207b5e29e420c17051cddf6c02783ff5ca"},
{file = "orjson-3.9.10-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:1d0dc4310da8b5f6415949bd5ef937e60aeb0eb6b16f95041b5e43e6200821fb"},
{file = "orjson-3.9.10-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:e99c625b8c95d7741fe057585176b1b8783d46ed4b8932cf98ee145c4facf499"},
{file = "orjson-3.9.10-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:ec6f18f96b47299c11203edfbdc34e1b69085070d9a3d1f302810cc23ad36bf3"},
{file = "orjson-3.9.10-cp311-none-win32.whl", hash = "sha256:ce0a29c28dfb8eccd0f16219360530bc3cfdf6bf70ca384dacd36e6c650ef8e8"},
{file = "orjson-3.9.10-cp311-none-win_amd64.whl", hash = "sha256:cf80b550092cc480a0cbd0750e8189247ff45457e5a023305f7ef1bcec811616"},
{file = "orjson-3.9.10-cp312-cp312-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:602a8001bdf60e1a7d544be29c82560a7b49319a0b31d62586548835bbe2c862"},
{file = "orjson-3.9.10-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f295efcd47b6124b01255d1491f9e46f17ef40d3d7eabf7364099e463fb45f0f"},
{file = "orjson-3.9.10-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:92af0d00091e744587221e79f68d617b432425a7e59328ca4c496f774a356071"},
{file = "orjson-3.9.10-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:c5a02360e73e7208a872bf65a7554c9f15df5fe063dc047f79738998b0506a14"},
{file = "orjson-3.9.10-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:858379cbb08d84fe7583231077d9a36a1a20eb72f8c9076a45df8b083724ad1d"},
{file = "orjson-3.9.10-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:666c6fdcaac1f13eb982b649e1c311c08d7097cbda24f32612dae43648d8db8d"},
{file = "orjson-3.9.10-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:3fb205ab52a2e30354640780ce4587157a9563a68c9beaf52153e1cea9aa0921"},
{file = "orjson-3.9.10-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:7ec960b1b942ee3c69323b8721df2a3ce28ff40e7ca47873ae35bfafeb4555ca"},
{file = "orjson-3.9.10-cp312-none-win_amd64.whl", hash = "sha256:3e892621434392199efb54e69edfff9f699f6cc36dd9553c5bf796058b14b20d"},
{file = "orjson-3.9.10-cp38-cp38-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:8b9ba0ccd5a7f4219e67fbbe25e6b4a46ceef783c42af7dbc1da548eb28b6531"},
{file = "orjson-3.9.10-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2e2ecd1d349e62e3960695214f40939bbfdcaeaaa62ccc638f8e651cf0970e5f"},
{file = "orjson-3.9.10-cp38-cp38-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:7f433be3b3f4c66016d5a20e5b4444ef833a1f802ced13a2d852c637f69729c1"},
{file = "orjson-3.9.10-cp38-cp38-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:4689270c35d4bb3102e103ac43c3f0b76b169760aff8bcf2d401a3e0e58cdb7f"},
{file = "orjson-3.9.10-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:4bd176f528a8151a6efc5359b853ba3cc0e82d4cd1fab9c1300c5d957dc8f48c"},
{file = "orjson-3.9.10-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3a2ce5ea4f71681623f04e2b7dadede3c7435dfb5e5e2d1d0ec25b35530e277b"},
{file = "orjson-3.9.10-cp38-cp38-musllinux_1_1_aarch64.whl", hash = "sha256:49f8ad582da6e8d2cf663c4ba5bf9f83cc052570a3a767487fec6af839b0e777"},
{file = "orjson-3.9.10-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:2a11b4b1a8415f105d989876a19b173f6cdc89ca13855ccc67c18efbd7cbd1f8"},
{file = "orjson-3.9.10-cp38-none-win32.whl", hash = "sha256:a353bf1f565ed27ba71a419b2cd3db9d6151da426b61b289b6ba1422a702e643"},
{file = "orjson-3.9.10-cp38-none-win_amd64.whl", hash = "sha256:e28a50b5be854e18d54f75ef1bb13e1abf4bc650ab9d635e4258c58e71eb6ad5"},
{file = "orjson-3.9.10-cp39-cp39-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:ee5926746232f627a3be1cc175b2cfad24d0170d520361f4ce3fa2fd83f09e1d"},
{file = "orjson-3.9.10-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0a73160e823151f33cdc05fe2cea557c5ef12fdf276ce29bb4f1c571c8368a60"},
{file = "orjson-3.9.10-cp39-cp39-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:c338ed69ad0b8f8f8920c13f529889fe0771abbb46550013e3c3d01e5174deef"},
{file = "orjson-3.9.10-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:5869e8e130e99687d9e4be835116c4ebd83ca92e52e55810962446d841aba8de"},
{file = "orjson-3.9.10-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:d2c1e559d96a7f94a4f581e2a32d6d610df5840881a8cba8f25e446f4d792df3"},
{file = "orjson-3.9.10-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:81a3a3a72c9811b56adf8bcc829b010163bb2fc308877e50e9910c9357e78521"},
{file = "orjson-3.9.10-cp39-cp39-musllinux_1_1_aarch64.whl", hash = "sha256:7f8fb7f5ecf4f6355683ac6881fd64b5bb2b8a60e3ccde6ff799e48791d8f864"},
{file = "orjson-3.9.10-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:c943b35ecdf7123b2d81d225397efddf0bce2e81db2f3ae633ead38e85cd5ade"},
{file = "orjson-3.9.10-cp39-none-win32.whl", hash = "sha256:fb0b361d73f6b8eeceba47cd37070b5e6c9de5beaeaa63a1cb35c7e1a73ef088"},
{file = "orjson-3.9.10-cp39-none-win_amd64.whl", hash = "sha256:b90f340cb6397ec7a854157fac03f0c82b744abdd1c0941a024c3c29d1340aff"},
{file = "orjson-3.9.10.tar.gz", hash = "sha256:9ebbdbd6a046c304b1845e96fbcc5559cd296b4dfd3ad2509e33c4d9ce07d6a1"},
{file = "orjson-3.9.15-cp310-cp310-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:d61f7ce4727a9fa7680cd6f3986b0e2c732639f46a5e0156e550e35258aa313a"},
{file = "orjson-3.9.15-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:4feeb41882e8aa17634b589533baafdceb387e01e117b1ec65534ec724023d04"},
{file = "orjson-3.9.15-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:fbbeb3c9b2edb5fd044b2a070f127a0ac456ffd079cb82746fc84af01ef021a4"},
{file = "orjson-3.9.15-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:b66bcc5670e8a6b78f0313bcb74774c8291f6f8aeef10fe70e910b8040f3ab75"},
{file = "orjson-3.9.15-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:2973474811db7b35c30248d1129c64fd2bdf40d57d84beed2a9a379a6f57d0ab"},
{file = "orjson-3.9.15-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9fe41b6f72f52d3da4db524c8653e46243c8c92df826ab5ffaece2dba9cccd58"},
{file = "orjson-3.9.15-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:4228aace81781cc9d05a3ec3a6d2673a1ad0d8725b4e915f1089803e9efd2b99"},
{file = "orjson-3.9.15-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:6f7b65bfaf69493c73423ce9db66cfe9138b2f9ef62897486417a8fcb0a92bfe"},
{file = "orjson-3.9.15-cp310-none-win32.whl", hash = "sha256:2d99e3c4c13a7b0fb3792cc04c2829c9db07838fb6973e578b85c1745e7d0ce7"},
{file = "orjson-3.9.15-cp310-none-win_amd64.whl", hash = "sha256:b725da33e6e58e4a5d27958568484aa766e825e93aa20c26c91168be58e08cbb"},
{file = "orjson-3.9.15-cp311-cp311-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:c8e8fe01e435005d4421f183038fc70ca85d2c1e490f51fb972db92af6e047c2"},
{file = "orjson-3.9.15-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:87f1097acb569dde17f246faa268759a71a2cb8c96dd392cd25c668b104cad2f"},
{file = "orjson-3.9.15-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:ff0f9913d82e1d1fadbd976424c316fbc4d9c525c81d047bbdd16bd27dd98cfc"},
{file = "orjson-3.9.15-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:8055ec598605b0077e29652ccfe9372247474375e0e3f5775c91d9434e12d6b1"},
{file = "orjson-3.9.15-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:d6768a327ea1ba44c9114dba5fdda4a214bdb70129065cd0807eb5f010bfcbb5"},
{file = "orjson-3.9.15-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:12365576039b1a5a47df01aadb353b68223da413e2e7f98c02403061aad34bde"},
{file = "orjson-3.9.15-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:71c6b009d431b3839d7c14c3af86788b3cfac41e969e3e1c22f8a6ea13139404"},
{file = "orjson-3.9.15-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:e18668f1bd39e69b7fed19fa7cd1cd110a121ec25439328b5c89934e6d30d357"},
{file = "orjson-3.9.15-cp311-none-win32.whl", hash = "sha256:62482873e0289cf7313461009bf62ac8b2e54bc6f00c6fabcde785709231a5d7"},
{file = "orjson-3.9.15-cp311-none-win_amd64.whl", hash = "sha256:b3d336ed75d17c7b1af233a6561cf421dee41d9204aa3cfcc6c9c65cd5bb69a8"},
{file = "orjson-3.9.15-cp312-cp312-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:82425dd5c7bd3adfe4e94c78e27e2fa02971750c2b7ffba648b0f5d5cc016a73"},
{file = "orjson-3.9.15-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2c51378d4a8255b2e7c1e5cc430644f0939539deddfa77f6fac7b56a9784160a"},
{file = "orjson-3.9.15-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:6ae4e06be04dc00618247c4ae3f7c3e561d5bc19ab6941427f6d3722a0875ef7"},
{file = "orjson-3.9.15-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:bcef128f970bb63ecf9a65f7beafd9b55e3aaf0efc271a4154050fc15cdb386e"},
{file = "orjson-3.9.15-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:b72758f3ffc36ca566ba98a8e7f4f373b6c17c646ff8ad9b21ad10c29186f00d"},
{file = "orjson-3.9.15-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:10c57bc7b946cf2efa67ac55766e41764b66d40cbd9489041e637c1304400494"},
{file = "orjson-3.9.15-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:946c3a1ef25338e78107fba746f299f926db408d34553b4754e90a7de1d44068"},
{file = "orjson-3.9.15-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:2f256d03957075fcb5923410058982aea85455d035607486ccb847f095442bda"},
{file = "orjson-3.9.15-cp312-none-win_amd64.whl", hash = "sha256:5bb399e1b49db120653a31463b4a7b27cf2fbfe60469546baf681d1b39f4edf2"},
{file = "orjson-3.9.15-cp38-cp38-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:b17f0f14a9c0ba55ff6279a922d1932e24b13fc218a3e968ecdbf791b3682b25"},
{file = "orjson-3.9.15-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:7f6cbd8e6e446fb7e4ed5bac4661a29e43f38aeecbf60c4b900b825a353276a1"},
{file = "orjson-3.9.15-cp38-cp38-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:76bc6356d07c1d9f4b782813094d0caf1703b729d876ab6a676f3aaa9a47e37c"},
{file = "orjson-3.9.15-cp38-cp38-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:fdfa97090e2d6f73dced247a2f2d8004ac6449df6568f30e7fa1a045767c69a6"},
{file = "orjson-3.9.15-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:7413070a3e927e4207d00bd65f42d1b780fb0d32d7b1d951f6dc6ade318e1b5a"},
{file = "orjson-3.9.15-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9cf1596680ac1f01839dba32d496136bdd5d8ffb858c280fa82bbfeb173bdd40"},
{file = "orjson-3.9.15-cp38-cp38-musllinux_1_2_aarch64.whl", hash = "sha256:809d653c155e2cc4fd39ad69c08fdff7f4016c355ae4b88905219d3579e31eb7"},
{file = "orjson-3.9.15-cp38-cp38-musllinux_1_2_x86_64.whl", hash = "sha256:920fa5a0c5175ab14b9c78f6f820b75804fb4984423ee4c4f1e6d748f8b22bc1"},
{file = "orjson-3.9.15-cp38-none-win32.whl", hash = "sha256:2b5c0f532905e60cf22a511120e3719b85d9c25d0e1c2a8abb20c4dede3b05a5"},
{file = "orjson-3.9.15-cp38-none-win_amd64.whl", hash = "sha256:67384f588f7f8daf040114337d34a5188346e3fae6c38b6a19a2fe8c663a2f9b"},
{file = "orjson-3.9.15-cp39-cp39-macosx_10_15_x86_64.macosx_11_0_arm64.macosx_10_15_universal2.whl", hash = "sha256:6fc2fe4647927070df3d93f561d7e588a38865ea0040027662e3e541d592811e"},
{file = "orjson-3.9.15-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:34cbcd216e7af5270f2ffa63a963346845eb71e174ea530867b7443892d77180"},
{file = "orjson-3.9.15-cp39-cp39-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:f541587f5c558abd93cb0de491ce99a9ef8d1ae29dd6ab4dbb5a13281ae04cbd"},
{file = "orjson-3.9.15-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:92255879280ef9c3c0bcb327c5a1b8ed694c290d61a6a532458264f887f052cb"},
{file = "orjson-3.9.15-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:05a1f57fb601c426635fcae9ddbe90dfc1ed42245eb4c75e4960440cac667262"},
{file = "orjson-3.9.15-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ede0bde16cc6e9b96633df1631fbcd66491d1063667f260a4f2386a098393790"},
{file = "orjson-3.9.15-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:e88b97ef13910e5f87bcbc4dd7979a7de9ba8702b54d3204ac587e83639c0c2b"},
{file = "orjson-3.9.15-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:57d5d8cf9c27f7ef6bc56a5925c7fbc76b61288ab674eb352c26ac780caa5b10"},
{file = "orjson-3.9.15-cp39-none-win32.whl", hash = "sha256:001f4eb0ecd8e9ebd295722d0cbedf0748680fb9998d3993abaed2f40587257a"},
{file = "orjson-3.9.15-cp39-none-win_amd64.whl", hash = "sha256:ea0b183a5fe6b2b45f3b854b0d19c4e932d6f5934ae1f723b07cf9560edd4ec7"},
{file = "orjson-3.9.15.tar.gz", hash = "sha256:95cae920959d772f30ab36d3b25f83bb0f3be671e986c72ce22f8fa700dae061"},
]
[[package]]
@@ -5298,8 +5278,6 @@ files = [
{file = "psycopg2-2.9.9-cp310-cp310-win_amd64.whl", hash = "sha256:426f9f29bde126913a20a96ff8ce7d73fd8a216cfb323b1f04da402d452853c3"},
{file = "psycopg2-2.9.9-cp311-cp311-win32.whl", hash = "sha256:ade01303ccf7ae12c356a5e10911c9e1c51136003a9a1d92f7aa9d010fb98372"},
{file = "psycopg2-2.9.9-cp311-cp311-win_amd64.whl", hash = "sha256:121081ea2e76729acfb0673ff33755e8703d45e926e416cb59bae3a86c6a4981"},
{file = "psycopg2-2.9.9-cp312-cp312-win32.whl", hash = "sha256:d735786acc7dd25815e89cc4ad529a43af779db2e25aa7c626de864127e5a024"},
{file = "psycopg2-2.9.9-cp312-cp312-win_amd64.whl", hash = "sha256:a7653d00b732afb6fc597e29c50ad28087dcb4fbfb28e86092277a559ae4e693"},
{file = "psycopg2-2.9.9-cp37-cp37m-win32.whl", hash = "sha256:5e0d98cade4f0e0304d7d6f25bbfbc5bd186e07b38eac65379309c4ca3193efa"},
{file = "psycopg2-2.9.9-cp37-cp37m-win_amd64.whl", hash = "sha256:7e2dacf8b009a1c1e843b5213a87f7c544b2b042476ed7755be813eaf4e8347a"},
{file = "psycopg2-2.9.9-cp38-cp38-win32.whl", hash = "sha256:ff432630e510709564c01dafdbe996cb552e0b9f3f065eb89bdce5bd31fabf4c"},
@@ -5342,7 +5320,6 @@ files = [
{file = "psycopg2_binary-2.9.9-cp311-cp311-win32.whl", hash = "sha256:dc4926288b2a3e9fd7b50dc6a1909a13bbdadfc67d93f3374d984e56f885579d"},
{file = "psycopg2_binary-2.9.9-cp311-cp311-win_amd64.whl", hash = "sha256:b76bedd166805480ab069612119ea636f5ab8f8771e640ae103e05a4aae3e417"},
{file = "psycopg2_binary-2.9.9-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:8532fd6e6e2dc57bcb3bc90b079c60de896d2128c5d9d6f24a63875a95a088cf"},
{file = "psycopg2_binary-2.9.9-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:b0605eaed3eb239e87df0d5e3c6489daae3f7388d455d0c0b4df899519c6a38d"},
{file = "psycopg2_binary-2.9.9-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:8f8544b092a29a6ddd72f3556a9fcf249ec412e10ad28be6a0c0d948924f2212"},
{file = "psycopg2_binary-2.9.9-cp312-cp312-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:2d423c8d8a3c82d08fe8af900ad5b613ce3632a1249fd6a223941d0735fce493"},
{file = "psycopg2_binary-2.9.9-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:2e5afae772c00980525f6d6ecf7cbca55676296b580c0e6abb407f15f3706996"},
@@ -5351,8 +5328,6 @@ files = [
{file = "psycopg2_binary-2.9.9-cp312-cp312-musllinux_1_1_i686.whl", hash = "sha256:cb16c65dcb648d0a43a2521f2f0a2300f40639f6f8c1ecbc662141e4e3e1ee07"},
{file = "psycopg2_binary-2.9.9-cp312-cp312-musllinux_1_1_ppc64le.whl", hash = "sha256:911dda9c487075abd54e644ccdf5e5c16773470a6a5d3826fda76699410066fb"},
{file = "psycopg2_binary-2.9.9-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:57fede879f08d23c85140a360c6a77709113efd1c993923c59fde17aa27599fe"},
{file = "psycopg2_binary-2.9.9-cp312-cp312-win32.whl", hash = "sha256:64cf30263844fa208851ebb13b0732ce674d8ec6a0c86a4e160495d299ba3c93"},
{file = "psycopg2_binary-2.9.9-cp312-cp312-win_amd64.whl", hash = "sha256:81ff62668af011f9a48787564ab7eded4e9fb17a4a6a74af5ffa6a457400d2ab"},
{file = "psycopg2_binary-2.9.9-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:2293b001e319ab0d869d660a704942c9e2cce19745262a8aba2115ef41a0a42a"},
{file = "psycopg2_binary-2.9.9-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:03ef7df18daf2c4c07e2695e8cfd5ee7f748a1d54d802330985a78d2a5a6dca9"},
{file = "psycopg2_binary-2.9.9-cp37-cp37m-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:0a602ea5aff39bb9fac6308e9c9d82b9a35c2bf288e184a816002c9fae930b77"},
@@ -5825,7 +5800,6 @@ files = [
{file = "pymongo-4.5.0-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:6422b6763b016f2ef2beedded0e546d6aa6ba87910f9244d86e0ac7690f75c96"},
{file = "pymongo-4.5.0-cp312-cp312-win32.whl", hash = "sha256:77cfff95c1fafd09e940b3fdcb7b65f11442662fad611d0e69b4dd5d17a81c60"},
{file = "pymongo-4.5.0-cp312-cp312-win_amd64.whl", hash = "sha256:e57d859b972c75ee44ea2ef4758f12821243e99de814030f69a3decb2aa86807"},
{file = "pymongo-4.5.0-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:8443f3a8ab2d929efa761c6ebce39a6c1dca1c9ac186ebf11b62c8fe1aef53f4"},
{file = "pymongo-4.5.0-cp37-cp37m-manylinux1_i686.whl", hash = "sha256:2b0176f9233a5927084c79ff80b51bd70bfd57e4f3d564f50f80238e797f0c8a"},
{file = "pymongo-4.5.0-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:89b3f2da57a27913d15d2a07d58482f33d0a5b28abd20b8e643ab4d625e36257"},
{file = "pymongo-4.5.0-cp37-cp37m-manylinux2014_aarch64.whl", hash = "sha256:5caee7bd08c3d36ec54617832b44985bd70c4cbd77c5b313de6f7fce0bb34f93"},
@@ -6342,7 +6316,6 @@ files = [
{file = "PyYAML-6.0.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:69b023b2b4daa7548bcfbd4aa3da05b3a74b772db9e23b982788168117739938"},
{file = "PyYAML-6.0.1-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:81e0b275a9ecc9c0c0c07b4b90ba548307583c125f54d5b6946cfee6360c733d"},
{file = "PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ba336e390cd8e4d1739f42dfe9bb83a3cc2e80f567d8805e11b46f4a943f5515"},
{file = "PyYAML-6.0.1-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:326c013efe8048858a6d312ddd31d56e468118ad4cdeda36c719bf5bb6192290"},
{file = "PyYAML-6.0.1-cp310-cp310-win32.whl", hash = "sha256:bd4af7373a854424dabd882decdc5579653d7868b8fb26dc7d0e99f823aa5924"},
{file = "PyYAML-6.0.1-cp310-cp310-win_amd64.whl", hash = "sha256:fd1592b3fdf65fff2ad0004b5e363300ef59ced41c2e6b3a99d4089fa8c5435d"},
{file = "PyYAML-6.0.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:6965a7bc3cf88e5a1c3bd2e0b5c22f8d677dc88a455344035f03399034eb3007"},
@@ -6350,16 +6323,8 @@ files = [
{file = "PyYAML-6.0.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:42f8152b8dbc4fe7d96729ec2b99c7097d656dc1213a3229ca5383f973a5ed6d"},
{file = "PyYAML-6.0.1-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:062582fca9fabdd2c8b54a3ef1c978d786e0f6b3a1510e0ac93ef59e0ddae2bc"},
{file = "PyYAML-6.0.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d2b04aac4d386b172d5b9692e2d2da8de7bfb6c387fa4f801fbf6fb2e6ba4673"},
{file = "PyYAML-6.0.1-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:e7d73685e87afe9f3b36c799222440d6cf362062f78be1013661b00c5c6f678b"},
{file = "PyYAML-6.0.1-cp311-cp311-win32.whl", hash = "sha256:1635fd110e8d85d55237ab316b5b011de701ea0f29d07611174a1b42f1444741"},
{file = "PyYAML-6.0.1-cp311-cp311-win_amd64.whl", hash = "sha256:bf07ee2fef7014951eeb99f56f39c9bb4af143d8aa3c21b1677805985307da34"},
{file = "PyYAML-6.0.1-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:855fb52b0dc35af121542a76b9a84f8d1cd886ea97c84703eaa6d88e37a2ad28"},
{file = "PyYAML-6.0.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:40df9b996c2b73138957fe23a16a4f0ba614f4c0efce1e9406a184b6d07fa3a9"},
{file = "PyYAML-6.0.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a08c6f0fe150303c1c6b71ebcd7213c2858041a7e01975da3a99aed1e7a378ef"},
{file = "PyYAML-6.0.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6c22bec3fbe2524cde73d7ada88f6566758a8f7227bfbf93a408a9d86bcc12a0"},
{file = "PyYAML-6.0.1-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:8d4e9c88387b0f5c7d5f281e55304de64cf7f9c0021a3525bd3b1c542da3b0e4"},
{file = "PyYAML-6.0.1-cp312-cp312-win32.whl", hash = "sha256:d483d2cdf104e7c9fa60c544d92981f12ad66a457afae824d146093b8c294c54"},
{file = "PyYAML-6.0.1-cp312-cp312-win_amd64.whl", hash = "sha256:0d3304d8c0adc42be59c5f8a4d9e3d7379e6955ad754aa9d6ab7a398b59dd1df"},
{file = "PyYAML-6.0.1-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:50550eb667afee136e9a77d6dc71ae76a44df8b3e51e41b77f6de2932bfe0f47"},
{file = "PyYAML-6.0.1-cp36-cp36m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:1fe35611261b29bd1de0070f0b2f47cb6ff71fa6595c077e42bd0c419fa27b98"},
{file = "PyYAML-6.0.1-cp36-cp36m-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:704219a11b772aea0d8ecd7058d0082713c3562b4e271b849ad7dc4a5c90c13c"},
@@ -6376,7 +6341,6 @@ files = [
{file = "PyYAML-6.0.1-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a0cd17c15d3bb3fa06978b4e8958dcdc6e0174ccea823003a106c7d4d7899ac5"},
{file = "PyYAML-6.0.1-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:28c119d996beec18c05208a8bd78cbe4007878c6dd15091efb73a30e90539696"},
{file = "PyYAML-6.0.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7e07cbde391ba96ab58e532ff4803f79c4129397514e1413a7dc761ccd755735"},
{file = "PyYAML-6.0.1-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:49a183be227561de579b4a36efbb21b3eab9651dd81b1858589f796549873dd6"},
{file = "PyYAML-6.0.1-cp38-cp38-win32.whl", hash = "sha256:184c5108a2aca3c5b3d3bf9395d50893a7ab82a38004c8f61c258d4428e80206"},
{file = "PyYAML-6.0.1-cp38-cp38-win_amd64.whl", hash = "sha256:1e2722cc9fbb45d9b87631ac70924c11d3a401b2d7f410cc0e3bbf249f2dca62"},
{file = "PyYAML-6.0.1-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:9eb6caa9a297fc2c2fb8862bc5370d0303ddba53ba97e71f08023b6cd73d16a8"},
@@ -6384,7 +6348,6 @@ files = [
{file = "PyYAML-6.0.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5773183b6446b2c99bb77e77595dd486303b4faab2b086e7b17bc6bef28865f6"},
{file = "PyYAML-6.0.1-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:b786eecbdf8499b9ca1d697215862083bd6d2a99965554781d0d8d1ad31e13a0"},
{file = "PyYAML-6.0.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bc1bf2925a1ecd43da378f4db9e4f799775d6367bdb94671027b73b393a7c42c"},
{file = "PyYAML-6.0.1-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:04ac92ad1925b2cff1db0cfebffb6ffc43457495c9b3c39d3fcae417d7125dc5"},
{file = "PyYAML-6.0.1-cp39-cp39-win32.whl", hash = "sha256:faca3bdcf85b2fc05d06ff3fbc1f83e1391b3e724afa3feba7d13eeab355484c"},
{file = "PyYAML-6.0.1-cp39-cp39-win_amd64.whl", hash = "sha256:510c9deebc5c0225e8c96813043e62b680ba2f9c50a08d3724c7f28a747d1486"},
{file = "PyYAML-6.0.1.tar.gz", hash = "sha256:bfdf460b1736c775f2ba9f6a92bca30bc2095067b8a9d77876d1fad6cc3b4a43"},
@@ -9153,7 +9116,7 @@ testing = ["big-O", "jaraco.functools", "jaraco.itertools", "more-itertools", "p
[extras]
all = []
azure = ["azure-ai-formrecognizer", "azure-ai-textanalytics", "azure-ai-vision", "azure-cognitiveservices-speech", "azure-core", "azure-cosmos", "azure-identity", "azure-search-documents", "openai"]
azure = ["azure-ai-formrecognizer", "azure-ai-textanalytics", "azure-cognitiveservices-speech", "azure-core", "azure-cosmos", "azure-identity", "azure-search-documents", "openai"]
clarifai = ["clarifai"]
cli = ["typer"]
cohere = ["cohere"]
@@ -9169,4 +9132,4 @@ text-helpers = ["chardet"]
[metadata]
lock-version = "2.0"
python-versions = ">=3.8.1,<4.0"
content-hash = "a9b21852298f682bdb77689c6f31044a62407185fa3c23d4d65cb9dbbc4522ba"
content-hash = "095a661dc5f767d2a3c92541d66c2d3070e86fb455b98f313e6c9a40b699b4ef"

View File

@@ -1,6 +1,6 @@
[tool.poetry]
name = "langchain"
version = "0.1.9"
version = "0.1.10"
description = "Building applications with LLMs through composability"
authors = []
license = "MIT"
@@ -12,10 +12,10 @@ langchain-server = "langchain.server:main"
[tool.poetry.dependencies]
python = ">=3.8.1,<4.0"
langchain-core = ">=0.1.26,<0.2"
langchain-core = ">=0.1.28,<0.2"
langchain-text-splitters = ">=0.0.1,<0.1"
langchain-community = ">=0.0.21,<0.1"
langsmith = "^0.1.0"
langchain-community = ">=0.0.25,<0.1"
langsmith = "^0.1.14"
pydantic = ">=1,<3"
SQLAlchemy = ">=1.4,<3"
requests = "^2"
@@ -66,7 +66,6 @@ requests-toolbelt = {version = "^1.0.0", optional = true}
openlm = {version = "^0.0.5", optional = true}
scikit-learn = {version = "^1.2.2", optional = true}
azure-ai-formrecognizer = {version = "^3.2.1", optional = true}
-azure-ai-vision = {version = "^0.11.1b1", optional = true}
azure-cognitiveservices-speech = {version = "^1.28.0", optional = true}
py-trello = {version = "^0.19.0", optional = true}
bibtexparser = {version = "^1.4.0", optional = true}
@@ -223,7 +222,6 @@ azure = [
"openai",
"azure-core",
"azure-ai-formrecognizer",
"azure-ai-vision",
"azure-cognitiveservices-speech",
"azure-search-documents",
"azure-ai-textanalytics",

View File

@@ -0,0 +1,350 @@
"""Test Azure CosmosDB cache functionality.
Required to run this test:
- a recent 'pymongo' Python package available
- an Azure CosmosDB Mongo vCore instance
- one environment variable set:
export MONGODB_VCORE_URI="connection string for azure cosmos db mongo vCore"
"""
import os
import uuid
import pytest
from langchain_community.cache import AzureCosmosDBSemanticCache
from langchain_community.vectorstores.azure_cosmos_db import (
CosmosDBSimilarityType,
CosmosDBVectorSearchType,
)
from langchain_core.outputs import Generation
from langchain.globals import get_llm_cache, set_llm_cache
from tests.integration_tests.cache.fake_embeddings import (
FakeEmbeddings,
)
from tests.unit_tests.llms.fake_llm import FakeLLM
INDEX_NAME = "langchain-test-index"
NAMESPACE = "langchain_test_db.langchain_test_collection"
CONNECTION_STRING: str = os.environ.get("MONGODB_VCORE_URI", "")
DB_NAME, COLLECTION_NAME = NAMESPACE.split(".")
num_lists = 3
dimensions = 10
similarity_algorithm = CosmosDBSimilarityType.COS
kind = CosmosDBVectorSearchType.VECTOR_IVF
m = 16
ef_construction = 64
ef_search = 40
score_threshold = 0.1
def _has_env_vars() -> bool:
return all(["MONGODB_VCORE_URI" in os.environ])
def random_string() -> str:
return str(uuid.uuid4())


@pytest.mark.requires("pymongo")
@pytest.mark.skipif(
    not _has_env_vars(), reason="Missing Azure CosmosDB Mongo vCore env. vars"
)
def test_azure_cosmos_db_semantic_cache() -> None:
    set_llm_cache(
        AzureCosmosDBSemanticCache(
            cosmosdb_connection_string=CONNECTION_STRING,
            cosmosdb_client=None,
            embedding=FakeEmbeddings(),
            database_name=DB_NAME,
            collection_name=COLLECTION_NAME,
            num_lists=num_lists,
            similarity=similarity_algorithm,
            kind=kind,
            dimensions=dimensions,
            m=m,
            ef_construction=ef_construction,
            ef_search=ef_search,
            score_threshold=score_threshold,
        )
    )
    llm = FakeLLM()
    params = llm.dict()
    params["stop"] = None
    llm_string = str(sorted([(k, v) for k, v in params.items()]))
    get_llm_cache().update("foo", llm_string, [Generation(text="fizz")])
    # foo and bar will have the same embedding produced by FakeEmbeddings
    cache_output = get_llm_cache().lookup("bar", llm_string)
    assert cache_output == [Generation(text="fizz")]
    # clear the cache
    get_llm_cache().clear(llm_string=llm_string)


@pytest.mark.requires("pymongo")
@pytest.mark.skipif(
    not _has_env_vars(), reason="Missing Azure CosmosDB Mongo vCore env. vars"
)
def test_azure_cosmos_db_semantic_cache_inner_product() -> None:
    set_llm_cache(
        AzureCosmosDBSemanticCache(
            cosmosdb_connection_string=CONNECTION_STRING,
            cosmosdb_client=None,
            embedding=FakeEmbeddings(),
            database_name=DB_NAME,
            collection_name=COLLECTION_NAME,
            num_lists=num_lists,
            similarity=CosmosDBSimilarityType.IP,
            kind=kind,
            dimensions=dimensions,
            m=m,
            ef_construction=ef_construction,
            ef_search=ef_search,
            score_threshold=score_threshold,
        )
    )
    llm = FakeLLM()
    params = llm.dict()
    params["stop"] = None
    llm_string = str(sorted([(k, v) for k, v in params.items()]))
    get_llm_cache().update("foo", llm_string, [Generation(text="fizz")])
    # foo and bar will have the same embedding produced by FakeEmbeddings
    cache_output = get_llm_cache().lookup("bar", llm_string)
    assert cache_output == [Generation(text="fizz")]
    # clear the cache
    get_llm_cache().clear(llm_string=llm_string)


@pytest.mark.requires("pymongo")
@pytest.mark.skipif(
    not _has_env_vars(), reason="Missing Azure CosmosDB Mongo vCore env. vars"
)
def test_azure_cosmos_db_semantic_cache_multi() -> None:
    set_llm_cache(
        AzureCosmosDBSemanticCache(
            cosmosdb_connection_string=CONNECTION_STRING,
            cosmosdb_client=None,
            embedding=FakeEmbeddings(),
            database_name=DB_NAME,
            collection_name=COLLECTION_NAME,
            num_lists=num_lists,
            similarity=similarity_algorithm,
            kind=kind,
            dimensions=dimensions,
            m=m,
            ef_construction=ef_construction,
            ef_search=ef_search,
            score_threshold=score_threshold,
        )
    )
    llm = FakeLLM()
    params = llm.dict()
    params["stop"] = None
    llm_string = str(sorted([(k, v) for k, v in params.items()]))
    get_llm_cache().update(
        "foo", llm_string, [Generation(text="fizz"), Generation(text="Buzz")]
    )
    # foo and bar will have the same embedding produced by FakeEmbeddings
    cache_output = get_llm_cache().lookup("bar", llm_string)
    assert cache_output == [Generation(text="fizz"), Generation(text="Buzz")]
    # clear the cache
    get_llm_cache().clear(llm_string=llm_string)


@pytest.mark.requires("pymongo")
@pytest.mark.skipif(
    not _has_env_vars(), reason="Missing Azure CosmosDB Mongo vCore env. vars"
)
def test_azure_cosmos_db_semantic_cache_multi_inner_product() -> None:
    set_llm_cache(
        AzureCosmosDBSemanticCache(
            cosmosdb_connection_string=CONNECTION_STRING,
            cosmosdb_client=None,
            embedding=FakeEmbeddings(),
            database_name=DB_NAME,
            collection_name=COLLECTION_NAME,
            num_lists=num_lists,
            similarity=CosmosDBSimilarityType.IP,
            kind=kind,
            dimensions=dimensions,
            m=m,
            ef_construction=ef_construction,
            ef_search=ef_search,
            score_threshold=score_threshold,
        )
    )
    llm = FakeLLM()
    params = llm.dict()
    params["stop"] = None
    llm_string = str(sorted([(k, v) for k, v in params.items()]))
    get_llm_cache().update(
        "foo", llm_string, [Generation(text="fizz"), Generation(text="Buzz")]
    )
    # foo and bar will have the same embedding produced by FakeEmbeddings
    cache_output = get_llm_cache().lookup("bar", llm_string)
    assert cache_output == [Generation(text="fizz"), Generation(text="Buzz")]
    # clear the cache
    get_llm_cache().clear(llm_string=llm_string)


@pytest.mark.requires("pymongo")
@pytest.mark.skipif(
    not _has_env_vars(), reason="Missing Azure CosmosDB Mongo vCore env. vars"
)
def test_azure_cosmos_db_semantic_cache_hnsw() -> None:
    set_llm_cache(
        AzureCosmosDBSemanticCache(
            cosmosdb_connection_string=CONNECTION_STRING,
            cosmosdb_client=None,
            embedding=FakeEmbeddings(),
            database_name=DB_NAME,
            collection_name=COLLECTION_NAME,
            num_lists=num_lists,
            similarity=similarity_algorithm,
            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
            dimensions=dimensions,
            m=m,
            ef_construction=ef_construction,
            ef_search=ef_search,
            score_threshold=score_threshold,
        )
    )
    llm = FakeLLM()
    params = llm.dict()
    params["stop"] = None
    llm_string = str(sorted([(k, v) for k, v in params.items()]))
    get_llm_cache().update("foo", llm_string, [Generation(text="fizz")])
    # foo and bar will have the same embedding produced by FakeEmbeddings
    cache_output = get_llm_cache().lookup("bar", llm_string)
    assert cache_output == [Generation(text="fizz")]
    # clear the cache
    get_llm_cache().clear(llm_string=llm_string)


@pytest.mark.requires("pymongo")
@pytest.mark.skipif(
    not _has_env_vars(), reason="Missing Azure CosmosDB Mongo vCore env. vars"
)
def test_azure_cosmos_db_semantic_cache_inner_product_hnsw() -> None:
    set_llm_cache(
        AzureCosmosDBSemanticCache(
            cosmosdb_connection_string=CONNECTION_STRING,
            cosmosdb_client=None,
            embedding=FakeEmbeddings(),
            database_name=DB_NAME,
            collection_name=COLLECTION_NAME,
            num_lists=num_lists,
            similarity=CosmosDBSimilarityType.IP,
            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
            dimensions=dimensions,
            m=m,
            ef_construction=ef_construction,
            ef_search=ef_search,
            score_threshold=score_threshold,
        )
    )
    llm = FakeLLM()
    params = llm.dict()
    params["stop"] = None
    llm_string = str(sorted([(k, v) for k, v in params.items()]))
    get_llm_cache().update("foo", llm_string, [Generation(text="fizz")])
    # foo and bar will have the same embedding produced by FakeEmbeddings
    cache_output = get_llm_cache().lookup("bar", llm_string)
    assert cache_output == [Generation(text="fizz")]
    # clear the cache
    get_llm_cache().clear(llm_string=llm_string)


@pytest.mark.requires("pymongo")
@pytest.mark.skipif(
    not _has_env_vars(), reason="Missing Azure CosmosDB Mongo vCore env. vars"
)
def test_azure_cosmos_db_semantic_cache_multi_hnsw() -> None:
    set_llm_cache(
        AzureCosmosDBSemanticCache(
            cosmosdb_connection_string=CONNECTION_STRING,
            cosmosdb_client=None,
            embedding=FakeEmbeddings(),
            database_name=DB_NAME,
            collection_name=COLLECTION_NAME,
            num_lists=num_lists,
            similarity=similarity_algorithm,
            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
            dimensions=dimensions,
            m=m,
            ef_construction=ef_construction,
            ef_search=ef_search,
            score_threshold=score_threshold,
        )
    )
    llm = FakeLLM()
    params = llm.dict()
    params["stop"] = None
    llm_string = str(sorted([(k, v) for k, v in params.items()]))
    get_llm_cache().update(
        "foo", llm_string, [Generation(text="fizz"), Generation(text="Buzz")]
    )
    # foo and bar will have the same embedding produced by FakeEmbeddings
    cache_output = get_llm_cache().lookup("bar", llm_string)
    assert cache_output == [Generation(text="fizz"), Generation(text="Buzz")]
    # clear the cache
    get_llm_cache().clear(llm_string=llm_string)


@pytest.mark.requires("pymongo")
@pytest.mark.skipif(
    not _has_env_vars(), reason="Missing Azure CosmosDB Mongo vCore env. vars"
)
def test_azure_cosmos_db_semantic_cache_multi_inner_product_hnsw() -> None:
    set_llm_cache(
        AzureCosmosDBSemanticCache(
            cosmosdb_connection_string=CONNECTION_STRING,
            cosmosdb_client=None,
            embedding=FakeEmbeddings(),
            database_name=DB_NAME,
            collection_name=COLLECTION_NAME,
            num_lists=num_lists,
            similarity=CosmosDBSimilarityType.IP,
            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
            dimensions=dimensions,
            m=m,
            ef_construction=ef_construction,
            ef_search=ef_search,
            score_threshold=score_threshold,
        )
    )
    llm = FakeLLM()
    params = llm.dict()
    params["stop"] = None
    llm_string = str(sorted([(k, v) for k, v in params.items()]))
    get_llm_cache().update(
        "foo", llm_string, [Generation(text="fizz"), Generation(text="Buzz")]
    )
    # foo and bar will have the same embedding produced by FakeEmbeddings
    cache_output = get_llm_cache().lookup("bar", llm_string)
    assert cache_output == [Generation(text="fizz"), Generation(text="Buzz")]
    # clear the cache
    get_llm_cache().clear(llm_string=llm_string)
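
For readers wiring the new cache into an application rather than a test, here is a minimal sketch built only from the constructor arguments exercised above. It is not part of this PR: the database and collection names are placeholders, and OpenAIEmbeddings (with its 1536-dimensional vectors) stands in for whatever embedding model is actually used.

import os

from langchain.globals import set_llm_cache
from langchain_community.cache import AzureCosmosDBSemanticCache
from langchain_community.vectorstores.azure_cosmos_db import (
    CosmosDBSimilarityType,
    CosmosDBVectorSearchType,
)
from langchain_openai import OpenAIEmbeddings  # assumed embedding provider

set_llm_cache(
    AzureCosmosDBSemanticCache(
        cosmosdb_connection_string=os.environ["MONGODB_VCORE_URI"],
        cosmosdb_client=None,
        embedding=OpenAIEmbeddings(),
        database_name="my_app_db",  # placeholder database name
        collection_name="llm_semantic_cache",  # placeholder collection name
        num_lists=3,
        similarity=CosmosDBSimilarityType.COS,
        kind=CosmosDBVectorSearchType.VECTOR_IVF,
        dimensions=1536,  # must match the embedding model's output size
        m=16,
        ef_construction=64,
        ef_search=40,
        score_threshold=0.1,
    )
)

LLM calls made after this point go through the Cosmos DB-backed semantic cache, which is the same behavior the FakeLLM-based tests above exercise.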

View File

@@ -176,9 +176,9 @@ class TritonTensorRTLLM(BaseLLM):
        result_queue = self._invoke_triton(self.model_name, inputs, outputs, stop_words)
        for token in result_queue:
-            yield GenerationChunk(text=token)
            if run_manager:
                run_manager.on_llm_new_token(token)
+            yield GenerationChunk(text=token)
        self.client.stop_stream()
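
The hunk above only swaps the order of two existing lines: on_llm_new_token now fires before the chunk is yielded, so callback handlers are told about every streamed token even if the consumer stops iterating right after receiving it. Below is a minimal, self-contained sketch of that pattern; it is not LangChain code, and the names stream_tokens and on_new_token are illustrative.

from typing import Callable, Iterator, List, Optional


def stream_tokens(
    tokens: List[str],
    on_new_token: Optional[Callable[[str], None]] = None,
) -> Iterator[str]:
    """Yield tokens, reporting each one to the callback first."""
    for token in tokens:
        if on_new_token:
            on_new_token(token)  # notify the handler before handing the token over
        yield token


# Even though the consumer breaks after the first token, the callback has
# already been invoked for it; with the old yield-then-callback order it
# would have been missed.
seen: List[str] = []
for tok in stream_tokens(["Hel", "lo"], on_new_token=seen.append):
    break
assert seen == ["Hel"]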