Compare commits


65 Commits

Author SHA1 Message Date
Erick Friis
e18fc8bc16 x 2024-10-11 12:34:55 -04:00
Sir Qasim
2a1029c53c Update chatbot.ipynb (#27243)
Async invocation:
remove the `:` at the end of line 441 because there is no structured block after it.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-10-10 18:03:10 +00:00
Eugene Yurtsev
5b9b8fe80f core[patch]: Ignore ASYNC110 to upgrade to newest ruff version (#27229)
Ignoring ASYNC110 with explanation
2024-10-09 11:25:58 -04:00
Vittorio Rigamonti
7da2efd9d3 community[minor]: VectorStore Infinispan. Adding TLS and authentication (#23522)
**Description**:
this PR enables VectorStore TLS and authentication (digest, basic) with
HTTP/2 for the Infinispan server.
Based on httpx.

Added docker-compose facilities for testing
Added documentation

**Dependencies:**
requires `pip install httpx[http2]` if HTTP2 is needed

**Twitter handle:**
https://twitter.com/infinispan
2024-10-09 10:51:39 -04:00
Luke Jang
ff925d2ddc docs: fixed broken API reference link for StructuredTool.from_function (#27181)
Fix broken API reference link for StructuredTool.from_function
2024-10-09 10:05:22 -04:00
Diao Zihao
4553573acb core[patch],langchain[patch],community[patch]: Bump version dependency of tenacity to >=8.1.0,!=8.4.0,<10 (#27201)
This should fix the compatibility issue with graphrag, as discussed in

- https://github.com/langchain-ai/langchain/discussions/25595

Here are the release notes for tenacity 9
(https://github.com/jd/tenacity/releases/tag/9.0.0)

---------

Signed-off-by: Zihao Diao <hi@ericdiao.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-09 14:00:45 +00:00
Stefano Lottini
d05fdd97dd community: Cassandra Vector Store: extend metadata-related methods (#27078)
**Description:** this PR adds a set of methods to deal with metadata
associated to the vector store entries. These, while essential to the
Graph-related extension of the `Cassandra` vector store, are also useful
in themselves. These are (all come in their sync+async versions):

- `[a]delete_by_metadata_filter`
- `[a]replace_metadata`
- `[a]get_by_document_id`
- `[a]metadata_search`

Additionally, a `[a]similarity_search_with_embedding_id_by_vector`
method is introduced to better serve the store's internal working (esp.
related to reranking logic).
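
For orientation, a minimal, hedged sketch of how these methods might be called; the method names come from this PR, while the constructor arguments and parameter names (`filter`, `n`, `document_id`) are assumptions rather than exact signatures:

```python
# Hypothetical usage sketch (argument names are assumptions, not the exact API).
from langchain_community.vectorstores import Cassandra

vstore = Cassandra(
    embedding=embeddings,      # assumed: an existing Embeddings instance
    session=session,           # assumed: a cassio / Cassandra session
    keyspace="demo_keyspace",
    table_name="demo_table",
)

# Search entries by metadata only.
docs = vstore.metadata_search(filter={"source": "wiki"}, n=5)

# Fetch a single entry by its document id.
doc = vstore.get_by_document_id(document_id="doc-123")

# Replace the metadata of selected documents.
vstore.replace_metadata({"doc-123": {"source": "wiki", "reviewed": "yes"}})

# Delete every entry matching a metadata filter.
vstore.delete_by_metadata_filter(filter={"source": "scratch"})
```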

**Issue:** no issue number, but all `Document`s returned now bear their
`.id` consistently (as a consequence of a slight refactoring in how the
raw entries read from the DB are made back into `Document` instances).

**Dependencies:** (no new deps: packaging comes through langchain-core
already; `cassio` is now required to be version 0.1.10+)


**Add tests and docs**
Added integration tests for the relevant newly-introduced methods.
(Docs will be updated in a separate PR).

**Lint and test** Lint and (updated) test all pass.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-09 06:41:34 +00:00
Erick Friis
84c05b031d community: release 0.3.2 (#27214) 2024-10-08 23:33:55 -07:00
Serena Ruan
a7c1ce2b3f [community] Add timeout control and retry for UC tool execution (#26645)
Add a client-side timeout for UCFunctionToolkit and add retry logic.
Users can set the environment variable
`UC_TOOL_CLIENT_EXECUTION_TIMEOUT` to increase the timeout used when
retrying to fetch the execution response while its status is pending. The
default timeout value is 120s.
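
A minimal sketch of raising that timeout via the environment variable named above (the 300-second value is purely illustrative):

```python
import os

# Raise the client-side timeout for UC tool execution from the 120s default.
# The variable name comes from this PR; the value here is illustrative.
os.environ["UC_TOOL_CLIENT_EXECUTION_TIMEOUT"] = "300"
```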


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

Tested in Databricks:
<img width="1200" alt="image"
src="https://github.com/user-attachments/assets/54ab5dfc-5e57-4941-b7d9-bfe3f8ad3f62">



- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Signed-off-by: serena-ruan <serena.rxy@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-09 06:31:48 +00:00
Tomaz Bratanic
481bd25d29 community: Fix database connections for neo4j (#27190)
Fixes https://github.com/langchain-ai/langchain/issues/27185

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 23:47:55 +00:00
Erick Friis
cedf4d9462 langchain: release 0.3.3 (#27213) 2024-10-08 16:39:42 -07:00
Jorge Piedrahita Ortiz
6c33124c72 docs: minor fix sambastudio chat model docs (#27212)
- **Description:**  minor fix sambastudio chat model docs

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 23:34:29 +00:00
Erick Friis
7264fb254c core: release 0.3.10 (#27209) 2024-10-08 16:21:42 -07:00
Bagatur
ce33c4fa40 openai[patch]: default temp=1 for o1 (#27206) 2024-10-08 15:45:21 -07:00
Mateusz Szewczyk
b298d0337e docs: Update IBM ChatWatsonx documentation (#27189) 2024-10-08 21:10:18 +00:00
RIdham Golakiya
73ad7f2e7a langchain_chroma[patch]: updated example for get documents with where clause (#26767)
Example updated for the ChromaDB vector store.

To apply multiple filters, ChromaDB supports combined filter clauses, as in
the hedged sketch below.
Reference: [ChromaDB filters](https://cookbook.chromadb.dev/core/filters/)
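
A hedged sketch of that kind of combined where-clause, using Chroma's documented `$and` and comparison operators (the embedding backend, field names, and values are made up for illustration):

```python
# Sketch: fetching documents from a Chroma vector store with multiple
# metadata conditions; Chroma requires an explicit $and wrapper.
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings  # assumed embedding backend

vectorstore = Chroma(embedding_function=OpenAIEmbeddings())

results = vectorstore.get(
    where={
        "$and": [
            {"source": {"$eq": "news"}},
            {"year": {"$gte": 2023}},
        ]
    }
)
```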

Thank you.
2024-10-08 20:21:58 +00:00
Bagatur
e3e9ee8398 core[patch]: utils for adding/subtracting usage metadata (#27203) 2024-10-08 13:15:33 -07:00
ccurme
e3920f2320 community[patch]: fix structured_output in llamacpp integration (#27202)
Resolves https://github.com/langchain-ai/langchain/issues/25318.
2024-10-08 15:16:59 -04:00
Leonid Ganeline
c3cb56a9e8 docs: integrations updates 18 (#27054)
Added missing provider pages. Added descriptions and links. Fixed
inconsistencies in text formatting.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 19:05:07 +00:00
Leonid Ganeline
b716d808ba docs: integrations/providers/microsoft update (#27055)
Added reference to the AzureCognitiveServicesToolkit.
Fixed titles.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 19:04:40 +00:00
Mathias Colpaert
feb4be82aa docs: in chatbot tutorial, make docs consistent with code sample (#27042)
**Docs Chatbot Tutorial**

The docs state that you can omit the language parameter, but the code
sample to demonstrate, still contains it.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 18:38:15 +00:00
Ikko Eltociear Ashimine
c10e1f70fe docs: update passio_nutrition_ai.ipynb (#27041)
initalize -> initialize


- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 18:35:48 +00:00
Erick Friis
b84e00283f standard-tests: test that only one chunk sets input_tokens (#27177) 2024-10-08 11:35:32 -07:00
Ajayeswar Reddy
9b7bdf1a26 Fixed typo in libs/community/langchain_community/storage/sql.py (#27029)
- [ ] **PR title**: docs: fix typo in SQLStore import path

- [ ] **PR message**: 
- **Description:** This PR corrects a typo in the docstrings for the
class `SQLStore(BaseStore[str, bytes])`. The import path in the docstring
currently reads `from langchain_rag.storage import SQLStore`, which should
be changed to `from langchain_community.storage import SQLStore`. This typo is
also reflected in the official documentation.
    - **Issue:** N/A
    - **Dependencies:** None
    - **Twitter handle:** N/A

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 17:51:26 +00:00
Nihal Chaudhary
0b36ed09cf DOC:Changed /docs/integrations/tools/jira/ (#27023)
- [x] - **Description:** replaced `%pip install -qU langchain-community`
with `%pip install -qU langchain-community langchain_openai` in the doc
`docs/docs/integrations/tools/jira.ipynb`
- [x] - **Issue:** the issue #27013 
- [x] Add docs

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 17:48:08 +00:00
Jacob Lee
0ec74fbc14 docs: 👥 Update LangChain people data (#27022)
👥 Update LangChain people data

---------

Co-authored-by: github-actions <github-actions@github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 17:09:07 +00:00
Leonid Ganeline
ea9a59bcf5 docs: integrations updates 17 (#27015)
Added missing provider pages. Added missing descriptions and links.
I fixed the Ipex-LLM titles, so the ToC is now sorted properly for these
titles.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 17:03:18 +00:00
Vadym Barda
8d27325dbc core[patch]: support ValidationError from pydantic v1 in tools (#27194) 2024-10-08 10:19:04 -04:00
Christophe Bornet
16f5fdb38b core: Add various ruff rules (#26836)
Adds
- ASYNC
- COM
- DJ
- EXE
- FLY
- FURB
- ICN
- INT
- LOG
- NPY
- PD
- Q
- RSE
- SLOT
- T10
- TID
- YTT

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-07 22:30:27 +00:00
Erick Friis
5c826faece core: update make format to fix all autofixable things (#27174) 2024-10-07 15:20:47 -07:00
Christophe Bornet
d31ec8810a core: Add ruff rules for error messages (EM) (#26965)
All auto-fixes

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-07 22:12:28 +00:00
Oleksii Pokotylo
37ca468d03 community: AzureSearch: fix reranking for empty lists (#27104)
**Description:** 
  Fix reranking for empty lists 

**Issue:** 
```
  File "langchain_community/vectorstores/azuresearch.py", line 1680, in _reorder_results_with_maximal_marginal_relevance
    documents, scores, vectors = map(list, zip(*docs))
ValueError: not enough values to unpack (expected 3, got 0)
```
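
An illustrative guard for the failure mode shown above (a sketch of the idea, not the literal patch applied to `azuresearch.py`):

```python
# Sketch: return early when no documents were retrieved, instead of letting
# zip(*docs) raise "not enough values to unpack".
def reorder_results_with_mmr(docs):
    if not docs:
        return []
    documents, scores, vectors = map(list, zip(*docs))
    # ... continue with maximal-marginal-relevance reranking ...
    return documents
```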

Co-authored-by: Oleksii Pokotylo <oleksii.pokotylo@pwc.com>
2024-10-07 15:27:09 -04:00
Bhadresh Savani
8454a742d7 Update README.md for Tutorial to Usecase url (#27099)
Fixed the tutorial URL in the README: it previously pointed to the use-case
page, which has no detail; it now points to the correct tutorials page.
2024-10-07 15:24:33 -04:00
Christophe Bornet
c4ebccfec2 core[minor]: Improve support for id in VectorStore (#26660)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-07 15:01:08 -04:00
Bharat Ramanathan
931ce8d026 core[patch]: Update AsyncCallbackManager to honor run_inline attribute and prevent context loss (#26885)
## Description

This PR fixes the context loss issue in `AsyncCallbackManager`,
specifically in `on_llm_start` and `on_chat_model_start` methods. It
properly honors the `run_inline` attribute of callback handlers,
preventing race conditions and ordering issues.

Key changes:
1. Separate handlers into inline and non-inline groups.
2. Execute inline handlers sequentially for each prompt.
3. Execute non-inline handlers concurrently across all prompts.
4. Preserve context for stateful handlers.
5. Maintain performance benefits for non-inline handlers.

**These changes are implemented in `AsyncCallbackManager` rather than
`ahandle_event` because the issue occurs at the prompt and message_list
levels, not within individual events.**
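
A simplified, self-contained sketch of the dispatch pattern described above (illustrative only, not the actual `AsyncCallbackManager` code): inline handlers run sequentially, everything else runs concurrently.

```python
import asyncio

# Illustrative sketch of the ordering pattern: handlers flagged run_inline are
# awaited one by one; all other handlers keep running concurrently.
async def dispatch(handlers, event, payload):
    inline = [h for h in handlers if getattr(h, "run_inline", False)]
    concurrent = [h for h in handlers if not getattr(h, "run_inline", False)]

    for handler in inline:            # sequential: preserves order and context
        await handler(event, payload)

    await asyncio.gather(             # concurrent: keeps the performance benefit
        *(handler(event, payload) for handler in concurrent)
    )
```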

## Testing

- Test case implemented in #26857 now passes, verifying execution order
for inline handlers.

## Related Issues

- Fixes issue discussed in #23909 

## Dependencies

No new dependencies are required.

---

@eyurtsev: This PR implements the discussed changes to respect
`run_inline` in `AsyncCallbackManager`. Please review and advise on any
needed changes.

Twitter handle: @parambharat

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-07 14:59:29 -04:00
Aleksandar Petrov
c61b9daef5 docs: Grammar fix in concepts.mdx (#27149)
Missing "is" in a sentence about the Tool usage.
2024-10-07 18:55:25 +00:00
Eugene Yurtsev
8f8392137a Update MIGRATE.md (#27169)
Upgrade the content of MIGRATE.md so it's in sync
2024-10-07 14:53:40 -04:00
João Carlos Ferra de Almeida
780ce00dea core[minor]: add **kwargs to index and aindex functions for custom vector_field support (#26998)
Added `**kwargs` parameters to the `index` and `aindex` functions in
`libs/core/langchain_core/indexing/api.py`. This allows users to pass
additional arguments to the `add_documents` and `aadd_documents`
methods, enabling the specification of a custom `vector_field`. For
example, users can now use `vector_field="embedding"` when indexing
documents in `OpenSearchVectorStore`
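
A hedged usage sketch of the new pass-through; `index` is the real `langchain_core` API, while the documents, record manager, vector store, and the `vector_field` value are placeholders:

```python
# Sketch: forwarding a custom vector_field through index() to add_documents().
from langchain_core.indexing import index

result = index(
    docs,                      # assumed: an iterable of Document objects
    record_manager,            # assumed: a RecordManager instance
    vector_store,              # assumed: e.g. an OpenSearchVectorStore
    cleanup="incremental",
    source_id_key="source",
    vector_field="embedding",  # forwarded to add_documents via **kwargs
)
```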

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-07 14:52:50 -04:00
Jorge Piedrahita Ortiz
14de81b140 community: sambastudio chat model (#27056)
**Description:** SambaStudio chat model integration added; previously
only the LLM integration existed.
Included docs and tests.

---------

Co-authored-by: luisfucros <luisfucros@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-10-07 14:31:39 -04:00
Aditya Anand
f70650f67d core[patch]: correct typo doc-string for astream_events method (#27108)
This commit addresses a typographical error in the documentation for the
async astream_events method. The word 'evens' was incorrectly used in
the introductory sentence for the reference table, which could lead to
confusion for users.

### Changes Made
- Corrected 'Below is a table that illustrates some evens that might be emitted by various chains.' to 'Below is a table that illustrates some events that might be emitted by various chains.'

This enhancement improves the clarity of the documentation and ensures
accurate terminology is used throughout the reference material.

Issue Reference: #27107
2024-10-07 14:12:42 -04:00
Bagatur
38099800cc docs: fix anthropic max_tokens docstring (#27166) 2024-10-07 16:51:42 +00:00
ogawa
07dd8dd3d7 community[patch]: update gpt-4o cost (#27038)
updated OpenAI cost definition according to the following:
https://openai.com/api/pricing/
2024-10-07 09:06:30 -04:00
Averi Kitsch
7a07196df6 docs: update Google Spanner Vector Store documentation (#27124)
Thank you for contributing to LangChain!

- [X] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** Update Spanner VS integration doc
    - **Issue:** None
    - **Dependencies:** None
    - **Twitter handle:** NA


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-04 23:59:10 +00:00
Bagatur
06ce5d1d5c anthropic[patch]: Release 0.2.3 (#27126) 2024-10-04 22:38:03 +00:00
Bagatur
0b8416bd2e anthropic[patch]: fix input_tokens when cached (#27125) 2024-10-04 22:35:51 +00:00
Erick Friis
64a16f2cf0 infra: add nvidia and astradb back to api build (#27115)
test build
https://github.com/langchain-ai/langchain/actions/runs/11185115845
2024-10-04 14:41:41 -07:00
Bagatur
bd5b335cb4 standard-tests[patch]: fix oai usage metadata test (#27122) 2024-10-04 20:00:48 +00:00
Bagatur
827bdf4f51 fireworks[patch]: Release 0.2.1 (#27120) 2024-10-04 18:59:15 +00:00
Bagatur
98942edcc9 openai[patch]: Release 0.2.2 (#27119) 2024-10-04 11:54:01 -07:00
Bagatur
414fe16071 anthropic[patch]: Release 0.2.2 (#27118) 2024-10-04 11:53:53 -07:00
Bagatur
11df1b2b8d core[patch]: Release 0.3.9 (#27117) 2024-10-04 18:35:33 +00:00
Scott Hurrey
558fb4d66d box: Add citation support to langchain_box.retrievers.BoxRetriever when used with Box AI (#27012)
Thank you for contributing to LangChain!

**Description:** Box AI can return responses, but it can also be
configured to return citations. This change allows the developer to
decide if they want the answer, the citations, or both. Regardless of
the combination, this is returned as a single List[Document] object.

**Dependencies:** Updated to the latest Box Python SDK, v1.5.1
**Twitter handle:** BoxPlatform
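
As a rough, hypothetical sketch of how a developer might opt into citations (the `answer` and `citations` flags here are assumptions for illustration, not confirmed parameter names):

```python
# Hypothetical sketch: asking Box AI for both the answer and its citations.
# The answer/citations flags are assumed parameter names, not confirmed API.
from langchain_box.retrievers import BoxRetriever

retriever = BoxRetriever(
    box_developer_token="<token>",   # placeholder credential
    box_file_ids=["12345"],          # placeholder file id
    answer=True,                     # assumption: include the Box AI answer
    citations=True,                  # assumption: include citation Documents
)
docs = retriever.invoke("What does the contract say about renewal terms?")
```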


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-04 18:32:34 +00:00
Bagatur
1e768a9ec7 anthropic[patch]: correctly handle tool msg with empty list (#27109) 2024-10-04 11:30:50 -07:00
Bagatur
4935a14314 core,integrations[minor]: Dont error on fields in model_kwargs (#27110)
Given the current erroring behavior, every time we've moved a kwarg out of
model_kwargs and made it its own field, that was a breaking change.
This updates the behavior to support the old instantiations /
serializations.

This assumes `build_extra_kwargs` is not itself being used externally
and so does not need to be kept backwards compatible.
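
As a hedged illustration of what this means in practice (the specific model class and the promoted field are placeholders for the general pattern):

```python
# Sketch: a field that has since been promoted out of model_kwargs no longer
# raises on the old-style instantiation; it is accepted instead.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o-mini",
    model_kwargs={"temperature": 0.2},  # old-style placement of a promoted field
)
```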
2024-10-04 11:30:27 -07:00
Bagatur
0495b7f441 anthropic[patch]: add usage_metadata details (#27087)
fixes https://github.com/langchain-ai/langchain/pull/27087
2024-10-04 08:46:49 -07:00
Erick Friis
e8e5d67a8d openai: fix None token detail (#27091)
happens in Azure
2024-10-04 01:25:38 +00:00
Vadym Barda
2715bed70e docs[patch]: update links w/ new langgraph API ref (#26961)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-03 23:52:01 +00:00
Rashmi Pawar
47142eb6ee docs: Integrations NVIDIA llm documentation (#26934)
**Description:**

Add Notebook for NVIDIA prompt completion llm class.

cc: @sumitkbh @mattf @dglogo

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-03 23:32:45 +00:00
Erick Friis
ab4dab9a0c core: fix batch race condition in FakeListChatModel (#26924)
fixed #26273
2024-10-03 23:14:31 +00:00
Bagatur
87fc5ce688 core[patch]: exclude model cache from ser (#27086) 2024-10-03 22:00:31 +00:00
Bagatur
c09da53978 openai[patch]: add usage metadata details (#27080) 2024-10-03 14:01:03 -07:00
Bagatur
546dc44da5 core[patch]: add UsageMetadata details (#27072) 2024-10-03 20:36:17 +00:00
Sean
cc1b8b3d30 docs: Documentation update for Document Parse (#26844)
Renamed `Layout Analysis` to `Document Parser` in the doc as we have
recently renamed it!

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-03 20:36:04 +00:00
Erick Friis
7f730ce8b2 docs: remove spaces in percent pip (#27082) 2024-10-03 20:34:24 +00:00
Tibor Reiss
47a9199fa6 community[patch]: Fix missing protected_namespaces (#27076)
Fixes #26861

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-03 20:12:11 +00:00
218 changed files with 8900 additions and 3539 deletions

View File

@@ -101,8 +101,8 @@ jobs:
mv langchain-google/libs/genai langchain/libs/partners/google-genai
mv langchain-google/libs/vertexai langchain/libs/partners/google-vertexai
mv langchain-google/libs/community langchain/libs/partners/google-community
# mv langchain-datastax/libs/astradb langchain/libs/partners/astradb
# mv langchain-nvidia/libs/ai-endpoints langchain/libs/partners/nvidia-ai-endpoints
mv langchain-datastax/libs/astradb langchain/libs/partners/astradb
mv langchain-nvidia/libs/ai-endpoints langchain/libs/partners/nvidia-ai-endpoints
mv langchain-cohere/libs/cohere langchain/libs/partners/cohere
mv langchain-elastic/libs/elasticsearch langchain/libs/partners/elasticsearch
mv langchain-postgres langchain/libs/partners/postgres

View File

@@ -1,70 +1,11 @@
# Migrating
## 🚨Breaking Changes for select chains (SQLDatabase) on 7/28/23
Please see the following guides for migrating LangChain code:
In an effort to make `langchain` leaner and safer, we are moving select chains to `langchain_experimental`.
This migration has already started, but we are remaining backwards compatible until 7/28.
On that date, we will remove functionality from `langchain`.
Read more about the motivation and the progress [here](https://github.com/langchain-ai/langchain/discussions/8043).
* Migrate to [LangChain v0.3](https://python.langchain.com/docs/versions/v0_3/)
* Migrate to [LangChain v0.2](https://python.langchain.com/docs/versions/v0_2/)
* Migrating from [LangChain 0.0.x Chains](https://python.langchain.com/docs/versions/migrating_chains/)
* Upgrade to [LangGraph Memory](https://python.langchain.com/docs/versions/migrating_memory/)
### Migrating to `langchain_experimental`
We are moving any experimental components of LangChain, or components with vulnerability issues, into `langchain_experimental`.
This guide covers how to migrate.
### Installation
Previously:
`pip install -U langchain`
Now (only if you want to access things in experimental):
`pip install -U langchain langchain_experimental`
### Things in `langchain.experimental`
Previously:
`from langchain.experimental import ...`
Now:
`from langchain_experimental import ...`
### PALChain
Previously:
`from langchain.chains import PALChain`
Now:
`from langchain_experimental.pal_chain import PALChain`
### SQLDatabaseChain
Previously:
`from langchain.chains import SQLDatabaseChain`
Now:
`from langchain_experimental.sql import SQLDatabaseChain`
Alternatively, if you are just interested in using the query generation part of the SQL chain, you can check out this [`SQL question-answering tutorial`](https://python.langchain.com/v0.2/docs/tutorials/sql_qa/#convert-question-to-sql-query)
`from langchain.chains import create_sql_query_chain`
### `load_prompt` for Python files
Note: this only applies if you want to load Python files as prompts.
If you want to load json/yaml files, no change is needed.
Previously:
`from langchain.prompts import load_prompt`
Now:
`from langchain_experimental.prompts import load_prompt`
The [LangChain CLI](https://python.langchain.com/docs/versions/v0_3/#migrate-using-langchain-cli) can help automatically upgrade your code to use non deprecated imports.
This will be especially helpful if you're still on either version 0.0.x or 0.1.x of LangChain.

View File

@@ -119,7 +119,7 @@ Agents allow an LLM autonomy over how a task is accomplished. Agents make decisi
Please see [here](https://python.langchain.com) for full documentation, which includes:
- [Introduction](https://python.langchain.com/docs/introduction/): Overview of the framework and the structure of the docs.
- [Tutorials](https://python.langchain.com/docs/use_cases/): If you're looking to build something specific or are more of a hands-on learner, check out our tutorials. This is the best place to get started.
- [Tutorials](https://python.langchain.com/docs/tutorials/): If you're looking to build something specific or are more of a hands-on learner, check out our tutorials. This is the best place to get started.
- [How-to guides](https://python.langchain.com/docs/how_to/): Answers to “How do I….?” type questions. These guides are goal-oriented and concrete; they're meant to help you complete a specific task.
- [Conceptual guide](https://python.langchain.com/docs/concepts/): Conceptual explanations of the key parts of the framework.
- [API Reference](https://api.python.langchain.com): Thorough documentation of every class and method.

File diff suppressed because it is too large

View File

@@ -611,7 +611,7 @@ Read more about [defining tools that return artifacts here](/docs/how_to/tool_ar
When designing tools to be used by a model, it is important to keep in mind that:
- Chat models that have explicit [tool-calling APIs](/docs/concepts/#functiontool-calling) will be better at tool calling than non-fine-tuned models.
- Models will perform better if the tools have well-chosen names, descriptions, and JSON schemas. This another form of prompt engineering.
- Models will perform better if the tools have well-chosen names, descriptions, and JSON schemas. This is another form of prompt engineering.
- Simple, narrowly scoped tools are easier for models to use than complex tools.
#### Related

View File

@@ -22,7 +22,7 @@
"2. LangChain [Runnables](/docs/concepts#runnable-interface);\n",
"3. By sub-classing from [BaseTool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.BaseTool.html) -- This is the most flexible method, it provides the largest degree of control, at the expense of more effort and code.\n",
"\n",
"Creating tools from functions may be sufficient for most use cases, and can be done via a simple [@tool decorator](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.tool.html#langchain_core.tools.tool). If more configuration is needed-- e.g., specification of both sync and async implementations-- one can also use the [StructuredTool.from_function](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.StructuredTool.html#langchain_core.tools.StructuredTool.from_function) class method.\n",
"Creating tools from functions may be sufficient for most use cases, and can be done via a simple [@tool decorator](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.tool.html#langchain_core.tools.tool). If more configuration is needed-- e.g., specification of both sync and async implementations-- one can also use the [StructuredTool.from_function](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.structured.StructuredTool.html#langchain_core.tools.structured.StructuredTool.from_function) class method.\n",
"\n",
"In this guide we provide an overview of these methods.\n",
"\n",

View File

@@ -36,7 +36,7 @@
"### Integration details\n",
"| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/docs/integrations/chat/openai) | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
"| ChatWatsonx | ❌ | ❌ | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-ibm?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-ibm?style=flat-square&label=%20) |\n",
"| ChatWatsonx | ❌ | ❌ | ❌ | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-ibm?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-ibm?style=flat-square&label=%20) |\n",
"\n",
"### Model features\n",
"| [Tool calling](/docs/how_to/tool_calling/) | [Structured output](/docs/how_to/structured_output/) | JSON mode | Image input | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",
@@ -126,21 +126,19 @@
"source": [
"## Instantiation\n",
"\n",
"You might need to adjust model `parameters` for different models or tasks. For details, refer to [Available MetaNames](https://ibm.github.io/watsonx-ai-python-sdk/fm_model.html#metanames.GenTextParamsMetaNames)."
"You might need to adjust model `parameters` for different models or tasks. For details, refer to [Available TextChatParameters](https://ibm.github.io/watsonx-ai-python-sdk/fm_schema.html#ibm_watsonx_ai.foundation_models.schema.TextChatParameters)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 5,
"id": "407cd500",
"metadata": {},
"outputs": [],
"source": [
"parameters = {\n",
" \"decoding_method\": \"sample\",\n",
" \"max_new_tokens\": 100,\n",
" \"min_new_tokens\": 1,\n",
" \"stop_sequences\": [\".\"],\n",
" \"temperature\": 0.9,\n",
" \"max_tokens\": 200,\n",
"}"
]
},
@@ -160,20 +158,20 @@
"In this example, well use the `project_id` and Dallas URL.\n",
"\n",
"\n",
"You need to specify the `model_id` that will be used for inferencing. You can find the list of all the available models in [Supported foundation models](https://ibm.github.io/watsonx-ai-python-sdk/fm_model.html#ibm_watsonx_ai.foundation_models.utils.enums.ModelTypes)."
"You need to specify the `model_id` that will be used for inferencing. You can find the list of all the available models in [Supported chat models](https://ibm.github.io/watsonx-ai-python-sdk/fm_helpers.html#ibm_watsonx_ai.foundation_models_manager.FoundationModelsManager.get_chat_model_specs)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "98371396",
"id": "e3568e91",
"metadata": {},
"outputs": [],
"source": [
"from langchain_ibm import ChatWatsonx\n",
"\n",
"chat = ChatWatsonx(\n",
" model_id=\"ibm/granite-13b-chat-v2\",\n",
" model_id=\"ibm/granite-34b-code-instruct\",\n",
" url=\"https://us-south.ml.cloud.ibm.com\",\n",
" project_id=\"PASTE YOUR PROJECT_ID HERE\",\n",
" params=parameters,\n",
@@ -196,7 +194,7 @@
"outputs": [],
"source": [
"chat = ChatWatsonx(\n",
" model_id=\"ibm/granite-13b-chat-v2\",\n",
" model_id=\"ibm/granite-34b-code-instruct\",\n",
" url=\"PASTE YOUR URL HERE\",\n",
" username=\"PASTE YOUR USERNAME HERE\",\n",
" password=\"PASTE YOUR PASSWORD HERE\",\n",
@@ -242,17 +240,17 @@
},
{
"cell_type": "code",
"execution_count": 22,
"execution_count": 8,
"id": "beea2b5b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"Je t'aime pour écouter la Rock.\", response_metadata={'token_usage': {'generated_token_count': 12, 'input_token_count': 28}, 'model_name': 'ibm/granite-13b-chat-v2', 'system_fingerprint': '', 'finish_reason': 'stop_sequence'}, id='run-05b305ce-5401-4a10-b557-41a4b15c7f6f-0')"
"AIMessage(content=\"J'adore que tu escois de écouter de la rock ! \", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 19, 'prompt_tokens': 34, 'total_tokens': 53}, 'model_name': 'ibm/granite-34b-code-instruct', 'system_fingerprint': '', 'finish_reason': 'stop'}, id='chat-ef888fc41f0d4b37903b622250ff7528', usage_metadata={'input_tokens': 34, 'output_tokens': 19, 'total_tokens': 53})"
]
},
"execution_count": 22,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
@@ -273,17 +271,17 @@
},
{
"cell_type": "code",
"execution_count": 41,
"execution_count": 9,
"id": "8ab1a25a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='Sure, I can help you with that! Horses are large, powerful mammals that belong to the family Equidae.', response_metadata={'token_usage': {'generated_token_count': 24, 'input_token_count': 24}, 'model_name': 'ibm/granite-13b-chat-v2', 'system_fingerprint': '', 'finish_reason': 'stop_sequence'}, id='run-391776ff-3b38-4768-91e8-ff64177149e5-0')"
"AIMessage(content='horses are quadrupedal mammals that are members of the family Equidae. They are typically farm animals, competing in horse racing and other forms of equine competition. With over 200 breeds, horses are diverse in their physical appearance and behavior. They are intelligent, social animals that are often used for transportation, food, and entertainment.', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 89, 'prompt_tokens': 29, 'total_tokens': 118}, 'model_name': 'ibm/granite-34b-code-instruct', 'system_fingerprint': '', 'finish_reason': 'stop'}, id='chat-9a6e28abb3d448aaa4f83b677a9fd653', usage_metadata={'input_tokens': 29, 'output_tokens': 89, 'total_tokens': 118})"
]
},
"execution_count": 41,
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
@@ -314,7 +312,7 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 10,
"id": "dd919925",
"metadata": {},
"outputs": [],
@@ -338,17 +336,17 @@
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": 11,
"id": "68160377",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='Ich liebe Python.', response_metadata={'token_usage': {'generated_token_count': 5, 'input_token_count': 23}, 'model_name': 'ibm/granite-13b-chat-v2', 'system_fingerprint': '', 'finish_reason': 'stop_sequence'}, id='run-1b1ccf5d-0e33-46f2-a087-e2a136ba1fb7-0')"
"AIMessage(content='Ich liebe Python.', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 7, 'prompt_tokens': 28, 'total_tokens': 35}, 'model_name': 'ibm/granite-34b-code-instruct', 'system_fingerprint': '', 'finish_reason': 'stop'}, id='chat-fef871190b6047a7a3e68c58b3810c33', usage_metadata={'input_tokens': 28, 'output_tokens': 7, 'total_tokens': 35})"
]
},
"execution_count": 18,
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
@@ -376,7 +374,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 12,
"id": "3f63166a",
"metadata": {},
"outputs": [
@@ -384,7 +382,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"The moon is a natural satellite of the Earth, and it has been a source of fascination for humans for centuries."
"The Moon is the fifth largest moon in the solar system and the largest relative to its host planet. It is the fifth brightest object in Earth's night sky after the Sun, the stars, the Milky Way, and the Moon itself. It orbits around the Earth at an average distance of 238,855 miles (384,400 kilometers). The Moon's gravity is about one-sixthth of Earth's and thus allows for the formation of tides on Earth. The Moon is thought to have formed around 4.5 billion years ago from debris from a collision between Earth and a Mars-sized body named Theia. The Moon is effectively immutable, with its current characteristics remaining from formation. Aside from Earth, the Moon is the only other natural satellite of Earth. The most widely accepted theory is that it formed from the debris of a collision"
]
}
],
@@ -410,18 +408,18 @@
},
{
"cell_type": "code",
"execution_count": 32,
"execution_count": 13,
"id": "9e948729",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[AIMessage(content='Cats are domestic animals that belong to the Felidae family.', response_metadata={'token_usage': {'generated_token_count': 13, 'input_token_count': 24}, 'model_name': 'ibm/granite-13b-chat-v2', 'system_fingerprint': '', 'finish_reason': 'stop_sequence'}, id='run-71a8bd7a-a1aa-497b-9bdd-a4d6fe1d471a-0'),\n",
" AIMessage(content='Dogs are domesticated mammals of the family Canidae, characterized by their adaptability to various environments and social structures.', response_metadata={'token_usage': {'generated_token_count': 24, 'input_token_count': 24}, 'model_name': 'ibm/granite-13b-chat-v2', 'system_fingerprint': '', 'finish_reason': 'stop_sequence'}, id='run-22b7a0cb-e44a-4b68-9921-872f82dcd82b-0')]"
"[AIMessage(content='The cat is a popular domesticated carnivorous mammal that belongs to the family Felidae. Cats arefriendly, intelligent, and independent animals that are well-known for their playful behavior, agility, and ability to hunt prey. cats come in a wide range of breeds, each with their own unique physical and behavioral characteristics. They are kept as pets worldwide due to their affectionate nature and companionship. Cats are important members of the household and are often involved in everything from childcare to entertainment.', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 127, 'prompt_tokens': 28, 'total_tokens': 155}, 'model_name': 'ibm/granite-34b-code-instruct', 'system_fingerprint': '', 'finish_reason': 'stop'}, id='chat-fa452af0a0fa4a668b6a704aecd7d718', usage_metadata={'input_tokens': 28, 'output_tokens': 127, 'total_tokens': 155}),\n",
" AIMessage(content='Dogs are domesticated animals that belong to the Canidae family, also known as wolves. They are one of the most popular pets worldwide, known for their loyalty and affection towards their owners. Dogs come in various breeds, each with unique characteristics, and are trained for different purposes such as hunting, herding, or guarding. They require a lot of exercise and mental stimulation to stay healthy and happy, and they need proper training and socialization to be well-behaved. Dogs are also known for their playful and energetic nature, making them great companions for people of all ages.', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 144, 'prompt_tokens': 28, 'total_tokens': 172}, 'model_name': 'ibm/granite-34b-code-instruct', 'system_fingerprint': '', 'finish_reason': 'stop'}, id='chat-cae7663c50cf4f3499726821cc2f0ec7', usage_metadata={'input_tokens': 28, 'output_tokens': 144, 'total_tokens': 172})]"
]
},
"execution_count": 32,
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
@@ -452,9 +450,7 @@
"\n",
"### ChatWatsonx.bind_tools()\n",
"\n",
"Please note that `ChatWatsonx.bind_tools` is on beta state, so right now we only support `mistralai/mixtral-8x7b-instruct-v01` model.\n",
"\n",
"You should also redefine `max_new_tokens` parameter to get the entire model response. By default `max_new_tokens` is set to 20."
"Please note that `ChatWatsonx.bind_tools` is on beta state, so we recommend using `mistralai/mistral-large` model."
]
},
{
@@ -466,10 +462,8 @@
"source": [
"from langchain_ibm import ChatWatsonx\n",
"\n",
"parameters = {\"max_new_tokens\": 200}\n",
"\n",
"chat = ChatWatsonx(\n",
" model_id=\"mistralai/mixtral-8x7b-instruct-v01\",\n",
" model_id=\"mistralai/mistral-large\",\n",
" url=\"https://us-south.ml.cloud.ibm.com\",\n",
" project_id=\"PASTE YOUR PROJECT_ID HERE\",\n",
" params=parameters,\n",
@@ -478,7 +472,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"id": "e1633a73",
"metadata": {},
"outputs": [],
@@ -497,17 +491,17 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"id": "3bf9b8ab",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='', additional_kwargs={'function_call': {'type': 'function'}, 'tool_calls': [{'type': 'function', 'function': {'name': 'GetWeather', 'arguments': '{\"location\": \"Los Angeles\"}'}, 'id': None}, {'type': 'function', 'function': {'name': 'GetWeather', 'arguments': '{\"location\": \"New York\"}'}, 'id': None}]}, response_metadata={'token_usage': {'generated_token_count': 99, 'input_token_count': 320}, 'model_name': 'mistralai/mixtral-8x7b-instruct-v01', 'system_fingerprint': '', 'finish_reason': 'eos_token'}, id='run-38627104-f2ac-4edb-8390-d5425fb65979-0', tool_calls=[{'name': 'GetWeather', 'args': {'location': 'Los Angeles'}, 'id': None}, {'name': 'GetWeather', 'args': {'location': 'New York'}, 'id': None}])"
"AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'chatcmpl-tool-6c06a19bbe824d78a322eb193dbde12d', 'type': 'function', 'function': {'name': 'GetWeather', 'arguments': '{\"location\": \"Los Angeles, CA\"}'}}, {'id': 'chatcmpl-tool-493542e46f1141bfbfeb5deae6c9e086', 'type': 'function', 'function': {'name': 'GetWeather', 'arguments': '{\"location\": \"New York, NY\"}'}}]}, response_metadata={'token_usage': {'completion_tokens': 46, 'prompt_tokens': 95, 'total_tokens': 141}, 'model_name': 'mistralai/mistral-large', 'system_fingerprint': '', 'finish_reason': 'tool_calls'}, id='chat-027f2bdb217e4238909cb26d3e8a8fbf', tool_calls=[{'name': 'GetWeather', 'args': {'location': 'Los Angeles, CA'}, 'id': 'chatcmpl-tool-6c06a19bbe824d78a322eb193dbde12d', 'type': 'tool_call'}, {'name': 'GetWeather', 'args': {'location': 'New York, NY'}, 'id': 'chatcmpl-tool-493542e46f1141bfbfeb5deae6c9e086', 'type': 'tool_call'}], usage_metadata={'input_tokens': 95, 'output_tokens': 46, 'total_tokens': 141})"
]
},
"execution_count": 4,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
@@ -530,18 +524,24 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 4,
"id": "38f10ba7",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'name': 'GetWeather', 'args': {'location': 'Los Angeles'}, 'id': None},\n",
" {'name': 'GetWeather', 'args': {'location': 'New York'}, 'id': None}]"
"[{'name': 'GetWeather',\n",
" 'args': {'location': 'Los Angeles, CA'},\n",
" 'id': 'chatcmpl-tool-6c06a19bbe824d78a322eb193dbde12d',\n",
" 'type': 'tool_call'},\n",
" {'name': 'GetWeather',\n",
" 'args': {'location': 'New York, NY'},\n",
" 'id': 'chatcmpl-tool-493542e46f1141bfbfeb5deae6c9e086',\n",
" 'type': 'tool_call'}]"
]
},
"execution_count": 5,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
@@ -567,7 +567,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
"version": "3.10.14"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,383 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {
"vscode": {
"languageId": "raw"
}
},
"source": [
"---\n",
"sidebar_label: SambaStudio\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# ChatSambaStudio\n",
"\n",
"This will help you getting started with SambaStudio [chat models](/docs/concepts/#chat-models). For detailed documentation of all ChatStudio features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/langchain_community.chat_models.sambanova.ChatSambaStudio.html).\n",
"\n",
"**[SambaNova](https://sambanova.ai/)'s** [SambaStudio](https://docs.sambanova.ai/sambastudio/latest/sambastudio-intro.html) SambaStudio is a rich, GUI-based platform that provides the functionality to train, deploy, and manage models in SambaNova [DataScale](https://sambanova.ai/products/datascale) systems.\n",
"\n",
"## Overview\n",
"### Integration details\n",
"\n",
"| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
"| [ChatSambaStudio](https://api.python.langchain.com/en/latest/chat_models/langchain_community.chat_models.sambanova.ChatSambaStudio.html) | [langchain-community](https://python.langchain.com/api_reference/community/index.html) | ❌ | ❌ | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain_community?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain_community?style=flat-square&label=%20) |\n",
"\n",
"### Model features\n",
"\n",
"| [Tool calling](/docs/how_to/tool_calling) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",
"| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n",
"| ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | \n",
"\n",
"## Setup\n",
"\n",
"To access ChatSambaStudio models you will need to [deploy an endpoint](https://docs.sambanova.ai/sambastudio/latest/language-models.html) in your SambaStudio platform, install the `langchain_community` integration package, and install the `SSEClient` Package.\n",
"\n",
"```bash\n",
"pip install langchain-community\n",
"pip install sseclient-py\n",
"```\n",
"\n",
"### Credentials\n",
"\n",
"Get the URL and API Key from your SambaStudio deployed endpoint and add them to your environment variables:\n",
"\n",
"``` bash\n",
"export SAMBASTUDIO_URL=\"your-api-key-here\"\n",
"export SAMBASTUDIO_API_KEY=\"your-api-key-here\"\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"if not os.getenv(\"SAMBASTUDIO_URL\"):\n",
" os.environ[\"SAMBASTUDIO_URL\"] = getpass.getpass(\"Enter your SambaStudio URL: \")\n",
"if not os.getenv(\"SAMBASTUDIO_API_KEY\"):\n",
" os.environ[\"SAMBASTUDIO_API_KEY\"] = getpass.getpass(\n",
" \"Enter your SambaStudio API key: \"\n",
" )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you want to get automated tracing of your model calls you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Installation\n",
"\n",
"The LangChain __SambaStudio__ integration lives in the `langchain_community` package:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install -qU langchain-community\n",
"%pip install -qu sseclient-py"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Instantiation\n",
"\n",
"Now we can instantiate our model object and generate chat completions:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.chat_models.sambanova import ChatSambaStudio\n",
"\n",
"llm = ChatSambaStudio(\n",
" model=\"Meta-Llama-3-70B-Instruct-4096\", # set if using a CoE endpoint\n",
" max_tokens=1024,\n",
" temperature=0.7,\n",
" top_k=1,\n",
" top_p=0.01,\n",
" do_sample=True,\n",
" process_prompt=\"True\", # set if using a CoE endpoint\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Invocation"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"J'adore la programmation.\", response_metadata={'id': 'item0', 'partial': False, 'value': {'completion': \"J'adore la programmation.\", 'logprobs': {'text_offset': [], 'top_logprobs': []}, 'prompt': '<|start_header_id|>system<|end_header_id|>\\n\\nYou are a helpful assistant that translates English to French. Translate the user sentence.<|eot_id|><|start_header_id|>user<|end_header_id|>\\n\\nI love programming.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\\n\\n', 'stop_reason': 'end_of_text', 'tokens': ['J', \"'\", 'ad', 'ore', ' la', ' programm', 'ation', '.'], 'total_tokens_count': 43}, 'params': {}, 'status': None}, id='item0')"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"messages = [\n",
" (\n",
" \"system\",\n",
" \"You are a helpful assistant that translates English to French. Translate the user sentence.\",\n",
" ),\n",
" (\"human\", \"I love programming.\"),\n",
"]\n",
"ai_msg = llm.invoke(messages)\n",
"ai_msg"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"J'adore la programmation.\n"
]
}
],
"source": [
"print(ai_msg.content)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Chaining\n",
"\n",
"We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='Ich liebe das Programmieren.', response_metadata={'id': 'item0', 'partial': False, 'value': {'completion': 'Ich liebe das Programmieren.', 'logprobs': {'text_offset': [], 'top_logprobs': []}, 'prompt': '<|start_header_id|>system<|end_header_id|>\\n\\nYou are a helpful assistant that translates English to German.<|eot_id|><|start_header_id|>user<|end_header_id|>\\n\\nI love programming.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\\n\\n', 'stop_reason': 'end_of_text', 'tokens': ['Ich', ' liebe', ' das', ' Programm', 'ieren', '.'], 'total_tokens_count': 36}, 'params': {}, 'status': None}, id='item0')"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"\n",
"prompt = ChatPromptTemplate(\n",
" [\n",
" (\n",
" \"system\",\n",
" \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n",
" ),\n",
" (\"human\", \"{input}\"),\n",
" ]\n",
")\n",
"\n",
"chain = prompt | llm\n",
"chain.invoke(\n",
" {\n",
" \"input_language\": \"English\",\n",
" \"output_language\": \"German\",\n",
" \"input\": \"I love programming.\",\n",
" }\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Streaming"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Arrr, ye landlubber! Ye be wantin' to learn about owls, eh? Well, matey, settle yerself down with a pint o' grog and listen close, for I be tellin' ye about these fascinatin' creatures o' the night!\n",
"\n",
"Owls be birds, but not just any birds, me hearty! They be nocturnal, meanin' they do their huntin' at night, when the rest o' the world be sleepin'. And they be experts at it, too! Their big, round eyes be designed for seein' in the dark, with a special reflective layer called the tapetum lucidum that helps 'em spot prey in the shadows. It's like havin' a built-in lantern, savvy?\n",
"\n",
"But that be not all, me matey! Owls also have acute hearin', which helps 'em pinpoint the slightest sounds in the dark. And their ears be asymmetrical, meanin' one ear be higher than the other, which gives 'em better depth perception. It's like havin' a built-in sonar system, arrr!\n",
"\n",
"Now, ye might be wonderin' how owls fly so silently, like ghosts in the night. Well, it be because o' their special feathers, me hearty! They have soft, fringed feathers on their wings that help reduce noise and turbulence, makin' 'em the sneakiest flyers on the seven seas... er, skies!\n",
"\n",
"Owls come in all shapes and sizes, from the tiny elf owl to the great grey owl, which be one o' the largest owl species in the world. And they be found on every continent, except Antarctica, o' course. They be solitary creatures, but some species be known to form long-term monogamous relationships, like the barn owl and its mate.\n",
"\n",
"So, there ye have it, me hearty! Owls be amazin' creatures, with their clever adaptations and stealthy ways. Now, go forth and spread the word about these magnificent birds o' the night! And remember, if ye ever encounter an owl in the wild, be sure to show respect and keep a weather eye open, or ye might just find yerself on the receivin' end o' a silent, flyin' tackle! Arrr!"
]
}
],
"source": [
"system = \"You are a helpful assistant with pirate accent.\"\n",
"human = \"I want to learn more about this animal: {animal}\"\n",
"prompt = ChatPromptTemplate.from_messages([(\"system\", system), (\"human\", human)])\n",
"\n",
"chain = prompt | llm\n",
"\n",
"for chunk in chain.stream({\"animal\": \"owl\"}):\n",
" print(chunk.content, end=\"\", flush=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Async"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='The capital of France is Paris.', response_metadata={'id': 'item0', 'partial': False, 'value': {'completion': 'The capital of France is Paris.', 'logprobs': {'text_offset': [], 'top_logprobs': []}, 'prompt': '<|start_header_id|>user<|end_header_id|>\\n\\nwhat is the capital of France?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\\n\\n', 'stop_reason': 'end_of_text', 'tokens': ['The', ' capital', ' of', ' France', ' is', ' Paris', '.'], 'total_tokens_count': 24}, 'params': {}, 'status': None}, id='item0')"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\n",
" \"human\",\n",
" \"what is the capital of {country}?\",\n",
" )\n",
" ]\n",
")\n",
"\n",
"chain = prompt | llm\n",
"await chain.ainvoke({\"country\": \"France\"})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Async Streaming"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Quantum computers use quantum bits (qubits) to process multiple possibilities simultaneously, exponentially faster than classical computers, enabling breakthroughs in fields like cryptography, optimization, and simulation."
]
}
],
"source": [
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\n",
" \"human\",\n",
" \"in less than {num_words} words explain me {topic} \",\n",
" )\n",
" ]\n",
")\n",
"chain = prompt | llm\n",
"\n",
"async for chunk in chain.astream({\"num_words\": 30, \"topic\": \"quantum computers\"}):\n",
" print(chunk.content, end=\"\", flush=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all ChatSambaStudio features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/langchain_community.chat_models.sambanova.ChatSambaStudio.html"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "langchain",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.19"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -29,7 +29,7 @@
"metadata": {},
"outputs": [],
"source": [
"% pip install -qU langchain-airbyte"
"%pip install -qU langchain-airbyte"
]
},
{

View File

@@ -26,7 +26,7 @@
"metadata": {},
"outputs": [],
"source": [
"% pip install browserbase"
"%pip install browserbase"
]
},
{

View File

@@ -25,9 +25,9 @@
}
},
"source": [
"# UpstageLayoutAnalysisLoader\n",
"# UpstageDocumentParseLoader\n",
"\n",
"This notebook covers how to get started with `UpstageLayoutAnalysisLoader`.\n",
"This notebook covers how to get started with `UpstageDocumentParseLoader`.\n",
"\n",
"## Installation\n",
"\n",
@@ -89,10 +89,10 @@
}
],
"source": [
"from langchain_upstage import UpstageLayoutAnalysisLoader\n",
"from langchain_upstage import UpstageDocumentParseLoader\n",
"\n",
"file_path = \"/PATH/TO/YOUR/FILE.pdf\"\n",
"layzer = UpstageLayoutAnalysisLoader(file_path, split=\"page\")\n",
"layzer = UpstageDocumentParseLoader(file_path, split=\"page\")\n",
"\n",
"# For improved memory efficiency, consider using the lazy_load method to load documents page by page.\n",
"docs = layzer.load() # or layzer.lazy_load()\n",

View File

@@ -0,0 +1,309 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# NVIDIA\n",
"\n",
"This will help you getting started with NVIDIA [models](/docs/concepts/#llms). For detailed documentation of all `NVIDIA` features and configurations head to the [API reference](https://python.langchain.com/api_reference/nvidia_ai_endpoints/llms/langchain_nvidia_ai_endpoints.chat_models.NVIDIA.html).\n",
"\n",
"## Overview\n",
"The `langchain-nvidia-ai-endpoints` package contains LangChain integrations building applications with models on \n",
"NVIDIA NIM inference microservice. These models are optimized by NVIDIA to deliver the best performance on NVIDIA \n",
"accelerated infrastructure and deployed as a NIM, an easy-to-use, prebuilt containers that deploy anywhere using a single \n",
"command on NVIDIA accelerated infrastructure.\n",
"\n",
"NVIDIA hosted deployments of NIMs are available to test on the [NVIDIA API catalog](https://build.nvidia.com/). After testing, \n",
"NIMs can be exported from NVIDIAs API catalog using the NVIDIA AI Enterprise license and run on-premises or in the cloud, \n",
"giving enterprises ownership and full control of their IP and AI application.\n",
"\n",
"NIMs are packaged as container images on a per model basis and are distributed as NGC container images through the NVIDIA NGC Catalog. \n",
"At their core, NIMs provide easy, consistent, and familiar APIs for running inference on an AI model.\n",
"\n",
"This example goes over how to use LangChain to interact with NVIDIA supported via the `NVIDIA` class.\n",
"\n",
"For more information on accessing the llm models through this api, check out the [NVIDIA](https://python.langchain.com/docs/integrations/llms/nvidia_ai_endpoints/) documentation.\n",
"\n",
"### Integration details\n",
"\n",
"| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
"| [NVIDIA](https://python.langchain.com/api_reference/nvidia_ai_endpoints/llms/langchain_nvidia_ai_endpoints.chat_models.ChatNVIDIA.html) | [langchain_nvidia_ai_endpoints](https://python.langchain.com/api_reference/nvidia_ai_endpoints/index.html) | ✅ | beta | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain_nvidia_ai_endpoints?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain_nvidia_ai_endpoints?style=flat-square&label=%20) |\n",
"\n",
"### Model features\n",
"| JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",
"| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n",
"| ❌ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | \n",
"\n",
"## Setup\n",
"\n",
"**To get started:**\n",
"\n",
"1. Create a free account with [NVIDIA](https://build.nvidia.com/), which hosts NVIDIA AI Foundation models.\n",
"\n",
"2. Click on your model of choice.\n",
"\n",
"3. Under `Input` select the `Python` tab, and click `Get API Key`. Then click `Generate Key`.\n",
"\n",
"4. Copy and save the generated key as `NVIDIA_API_KEY`. From there, you should have access to the endpoints.\n",
"\n",
"### Credentials\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"if not os.getenv(\"NVIDIA_API_KEY\"):\n",
" # Note: the API key should start with \"nvapi-\"\n",
" os.environ[\"NVIDIA_API_KEY\"] = getpass.getpass(\"Enter your NVIDIA API key: \")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Installation\n",
"\n",
"The LangChain NVIDIA AI Endpoints integration lives in the `langchain_nvidia_ai_endpoints` package:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install --upgrade --quiet langchain-nvidia-ai-endpoints"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Instantiation\n",
"\n",
"See [LLM](/docs/how_to#llms) for full functionality."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"from langchain_nvidia_ai_endpoints import NVIDIA"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm = NVIDIA().bind(max_tokens=256)\n",
"llm"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Invocation"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"prompt = \"# Function that does quicksort written in Rust without comments:\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(llm.invoke(prompt))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Stream, Batch, and Async\n",
"\n",
"These models natively support streaming, and as is the case with all LangChain LLMs they expose a batch method to handle concurrent requests, as well as async methods for invoke, stream, and batch. Below are a few examples."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for chunk in llm.stream(prompt):\n",
" print(chunk, end=\"\", flush=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm.batch([prompt])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"await llm.ainvoke(prompt)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"async for chunk in llm.astream(prompt):\n",
" print(chunk, end=\"\", flush=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"await llm.abatch([prompt])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"async for chunk in llm.astream_log(prompt):\n",
" print(chunk)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"response = llm.invoke(\n",
" \"X_train, y_train, X_test, y_test = train_test_split(X, y, test_size=0.1) #Train a logistic regression model, predict the labels on the test set and compute the accuracy score\"\n",
")\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Supported models\n",
"\n",
"Querying `available_models` will still give you all of the other models offered by your API credentials."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"NVIDIA.get_available_models()\n",
"# llm.get_available_models()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Chaining\n",
"\n",
"We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"\n",
"prompt = ChatPromptTemplate(\n",
" [\n",
" (\n",
" \"system\",\n",
" \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n",
" ),\n",
" (\"human\", \"{input}\"),\n",
" ]\n",
")\n",
"\n",
"chain = prompt | llm\n",
"chain.invoke(\n",
" {\n",
" \"input_language\": \"English\",\n",
" \"output_language\": \"German\",\n",
" \"input\": \"I love programming.\",\n",
" }\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all `NVIDIA` features and configurations head to the API reference: https://python.langchain.com/api_reference/nvidia_ai_endpoints/llms/langchain_nvidia_ai_endpoints.llms.NVIDIA.html"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "langchain-nvidia-ai-endpoints-m0-Y4aGr-py3.10",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -264,22 +264,20 @@ See a [usage example](/docs/integrations/document_loaders/url/#playwright-url-lo
from langchain_community.document_loaders.onenote import OneNoteLoader
```
## AI Agent Memory System
[AI agents](https://learn.microsoft.com/en-us/azure/cosmos-db/ai-agents) need robust memory systems that support multi-modality, offer strong operational performance, and enable agent memory sharing as well as separation.
## Vector Stores
### Azure Cosmos DB
AI agents can rely on Azure Cosmos DB as a unified [memory system](https://learn.microsoft.com/en-us/azure/cosmos-db/ai-agents#memory-can-make-or-break-agents) solution, enjoying speed, scale, and simplicity. This service successfully [enabled OpenAI's ChatGPT service](https://www.youtube.com/watch?v=6IIUtEFKJec&t) to scale dynamically with high reliability and low maintenance. Powered by an atom-record-sequence engine, it is the world's first globally distributed [NoSQL](https://learn.microsoft.com/en-us/azure/cosmos-db/distributed-nosql), [relational](https://learn.microsoft.com/en-us/azure/cosmos-db/distributed-relational), and [vector database](https://learn.microsoft.com/en-us/azure/cosmos-db/vector-database) service that offers a serverless mode.
Below are two available Azure Cosmos DB APIs that can provide vector store functionalities.
### Azure Cosmos DB for MongoDB (vCore)
#### Azure Cosmos DB for MongoDB (vCore)
>[Azure Cosmos DB for MongoDB vCore](https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/) makes it easy to create a database with full native MongoDB support.
> You can apply your MongoDB experience and continue to use your favorite MongoDB drivers, SDKs, and tools by pointing your application to the API for MongoDB vCore account's connection string.
> Use vector search in Azure Cosmos DB for MongoDB vCore to seamlessly integrate your AI-based applications with your data that's stored in Azure Cosmos DB.
#### Installation and Setup
##### Installation and Setup
See [detail configuration instructions](/docs/integrations/vectorstores/azure_cosmos_db).
@@ -289,7 +287,7 @@ We need to install `pymongo` python package.
pip install pymongo
```
#### Deploy Azure Cosmos DB on Microsoft Azure
##### Deploy Azure Cosmos DB on Microsoft Azure
Azure Cosmos DB for MongoDB vCore provides developers with a fully managed MongoDB-compatible database service for building modern applications with a familiar architecture.
@@ -303,7 +301,7 @@ See a [usage example](/docs/integrations/vectorstores/azure_cosmos_db).
from langchain_community.vectorstores import AzureCosmosDBVectorSearch
```
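A minimal, hedged sketch of constructing the vector store from a connection string (the connection string, namespace, and embedding model below are placeholders/assumptions):

```python
from langchain_community.vectorstores import AzureCosmosDBVectorSearch
from langchain_openai import OpenAIEmbeddings  # any Embeddings implementation works

# Placeholder connection details; namespace is "<database>.<collection>"
vector_store = AzureCosmosDBVectorSearch.from_connection_string(
    connection_string="mongodb+srv://<user>:<password>@<cluster>.mongocluster.cosmos.azure.com/",
    namespace="my_database.my_collection",
    embedding=OpenAIEmbeddings(),
)

# Assumes documents were already added and a vector index created
docs = vector_store.similarity_search("What is Azure Cosmos DB?", k=3)
```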
### Azure Cosmos DB NoSQL
#### Azure Cosmos DB NoSQL
>[Azure Cosmos DB for NoSQL](https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/vector-search) now offers vector indexing and search in preview.
This feature is designed to handle high-dimensional vectors, enabling efficient and accurate vector search at any scale. You can now store vectors
@@ -312,7 +310,7 @@ but also high-dimensional vectors as other properties of the documents. This col
as the vectors are stored in the same logical unit as the data they represent. This simplifies data management, AI application architectures, and the
efficiency of vector-based operations.
#### Installation and Setup
##### Installation and Setup
See [detail configuration instructions](/docs/integrations/vectorstores/azure_cosmos_db_no_sql).
@@ -322,7 +320,7 @@ We need to install `azure-cosmos` python package.
pip install azure-cosmos
```
#### Deploy Azure Cosmos DB on Microsoft Azure
##### Deploy Azure Cosmos DB on Microsoft Azure
Azure Cosmos DB offers a solution for modern apps and intelligent workloads by being very responsive with dynamic and elastic autoscale. It is available
in every Azure region and can automatically replicate data closer to users. It has SLA guaranteed low-latency and high availability.
@@ -336,6 +334,7 @@ from langchain_community.vectorstores import AzureCosmosDBNoSQLVectorSearch
```
### Azure Database for PostgreSQL
>[Azure Database for PostgreSQL - Flexible Server](https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/service-overview) is a relational database service based on the open-source Postgres database engine. It's a fully managed database-as-a-service that can handle mission-critical workloads with predictable performance, security, high availability, and dynamic scalability.
See [set up instructions](https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/quickstart-create-server-portal) for Azure Database for PostgreSQL.
@@ -446,6 +445,38 @@ The `azure_ai_services` toolkit includes the following tools:
- Text to Speech: [AzureAiServicesTextToSpeechTool](https://python.langchain.com/api_reference/community/tools/langchain_community.tools.azure_ai_services.text_to_speech.AzureAiServicesTextToSpeechTool.html)
- Text Analytics for Health: [AzureAiServicesTextAnalyticsForHealthTool](https://python.langchain.com/api_reference/community/tools/langchain_community.tools.azure_ai_services.text_analytics_for_health.AzureAiServicesTextAnalyticsForHealthTool.html)
### Azure Cognitive Services
We need to install several python packages.
```bash
pip install azure-ai-formrecognizer azure-cognitiveservices-speech azure-ai-vision-imageanalysis
```
See a [usage example](/docs/integrations/tools/azure_cognitive_services).
```python
from langchain_community.agent_toolkits import AzureCognitiveServicesToolkit
```
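A minimal sketch of listing the toolkit's tools (the credential environment variable names follow the linked usage example and should be treated as assumptions here):

```python
import os

from langchain_community.agent_toolkits import AzureCognitiveServicesToolkit

# Assumed credential variables for your Azure Cognitive Services resource
os.environ["AZURE_COGS_KEY"] = "<your-key>"
os.environ["AZURE_COGS_ENDPOINT"] = "<your-endpoint>"
os.environ["AZURE_COGS_REGION"] = "<your-region>"

toolkit = AzureCognitiveServicesToolkit()
print([tool.name for tool in toolkit.get_tools()])
```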
#### Azure Cognitive Services individual tools
The `AzureCognitiveServicesToolkit` includes the following tools, which query the `Azure Cognitive Services` APIs:
- `AzureCogsFormRecognizerTool`: Form Recognizer API
- `AzureCogsImageAnalysisTool`: Image Analysis API
- `AzureCogsSpeech2TextTool`: Speech2Text API
- `AzureCogsText2SpeechTool`: Text2Speech API
- `AzureCogsTextAnalyticsHealthTool`: Text Analytics for Health API
```python
from langchain_community.tools.azure_cognitive_services import (
AzureCogsFormRecognizerTool,
AzureCogsImageAnalysisTool,
AzureCogsSpeech2TextTool,
AzureCogsText2SpeechTool,
AzureCogsTextAnalyticsHealthTool,
)
```
### Microsoft Office 365 email and calendar
@@ -465,11 +496,11 @@ from langchain_community.agent_toolkits import O365Toolkit
#### Office 365 individual tools
You can use individual tools from the Office 365 Toolkit:
- `O365CreateDraftMessage`: tool for creating a draft email in Office 365
- `O365SearchEmails`: tool for searching email messages in Office 365
- `O365SearchEvents`: tool for searching calendar events in Office 365
- `O365SendEvent`: tool for sending calendar events in Office 365
- `O365SendMessage`: tool for sending an email in Office 365
- `O365CreateDraftMessage`: creating a draft email in Office 365
- `O365SearchEmails`: searching email messages in Office 365
- `O365SearchEvents`: searching calendar events in Office 365
- `O365SendEvent`: sending calendar events in Office 365
- `O365SendMessage`: sending an email in Office 365
```python
from langchain_community.tools.office365 import O365CreateDraftMessage
@@ -497,9 +528,9 @@ from langchain_community.utilities.powerbi import PowerBIDataset
#### PowerBI individual tools
You can use individual tools from the Azure PowerBI Toolkit:
- `InfoPowerBITool`: tool for getting metadata about a PowerBI Dataset
- `ListPowerBITool`: tool for getting tables names
- `QueryPowerBITool`: tool for querying a PowerBI Dataset
- `InfoPowerBITool`: getting metadata about a PowerBI Dataset
- `ListPowerBITool`: getting table names
- `QueryPowerBITool`: querying a PowerBI Dataset
```python
from langchain_community.tools.powerbi.tool import InfoPowerBITool

View File

@@ -0,0 +1,44 @@
# BAAI
>[Beijing Academy of Artificial Intelligence (BAAI) (Wikipedia)](https://en.wikipedia.org/wiki/Beijing_Academy_of_Artificial_Intelligence),
> also known as `Zhiyuan Institute`, is a Chinese non-profit artificial
> intelligence (AI) research laboratory. `BAAI` conducts AI research
> and is dedicated to promoting collaboration among academia and industry,
> as well as fostering top talent and a focus on long-term research on
> the fundamentals of AI technology. As a collaborative hub, BAAI's founding
> members include leading AI companies, universities, and research institutes.
## Embedding Models
### HuggingFaceBgeEmbeddings
>[BGE models on Hugging Face](https://huggingface.co/BAAI/bge-large-en-v1.5)
> are among [the best open-source embedding models](https://huggingface.co/spaces/mteb/leaderboard).
See a [usage example](/docs/integrations/text_embedding/bge_huggingface).
```python
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
```
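A minimal usage sketch (the model name and encode kwargs are illustrative assumptions; `sentence-transformers` must be installed):

```python
from langchain_community.embeddings import HuggingFaceBgeEmbeddings

# Assumes the BAAI/bge-small-en-v1.5 weights can be downloaded from Hugging Face
embeddings = HuggingFaceBgeEmbeddings(
    model_name="BAAI/bge-small-en-v1.5",
    encode_kwargs={"normalize_embeddings": True},  # BGE embeddings are typically normalized
)

vector = embeddings.embed_query("What does BAAI work on?")
print(len(vector))  # dimensionality of the embedding vector
```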
### IpexLLMBgeEmbeddings
>[IPEX-LLM](https://github.com/intel-analytics/ipex-llm) is a PyTorch
> library for running LLM on Intel CPU and GPU (e.g., local PC with iGPU,
> discrete GPU such as Arc, Flex and Max) with very low latency.
See a [usage example running model on Intel CPU](/docs/integrations/text_embedding/ipex_llm).
See a [usage example running model on Intel GPU](/docs/integrations/text_embedding/ipex_llm_gpu).
```python
from langchain_community.embeddings import IpexLLMBgeEmbeddings
```
### QuantizedBgeEmbeddings
See a [usage example](/docs/integrations/text_embedding/itrex).
```python
from langchain_community.embeddings import QuantizedBgeEmbeddings
```

View File

@@ -1,20 +1,35 @@
# Jina
# Jina AI
This page covers how to use the Jina Embeddings within LangChain.
It is broken into two parts: installation and setup, and then references to specific Jina wrappers.
>[Jina AI](https://jina.ai/about-us) is a search AI company. `Jina` helps businesses and developers unlock multimodal data with a better search.
## Installation and Setup
- Get a Jina AI API token from [here](https://jina.ai/embeddings/) and set it as an environment variable (`JINA_API_TOKEN`)
There exists a Jina Embeddings wrapper, which you can access with
## Chat Models
```python
from langchain_community.embeddings import JinaEmbeddings
# you can pass jina_api_key; if none is passed it will be taken from the `JINA_API_TOKEN` environment variable
embeddings = JinaEmbeddings(jina_api_key='jina_**', model_name='jina-embeddings-v2-base-en')
from langchain_community.chat_models import JinaChat
```
See a [usage example](/docs/integrations/chat/jinachat).
## Embedding Models
You can check the list of available models from [here](https://jina.ai/embeddings/)
For a more detailed walkthrough of this, see [this notebook](/docs/integrations/text_embedding/jina)
```python
from langchain_community.embeddings import JinaEmbeddings
```
See a [usage example](/docs/integrations/text_embedding/jina).
## Document Transformers
### Jina Rerank
```python
from langchain_community.document_compressors import JinaRerank
```
See a [usage example](/docs/integrations/document_transformers/jina_rerank).
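A hedged sketch of reranking a few documents directly with the compressor (the explicit `jina_api_key` value and model name are placeholders/assumptions):

```python
from langchain_core.documents import Document
from langchain_community.document_compressors import JinaRerank

# Placeholder API key and model name
reranker = JinaRerank(jina_api_key="jina_***", model="jina-reranker-v1-base-en", top_n=2)

docs = [
    Document(page_content="Jina AI builds multimodal search and embedding models."),
    Document(page_content="Paris is the capital of France."),
]
reranked = reranker.compress_documents(docs, query="What does Jina AI do?")
```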

View File

@@ -0,0 +1,20 @@
# KoboldAI
>[KoboldAI](https://koboldai.com/) is a free, open-source project that allows users to run AI models locally
> on their own computer.
> It's a browser-based front-end that can be used for writing or role playing with an AI.
>[KoboldAI](https://github.com/KoboldAI/KoboldAI-Client) is "a browser-based front-end for
> AI-assisted writing with multiple local & remote AI models...".
> It has a public and local API that can be used in LangChain.
## Installation and Setup
Check out the [installation guide](https://github.com/KoboldAI/KoboldAI-Client).
## LLMs
See a [usage example](/docs/integrations/llms/koboldai).
```python
from langchain_community.llms import KoboldApiLLM
```
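A minimal sketch, assuming a KoboldAI instance is already running and serving its API (the endpoint below is a placeholder):

```python
from langchain_community.llms import KoboldApiLLM

# Placeholder endpoint; point this at your running KoboldAI server's API
llm = KoboldApiLLM(endpoint="http://localhost:5000/api")

print(llm.invoke("Write a one-sentence opening for a fantasy story."))
```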

View File

@@ -10,7 +10,7 @@
">\n",
">**Solar Mini Chat** is a fast yet powerful advanced large language model focusing on English and Korean. It has been specifically fine-tuned for multi-turn chat purposes, showing enhanced performance across a wide range of natural language processing tasks, like multi-turn conversation or tasks that require an understanding of long contexts, such as RAG (Retrieval-Augmented Generation), compared to other models of a similar size. This fine-tuning equips it with the ability to handle longer conversations more effectively, making it particularly adept for interactive applications.\n",
"\n",
">Other than Solar, Upstage also offers features for real-world RAG (retrieval-augmented generation), such as **Groundedness Check** and **Layout Analysis**. \n"
">Other than Solar, Upstage also offers features for real-world RAG (retrieval-augmented generation), such as **Document Parse** and **Groundedness Check**. \n"
]
},
{
@@ -24,7 +24,7 @@
"| Chat | Build assistants using Solar Mini Chat | `from langchain_upstage import ChatUpstage` | [Go](../../chat/upstage) |\n",
"| Text Embedding | Embed strings to vectors | `from langchain_upstage import UpstageEmbeddings` | [Go](../../text_embedding/upstage) |\n",
"| Groundedness Check | Verify groundedness of assistant's response | `from langchain_upstage import UpstageGroundednessCheck` | [Go](../../tools/upstage_groundedness_check) |\n",
"| Layout Analysis | Serialize documents with tables and figures | `from langchain_upstage import UpstageLayoutAnalysisLoader` | [Go](../../document_loaders/upstage) |\n",
"| Document Parse | Serialize documents with tables and figures | `from langchain_upstage import UpstageDocumentParseLoader` | [Go](../../document_loaders/upstage) |\n",
"\n",
"See [documentations](https://developers.upstage.ai/) for more details about the features."
]
@@ -122,7 +122,7 @@
"source": [
"## Document loader\n",
"\n",
"### Layout Analysis\n",
"### Document Parse\n",
"\n",
"See [a usage example](/docs/integrations/document_loaders/upstage)."
]
@@ -133,10 +133,10 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_upstage import UpstageLayoutAnalysisLoader\n",
"from langchain_upstage import UpstageDocumentParseLoader\n",
"\n",
"file_path = \"/PATH/TO/YOUR/FILE.pdf\"\n",
"layzer = UpstageLayoutAnalysisLoader(file_path, split=\"page\")\n",
"layzer = UpstageDocumentParseLoader(file_path, split=\"page\")\n",
"\n",
"# For improved memory efficiency, consider using the lazy_load method to load documents page by page.\n",
"docs = layzer.load() # or layzer.lazy_load()\n",

View File

@@ -52,18 +52,10 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 2,
"id": "b87a8e8b-9b5a-4e78-97e4-274b6b0dd29f",
"metadata": {},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"Enter your Box Developer Token: ········\n"
]
}
],
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
@@ -81,7 +73,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 3,
"id": "a15d341e-3e26-4ca3-830b-5aab30ed66de",
"metadata": {},
"outputs": [],
@@ -102,10 +94,18 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 4,
"id": "652d6238-1f87-422a-b135-f5abbb8652fc",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install -qU langchain-box"
]
@@ -124,7 +124,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 5,
"id": "70cc8e65-2a02-408a-bbc6-8ef649057d82",
"metadata": {},
"outputs": [],
@@ -146,7 +146,7 @@
},
{
"cell_type": "code",
"execution_count": 33,
"execution_count": 6,
"id": "97f3ae67",
"metadata": {},
"outputs": [
@@ -156,7 +156,7 @@
"[Document(metadata={'source': 'https://dl.boxcloud.com/api/2.0/internal_files/1514555423624/versions/1663171610024/representations/extracted_text/content/', 'title': 'Invoice-A5555_txt'}, page_content='Vendor: AstroTech Solutions\\nInvoice Number: A5555\\n\\nLine Items:\\n - Gravitational Wave Detector Kit: $800\\n - Exoplanet Terrarium: $120\\nTotal: $920')]"
]
},
"execution_count": 33,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
@@ -192,7 +192,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 7,
"id": "ee0e726d-9974-4aa0-9ce1-0057ec3e540a",
"metadata": {},
"outputs": [],
@@ -216,17 +216,17 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 8,
"id": "51a60dbe-9f2e-4e04-bb62-23968f17164a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(metadata={'source': 'Box AI', 'title': 'Box AI What was the most expensive item purchased'}, page_content='The most expensive item purchased was the **Gravitational Wave Detector Kit** from AstroTech Solutions, which cost $800.')]"
"[Document(metadata={'source': 'Box AI', 'title': 'Box AI What was the most expensive item purchased'}, page_content='The most expensive item purchased is the **Gravitational Wave Detector Kit** from AstroTech Solutions, which costs **$800**.')]"
]
},
"execution_count": 5,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
@@ -237,6 +237,80 @@
"retriever.invoke(query)"
]
},
{
"cell_type": "markdown",
"id": "31a59a51",
"metadata": {},
"source": [
"## Citations\n",
"\n",
"With Box AI and the `BoxRetriever`, you can return the answer to your prompt, return the citations used by Box to get that answer, or both. No matter how you choose to use Box AI, the retriever returns a `List[Document]` object. We offer this flexibility with two `bool` arguments, `answer` and `citations`. Answer defaults to `True` and citations defaults to `False`, do you can omit both if you just want the answer. If you want both, you can just include `citations=True` and if you only want citations, you would include `answer=False` and `citations=True`\n",
"\n",
"### Get both"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "2eddc8c1",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(metadata={'source': 'Box AI', 'title': 'Box AI What was the most expensive item purchased'}, page_content='The most expensive item purchased is the **Gravitational Wave Detector Kit** from AstroTech Solutions, which costs **$800**.'),\n",
" Document(metadata={'source': 'Box AI What was the most expensive item purchased', 'file_name': 'Invoice-A5555.txt', 'file_id': '1514555423624', 'file_type': 'file'}, page_content='Vendor: AstroTech Solutions\\nInvoice Number: A5555\\n\\nLine Items:\\n - Gravitational Wave Detector Kit: $800\\n - Exoplanet Terrarium: $120\\nTotal: $920')]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retriever = BoxRetriever(\n",
" box_developer_token=box_developer_token, box_file_ids=box_file_ids, citations=True\n",
")\n",
"\n",
"retriever.invoke(query)"
]
},
{
"cell_type": "markdown",
"id": "d2e93a2e",
"metadata": {},
"source": [
"### Citations only"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "c1892b07",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(metadata={'source': 'Box AI What was the most expensive item purchased', 'file_name': 'Invoice-A5555.txt', 'file_id': '1514555423624', 'file_type': 'file'}, page_content='Vendor: AstroTech Solutions\\nInvoice Number: A5555\\n\\nLine Items:\\n - Gravitational Wave Detector Kit: $800\\n - Exoplanet Terrarium: $120\\nTotal: $920')]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retriever = BoxRetriever(\n",
" box_developer_token=box_developer_token,\n",
" box_file_ids=box_file_ids,\n",
" answer=False,\n",
" citations=True,\n",
")\n",
"\n",
"retriever.invoke(query)"
]
},
{
"cell_type": "markdown",
"id": "dfe8aad4-8626-4330-98a9-7ea1ca5d2e0e",
@@ -260,7 +334,7 @@
"metadata": {},
"outputs": [
{
"name": "stdin",
"name": "stdout",
"output_type": "stream",
"text": [
"Enter your OpenAI key: ········\n"

View File

@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Local BGE Embeddings with IPEX-LLM on Intel CPU\n",
"# IPEX-LLM: Local BGE Embeddings on Intel CPU\n",
"\n",
"> [IPEX-LLM](https://github.com/intel-analytics/ipex-llm) is a PyTorch library for running LLM on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max) with very low latency.\n",
"\n",
@@ -92,10 +92,24 @@
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python"
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Local BGE Embeddings with IPEX-LLM on Intel GPU\n",
"# IPEX-LLM: Local BGE Embeddings on Intel GPU\n",
"\n",
"> [IPEX-LLM](https://github.com/intel-analytics/ipex-llm) is a PyTorch library for running LLM on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max) with very low latency.\n",
"\n",
@@ -155,10 +155,24 @@
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python"
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -5,7 +5,11 @@
"id": "1c0cf975",
"metadata": {},
"source": [
"# Jina"
"# Jina\n",
"\n",
"You can check the list of available models from [here](https://jina.ai/embeddings/).\n",
"\n",
"## Installation and setup"
]
},
{
@@ -231,7 +235,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -74,6 +74,24 @@
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"(Optional) To increase the retry time for getting a function execution response, set environment variable UC_TOOL_CLIENT_EXECUTION_TIMEOUT. Default retry time value is 120s."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"UC_TOOL_CLIENT_EXECUTION_TIMEOUT\"] = \"200\""
]
},
{
"cell_type": "code",
"execution_count": 4,

View File

@@ -11,6 +11,8 @@
"\n",
"The `Jira` toolkit allows agents to interact with a given Jira instance, performing actions such as searching for issues and creating issues, the tool wraps the atlassian-python-api library, for more see: https://atlassian-python-api.readthedocs.io/jira.html\n",
"\n",
"## Installation and setup\n",
"\n",
"To use this tool, you must first set as environment variables:\n",
" JIRA_API_TOKEN\n",
" JIRA_USERNAME\n",
@@ -47,7 +49,7 @@
},
"outputs": [],
"source": [
"%pip install -qU langchain-community"
"%pip install -qU langchain-community langchain_openai"
]
},
{
@@ -58,6 +60,13 @@
"ExecuteTime": {
"end_time": "2023-04-17T10:21:23.730922Z",
"start_time": "2023-04-17T10:21:22.911233Z"
},
"execution": {
"iopub.execute_input": "2024-10-02T17:40:07.356954Z",
"iopub.status.busy": "2024-10-02T17:40:07.356792Z",
"iopub.status.idle": "2024-10-02T17:40:07.359943Z",
"shell.execute_reply": "2024-10-02T17:40:07.359476Z",
"shell.execute_reply.started": "2024-10-02T17:40:07.356942Z"
}
},
"outputs": [],
@@ -72,7 +81,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"id": "b3050b55",
"metadata": {
"ExecuteTime": {
@@ -80,6 +89,13 @@
"start_time": "2023-04-17T10:22:42.499447Z"
},
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-10-02T17:40:16.201684Z",
"iopub.status.busy": "2024-10-02T17:40:16.200922Z",
"iopub.status.idle": "2024-10-02T17:40:16.208035Z",
"shell.execute_reply": "2024-10-02T17:40:16.207564Z",
"shell.execute_reply.started": "2024-10-02T17:40:16.201634Z"
},
"jupyter": {
"outputs_hidden": false
}
@@ -93,6 +109,74 @@
"os.environ[\"JIRA_CLOUD\"] = \"True\""
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "c0768000-227b-4aa1-a838-4befbdefadb1",
"metadata": {
"execution": {
"iopub.execute_input": "2024-10-02T17:42:00.792867Z",
"iopub.status.busy": "2024-10-02T17:42:00.792365Z",
"iopub.status.idle": "2024-10-02T17:42:00.816979Z",
"shell.execute_reply": "2024-10-02T17:42:00.816419Z",
"shell.execute_reply.started": "2024-10-02T17:42:00.792827Z"
}
},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)\n",
"jira = JiraAPIWrapper()\n",
"toolkit = JiraToolkit.from_jira_api_wrapper(jira)"
]
},
{
"cell_type": "markdown",
"id": "961b3187-daf0-4907-9cc0-a69796fba4aa",
"metadata": {},
"source": [
"## Tool usage\n",
"\n",
"Let's see what individual tools are in the Jira toolkit:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "eb5cf521-9a91-44bc-b68e-bc4067d05a76",
"metadata": {
"execution": {
"iopub.execute_input": "2024-10-02T17:42:27.232022Z",
"iopub.status.busy": "2024-10-02T17:42:27.231140Z",
"iopub.status.idle": "2024-10-02T17:42:27.240169Z",
"shell.execute_reply": "2024-10-02T17:42:27.239693Z",
"shell.execute_reply.started": "2024-10-02T17:42:27.231949Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[('JQL Query',\n",
" '\\n This tool is a wrapper around atlassian-python-api\\'s Jira jql API, useful when you need to search for Jira issues.\\n The input to this tool is a JQL query string, and will be passed into atlassian-python-api\\'s Jira `jql` function,\\n For example, to find all the issues in project \"Test\" assigned to the me, you would pass in the following string:\\n project = Test AND assignee = currentUser()\\n or to find issues with summaries that contain the word \"test\", you would pass in the following string:\\n summary ~ \\'test\\'\\n '),\n",
" ('Get Projects',\n",
" \"\\n This tool is a wrapper around atlassian-python-api's Jira project API, \\n useful when you need to fetch all the projects the user has access to, find out how many projects there are, or as an intermediary step that involv searching by projects. \\n there is no input to this tool.\\n \"),\n",
" ('Create Issue',\n",
" '\\n This tool is a wrapper around atlassian-python-api\\'s Jira issue_create API, useful when you need to create a Jira issue. \\n The input to this tool is a dictionary specifying the fields of the Jira issue, and will be passed into atlassian-python-api\\'s Jira `issue_create` function.\\n For example, to create a low priority task called \"test issue\" with description \"test description\", you would pass in the following dictionary: \\n {{\"summary\": \"test issue\", \"description\": \"test description\", \"issuetype\": {{\"name\": \"Task\"}}, \"priority\": {{\"name\": \"Low\"}}}}\\n '),\n",
" ('Catch all Jira API call',\n",
" '\\n This tool is a wrapper around atlassian-python-api\\'s Jira API.\\n There are other dedicated tools for fetching all projects, and creating and searching for issues, \\n use this tool if you need to perform any other actions allowed by the atlassian-python-api Jira API.\\n The input to this tool is a dictionary specifying a function from atlassian-python-api\\'s Jira API, \\n as well as a list of arguments and dictionary of keyword arguments to pass into the function.\\n For example, to get all the users in a group, while increasing the max number of results to 100, you would\\n pass in the following dictionary: {{\"function\": \"get_all_users_from_group\", \"args\": [\"group\"], \"kwargs\": {{\"limit\":100}} }}\\n or to find out how many projects are in the Jira instance, you would pass in the following string:\\n {{\"function\": \"projects\"}}\\n For more information on the Jira API, refer to https://atlassian-python-api.readthedocs.io/jira.html\\n '),\n",
" ('Create confluence page',\n",
" 'This tool is a wrapper around atlassian-python-api\\'s Confluence \\natlassian-python-api API, useful when you need to create a Confluence page. The input to this tool is a dictionary \\nspecifying the fields of the Confluence page, and will be passed into atlassian-python-api\\'s Confluence `create_page` \\nfunction. For example, to create a page in the DEMO space titled \"This is the title\" with body \"This is the body. You can use \\n<strong>HTML tags</strong>!\", you would pass in the following dictionary: {{\"space\": \"DEMO\", \"title\":\"This is the \\ntitle\",\"body\":\"This is the body. You can use <strong>HTML tags</strong>!\"}} ')]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"[(tool.name, tool.description) for tool in toolkit.get_tools()]"
]
},
{
"cell_type": "code",
"execution_count": 5,
@@ -105,9 +189,6 @@
},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)\n",
"jira = JiraAPIWrapper()\n",
"toolkit = JiraToolkit.from_jira_api_wrapper(jira)\n",
"agent = initialize_agent(\n",
" toolkit.get_tools(), llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True\n",
")"

View File

@@ -35,9 +35,16 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 2,
"id": "ff988466-c389-4ec6-b6ac-14364a537fd5",
"metadata": {
"execution": {
"iopub.execute_input": "2024-10-02T17:52:40.107644Z",
"iopub.status.busy": "2024-10-02T17:52:40.107485Z",
"iopub.status.idle": "2024-10-02T17:52:40.110169Z",
"shell.execute_reply": "2024-10-02T17:52:40.109841Z",
"shell.execute_reply.started": "2024-10-02T17:52:40.107633Z"
},
"tags": []
},
"outputs": [],
@@ -50,16 +57,23 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 4,
"id": "9ecd1ba0-3937-4359-a41e-68605f0596a1",
"metadata": {
"execution": {
"iopub.execute_input": "2024-10-02T17:59:54.134295Z",
"iopub.status.busy": "2024-10-02T17:59:54.134138Z",
"iopub.status.idle": "2024-10-02T17:59:54.137250Z",
"shell.execute_reply": "2024-10-02T17:59:54.136636Z",
"shell.execute_reply.started": "2024-10-02T17:59:54.134283Z"
},
"tags": []
},
"outputs": [],
"source": [
"with open(\"openai_openapi.yml\") as f:\n",
" data = yaml.load(f, Loader=yaml.FullLoader)\n",
"json_spec = JsonSpec(dict_=data, max_value_length=4000)\n",
"json_spec = JsonSpec(dict_={}, max_value_length=4000)\n",
"json_toolkit = JsonToolkit(spec=json_spec)\n",
"\n",
"json_agent_executor = create_json_agent(\n",
@@ -67,6 +81,48 @@
")"
]
},
{
"cell_type": "markdown",
"id": "910eccbc-9d42-49b6-a4ca-1fbc418fcee7",
"metadata": {},
"source": [
"## Individual tools\n",
"\n",
"Let's see what individual tools are inside the Jira toolkit."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "b16a3ee5-ca16-452e-993f-c27228b945ac",
"metadata": {
"execution": {
"iopub.execute_input": "2024-10-02T18:00:30.527665Z",
"iopub.status.busy": "2024-10-02T18:00:30.527053Z",
"iopub.status.idle": "2024-10-02T18:00:30.538483Z",
"shell.execute_reply": "2024-10-02T18:00:30.537672Z",
"shell.execute_reply.started": "2024-10-02T18:00:30.527626Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[('json_spec_list_keys',\n",
" '\\n Can be used to list all keys at a given path. \\n Before calling this you should be SURE that the path to this exists.\\n The input is a text representation of the path to the dict in Python syntax (e.g. data[\"key1\"][0][\"key2\"]).\\n '),\n",
" ('json_spec_get_value',\n",
" '\\n Can be used to see value in string format at a given path.\\n Before calling this you should be SURE that the path to this exists.\\n The input is a text representation of the path to the dict in Python syntax (e.g. data[\"key1\"][0][\"key2\"]).\\n ')]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"[(el.name, el.description) for el in json_toolkit.get_tools()]"
]
},
{
"cell_type": "markdown",
"id": "05cfcb24-4389-4b8f-ad9e-466e3fca8db0",

View File

@@ -176,7 +176,7 @@
"id": "f8014c9d",
"metadata": {},
"source": [
"Now, we can initalize the agent with the LLM, the prompt, and the tools. The agent is responsible for taking in input and deciding what actions to take. Crucially, the Agent does not execute those actions - that is done by the AgentExecutor (next step). For more information about how to think about these components, see our [conceptual guide](/docs/concepts#agents)"
"Now, we can initialize the agent with the LLM, the prompt, and the tools. The agent is responsible for taking in input and deciding what actions to take. Crucially, the Agent does not execute those actions - that is done by the AgentExecutor (next step). For more information about how to think about these components, see our [conceptual guide](/docs/concepts#agents)"
]
},
{

View File

@@ -209,15 +209,25 @@
},
{
"cell_type": "markdown",
"id": "5f5751e3-2e98-485f-8164-db8094039c25",
"id": "4e3fd064-aa86-448d-8db3-3c55eaa5bc15",
"metadata": {},
"source": [
"API references:\n",
"\n",
"- [QuerySQLDataBaseTool](https://python.langchain.com/api_reference/community/tools/langchain_community.tools.sql_database.tool.QuerySQLDataBaseTool.html)\n",
"- [InfoSQLDatabaseTool](https://python.langchain.com/api_reference/community/tools/langchain_community.tools.sql_database.tool.InfoSQLDatabaseTool.html)\n",
"- [ListSQLDatabaseTool](https://python.langchain.com/api_reference/community/tools/langchain_community.tools.sql_database.tool.ListSQLDatabaseTool.html)\n",
"- [QuerySQLCheckerTool](https://python.langchain.com/api_reference/community/tools/langchain_community.tools.sql_database.tool.QuerySQLCheckerTool.html)"
"You can use the individual tools directly:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7fa8d00c-750c-4803-9b66-057d12b26b06",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.tools.sql_database.tool import (\n",
" InfoSQLDatabaseTool,\n",
" ListSQLDatabaseTool,\n",
" QuerySQLCheckerTool,\n",
" QuerySQLDataBaseTool,\n",
")"
]
},
{
@@ -604,7 +614,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -52,7 +52,7 @@
}
],
"source": [
"%pip install --upgrade --quiet langchain-google-spanner"
"%pip install --upgrade --quiet langchain-google-spanner langchain-google-vertexai"
]
},
{
@@ -124,7 +124,8 @@
"PROJECT_ID = \"my-project-id\" # @param {type:\"string\"}\n",
"\n",
"# Set the project id\n",
"!gcloud config set project {PROJECT_ID}"
"!gcloud config set project {PROJECT_ID}\n",
"%env GOOGLE_CLOUD_PROJECT={PROJECT_ID}"
]
},
{
@@ -194,14 +195,16 @@
" instance_id=INSTANCE,\n",
" database_id=DATABASE,\n",
" table_name=TABLE_NAME,\n",
" id_column=\"row_id\",\n",
" metadata_columns=[\n",
" TableColumn(name=\"metadata\", type=\"JSON\", is_null=True),\n",
" TableColumn(name=\"title\", type=\"STRING(MAX)\", is_null=False),\n",
" ],\n",
" secondary_indexes=[\n",
" SecondaryIndex(index_name=\"row_id_and_title\", columns=[\"row_id\", \"title\"])\n",
" ],\n",
" # Customize the table creation\n",
" # id_column=\"row_id\",\n",
" # content_column=\"content_column\",\n",
" # metadata_columns=[\n",
" # TableColumn(name=\"metadata\", type=\"JSON\", is_null=True),\n",
" # TableColumn(name=\"title\", type=\"STRING(MAX)\", is_null=False),\n",
" # ],\n",
" # secondary_indexes=[\n",
" # SecondaryIndex(index_name=\"row_id_and_title\", columns=[\"row_id\", \"title\"])\n",
" # ],\n",
")"
]
},
@@ -262,9 +265,11 @@
" instance_id=INSTANCE,\n",
" database_id=DATABASE,\n",
" table_name=TABLE_NAME,\n",
" ignore_metadata_columns=[],\n",
" embedding_service=embeddings,\n",
" metadata_json_column=\"metadata\",\n",
" # Connect to a custom vector store table\n",
" # id_column=\"row_id\",\n",
" # content_column=\"content\",\n",
" # metadata_columns=[\"metadata\", \"title\"],\n",
")"
]
},
@@ -272,7 +277,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 🔐 Add Documents\n",
"#### Add Documents\n",
"To add documents in the vector store."
]
},
@@ -289,14 +294,15 @@
"loader = HNLoader(\"https://news.ycombinator.com/item?id=34817881\")\n",
"\n",
"documents = loader.load()\n",
"ids = [str(uuid.uuid4()) for _ in range(len(documents))]"
"ids = [str(uuid.uuid4()) for _ in range(len(documents))]\n",
"db.add_documents(documents, ids)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 🔐 Search Documents\n",
"#### Search Documents\n",
"To search documents in the vector store with similarity search."
]
},
@@ -313,7 +319,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 🔐 Search Documents\n",
"#### Search Documents\n",
"To search documents in the vector store with max marginal relevance search."
]
},
@@ -330,7 +336,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 🔐 Delete Documents\n",
"#### Delete Documents\n",
"To remove documents from the vector store, use the IDs that correspond to the values in the `row_id`` column when initializing the VectorStore."
]
},
@@ -347,7 +353,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 🔐 Delete Documents\n",
"#### Delete Documents\n",
"To remove documents from the vector store, you can utilize the documents themselves. The content column and metadata columns provided during VectorStore initialization will be used to find out the rows corresponding to the documents. Any matching rows will then be deleted."
]
},
@@ -377,7 +383,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
"version": "3.11.8"
}
},
"nbformat": 4,

View File

@@ -438,7 +438,7 @@
"app = workflow.compile(checkpointer=MemorySaver())\n",
"\n",
"# Async invocation:\n",
"output = await app.ainvoke({\"messages\": input_messages}, config):\n",
"output = await app.ainvoke({\"messages\": input_messages}, config)\n",
"output[\"messages\"][-1].pretty_print()\n",
"```\n",
"\n",
@@ -686,7 +686,7 @@
"\n",
"input_messages = [HumanMessage(query)]\n",
"output = app.invoke(\n",
" {\"messages\": input_messages, \"language\": language},\n",
" {\"messages\": input_messages},\n",
" config,\n",
")\n",
"output[\"messages\"][-1].pretty_print()"

View File

@@ -60,7 +60,7 @@
"%pip install -qU langchain_ollama\n",
"\n",
"# Web Loader\n",
"% pip install -qU beautifulsoup4"
"%pip install -qU beautifulsoup4"
]
},
{

View File

@@ -196,7 +196,7 @@
"metadata": {},
"outputs": [],
"source": [
"% pip install -qU langgraph"
"%pip install -qU langgraph"
]
},
{

View File

@@ -20,111 +20,53 @@ _LANGGRAPH_API_REFERENCE = "https://langchain-ai.github.io/langgraph/reference/"
code_block_re = re.compile(r"^(```\s?python\n)(.*?)(```)", re.DOTALL | re.MULTILINE)
# (alias/re-exported modules, source module, class, docs namespace)
MANUAL_API_REFERENCES_LANGGRAPH = [
("langgraph.prebuilt", "create_react_agent"),
(
"langgraph.prebuilt",
"ToolNode",
["langgraph.prebuilt"],
"langgraph.prebuilt.chat_agent_executor",
"create_react_agent",
"prebuilt",
),
(["langgraph.prebuilt"], "langgraph.prebuilt.tool_node", "ToolNode", "prebuilt"),
(
["langgraph.prebuilt"],
"langgraph.prebuilt.tool_node",
"tools_condition",
"prebuilt",
),
(
"langgraph.prebuilt",
"ToolExecutor",
),
(
"langgraph.prebuilt",
"ToolInvocation",
),
("langgraph.prebuilt", "tools_condition"),
(
"langgraph.prebuilt",
"ValidationNode",
),
(
"langgraph.prebuilt",
["langgraph.prebuilt"],
"langgraph.prebuilt.tool_node",
"InjectedState",
"prebuilt",
),
# Graph
(
"langgraph.graph",
"StateGraph",
),
(
"langgraph.graph.message",
"MessageGraph",
),
("langgraph.graph.message", "add_messages"),
(
"langgraph.graph.graph",
"CompiledGraph",
),
(
"langgraph.types",
"StreamMode",
),
(
"langgraph.graph",
"START",
),
(
"langgraph.graph",
"END",
),
(
"langgraph.types",
"Send",
),
(
"langgraph.types",
"Interrupt",
),
(
"langgraph.types",
"RetryPolicy",
),
(
"langgraph.checkpoint.base",
"Checkpoint",
),
(
"langgraph.checkpoint.base",
"CheckpointMetadata",
),
(
"langgraph.checkpoint.base",
"BaseCheckpointSaver",
),
(
"langgraph.checkpoint.base",
"SerializerProtocol",
),
(
"langgraph.checkpoint.serde.jsonplus",
"JsonPlusSerializer",
),
(
"langgraph.checkpoint.memory",
"MemorySaver",
),
(
"langgraph.checkpoint.sqlite.aio",
"AsyncSqliteSaver",
),
(
"langgraph.checkpoint.sqlite",
"SqliteSaver",
),
(
"langgraph.checkpoint.postgres.aio",
"AsyncPostgresSaver",
),
(
"langgraph.checkpoint.postgres",
"PostgresSaver",
),
(["langgraph.graph"], "langgraph.graph.message", "add_messages", "graphs"),
(["langgraph.graph"], "langgraph.graph.state", "StateGraph", "graphs"),
(["langgraph.graph"], "langgraph.graph.state", "CompiledStateGraph", "graphs"),
([], "langgraph.types", "StreamMode", "types"),
(["langgraph.graph"], "langgraph.constants", "START", "constants"),
(["langgraph.graph"], "langgraph.constants", "END", "constants"),
(["langgraph.constants"], "langgraph.types", "Send", "types"),
(["langgraph.constants"], "langgraph.types", "Interrupt", "types"),
([], "langgraph.types", "RetryPolicy", "types"),
([], "langgraph.checkpoint.base", "Checkpoint", "checkpoints"),
([], "langgraph.checkpoint.base", "CheckpointMetadata", "checkpoints"),
([], "langgraph.checkpoint.base", "BaseCheckpointSaver", "checkpoints"),
([], "langgraph.checkpoint.base", "SerializerProtocol", "checkpoints"),
([], "langgraph.checkpoint.serde.jsonplus", "JsonPlusSerializer", "checkpoints"),
([], "langgraph.checkpoint.memory", "MemorySaver", "checkpoints"),
([], "langgraph.checkpoint.sqlite.aio", "AsyncSqliteSaver", "checkpoints"),
([], "langgraph.checkpoint.sqlite", "SqliteSaver", "checkpoints"),
([], "langgraph.checkpoint.postgres.aio", "AsyncPostgresSaver", "checkpoints"),
([], "langgraph.checkpoint.postgres", "PostgresSaver", "checkpoints"),
]
WELL_KNOWN_LANGGRAPH_OBJECTS = {
(module_, class_) for module_, class_ in MANUAL_API_REFERENCES_LANGGRAPH
(module_, class_): (source_module, namespace)
for (modules, source_module, class_, namespace) in MANUAL_API_REFERENCES_LANGGRAPH
for module_ in modules + [source_module]
}
@@ -308,34 +250,21 @@ def _get_imports(
+ ".html"
)
elif package_ecosystem == "langgraph":
if module.startswith("langgraph.checkpoint"):
namespace = "checkpoints"
elif module.startswith("langgraph.graph"):
namespace = "graphs"
elif module.startswith("langgraph.prebuilt"):
namespace = "prebuilt"
elif module.startswith("langgraph.errors"):
namespace = "errors"
else:
if (module, class_name) not in WELL_KNOWN_LANGGRAPH_OBJECTS:
# Likely not documented yet
# Unable to determine the namespace
continue
if module.startswith("langgraph.errors"):
# Has different URL structure than other modules
url = (
_LANGGRAPH_API_REFERENCE
+ namespace
+ "/#langgraph.errors."
+ class_name # Uses the actual class name here.
)
else:
if (module, class_name) not in WELL_KNOWN_LANGGRAPH_OBJECTS:
# Likely not documented yet
continue
url = (
_LANGGRAPH_API_REFERENCE + namespace + "/#" + class_name.lower()
)
source_module, namespace = WELL_KNOWN_LANGGRAPH_OBJECTS[
(module, class_name)
]
url = (
_LANGGRAPH_API_REFERENCE
+ namespace
+ "/#"
+ source_module
+ "."
+ class_name
)
else:
raise ValueError(f"Invalid package ecosystem: {package_ecosystem}")

View File

@@ -8,7 +8,8 @@ const FEATURE_TABLES = {
chat: {
link: "/docs/integrations/chat",
columns: [
{title: "Provider", formatter: (item) => <a href={item.link}>{item.name}</a>},
{title: "Provider", mode: "category", formatter: (item) => <a href={item.link}>{item.name}</a>},
{title: "Provider", mode: "item", formatter: (item) => <a href={item.link}>{item.name}</a>},
{title: <a href="/docs/how_to/tool_calling">Tool calling</a>, formatter: (item) => item.tool_calling ? "" : ""},
{title: <a href="/docs/how_to/structured_output/">Structured output</a>, formatter: (item) => item.structured_output ? "" : ""},
{title: "JSON mode", formatter: (item) => item.json_mode ? "✅" : "❌"},
@@ -221,7 +222,7 @@ const FEATURE_TABLES = {
llms: {
link: "/docs/integrations/llms",
columns: [
{title: "Provider", formatter: (item) => <a href={
{title: "Provider", mode: "category", formatter: (item) => <a href={
item.link
}>{item.name}</a>},
{title: "Package", formatter: (item) => <a href={
@@ -294,7 +295,8 @@ const FEATURE_TABLES = {
text_embedding: {
link: "/docs/integrations/text_embedding",
columns: [
{title: "Provider", formatter: (item) => <a href={item.link}>{item.name}</a>},
{title: "Provider", mode: "category", formatter: (item) => <a href={item.link}>{item.name}</a>},
{title: "Provider", mode: "item", formatter: (item) => <a href={`/docs/integrations/${item.top ? "platforms":"providers"}/${item.link}`}>{item.name}</a>},
{title: "Package", formatter: (item) => <a href={item.apiLink}>{item.package}</a>},
],
items:[
@@ -1120,7 +1122,7 @@ const DEPRECATED_DOC_IDS = [
"integrations/text_embedding/ernie",
];
function toTable(columns, items) {
function toTable(columns, items, mode) {
const headers = columns.map((col) => col.title);
return (
<table>
@@ -1132,7 +1134,7 @@ function toTable(columns, items) {
<tbody>
{items.map((item, i) => (
<tr key={`row-${i}`}>
{columns.map((col, j) => <td key={`cell-${i}-${j}`}>{col.formatter(item)}</td>)}
{columns.filter(col => !col.mode || col.mode === mode).map((col, j) => <td key={`cell-${i}-${j}`}>{col.formatter(item)}</td>)}
</tr>
))}
</tbody>
@@ -1142,7 +1144,7 @@ function toTable(columns, items) {
export function CategoryTable({ category }) {
const cat = FEATURE_TABLES[category];
const rtn = toTable(cat.columns, cat.items);
const rtn = toTable(cat.columns, cat.items, "category");
return rtn;
}
@@ -1152,7 +1154,7 @@ export function ItemTable({ category, item }) {
if (!row) {
throw new Error(`Item ${item} not found in category ${category}`);
}
const rtn = toTable(cat.columns, [row]);
const rtn = toTable(cat.columns, [row], "item");
return rtn;
}
@@ -1185,6 +1187,7 @@ export function IndexTable() {
},
],
rows,
"index",
);
return rtn;
}

View File

@@ -27,11 +27,11 @@ MODEL_COST_PER_1K_TOKENS = {
"gpt-4o-mini-completion": 0.0006,
"gpt-4o-mini-2024-07-18-completion": 0.0006,
# GPT-4o input
"gpt-4o": 0.005,
"gpt-4o": 0.0025,
"gpt-4o-2024-05-13": 0.005,
"gpt-4o-2024-08-06": 0.0025,
# GPT-4o output
"gpt-4o-completion": 0.015,
"gpt-4o-completion": 0.01,
"gpt-4o-2024-05-13-completion": 0.015,
"gpt-4o-2024-08-06-completion": 0.01,
# GPT-4 input

View File

@@ -149,6 +149,7 @@ if TYPE_CHECKING:
)
from langchain_community.chat_models.sambanova import (
ChatSambaNovaCloud,
ChatSambaStudio,
)
from langchain_community.chat_models.snowflake import (
ChatSnowflakeCortex,
@@ -215,6 +216,7 @@ __all__ = [
"ChatPerplexity",
"ChatPremAI",
"ChatSambaNovaCloud",
"ChatSambaStudio",
"ChatSparkLLM",
"ChatSnowflakeCortex",
"ChatTongyi",
@@ -274,6 +276,7 @@ _module_lookup = {
"ChatOpenAI": "langchain_community.chat_models.openai",
"ChatPerplexity": "langchain_community.chat_models.perplexity",
"ChatSambaNovaCloud": "langchain_community.chat_models.sambanova",
"ChatSambaStudio": "langchain_community.chat_models.sambanova",
"ChatSnowflakeCortex": "langchain_community.chat_models.snowflake",
"ChatSparkLLM": "langchain_community.chat_models.sparkllm",
"ChatTongyi": "langchain_community.chat_models.tongyi",

View File

@@ -342,7 +342,7 @@ class ChatLlamaCpp(BaseChatModel):
self,
tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
*,
tool_choice: Optional[Union[Dict[str, Dict], bool, str]] = None,
tool_choice: Optional[Union[dict, bool, str]] = None,
**kwargs: Any,
) -> Runnable[LanguageModelInput, BaseMessage]:
"""Bind tool-like objects to this chat model
@@ -538,7 +538,8 @@ class ChatLlamaCpp(BaseChatModel):
"Received None."
)
tool_name = convert_to_openai_tool(schema)["function"]["name"]
llm = self.bind_tools([schema], tool_choice=tool_name)
tool_choice = {"type": "function", "function": {"name": tool_name}}
llm = self.bind_tools([schema], tool_choice=tool_choice)
if is_pydantic_schema:
output_parser: OutputParserLike = PydanticToolsParser(
tools=[cast(Type, schema)], first_tool_only=True

File diff suppressed because it is too large

View File

@@ -17,7 +17,7 @@ from langchain_core.utils import (
get_pydantic_field_names,
pre_init,
)
from langchain_core.utils.utils import build_extra_kwargs
from langchain_core.utils.utils import _build_model_kwargs
from pydantic import Field, SecretStr, model_validator
SUPPORTED_ROLES: List[str] = [
@@ -131,10 +131,7 @@ class ChatSnowflakeCortex(BaseChatModel):
def build_extra(cls, values: Dict[str, Any]) -> Any:
"""Build extra kwargs from additional params that were passed in."""
all_required_field_names = get_pydantic_field_names(cls)
extra = values.get("model_kwargs", {})
values["model_kwargs"] = build_extra_kwargs(
extra, values, all_required_field_names
)
values = _build_model_kwargs(values, all_required_field_names)
return values
@pre_init

View File

@@ -430,7 +430,7 @@ class Neo4jGraph(GraphStore):
try:
data, _, _ = self._driver.execute_query(
Query(text=query, timeout=self.timeout),
database=self._database,
database_=self._database,
parameters_=params,
)
json_data = [r.data() for r in data]
@@ -457,7 +457,7 @@ class Neo4jGraph(GraphStore):
):
raise
# fallback to allow implicit transactions
with self._driver.session() as session:
with self._driver.session(database=self._database) as session:
data = session.run(Query(text=query, timeout=self.timeout), params)
json_data = [r.data() for r in data]
if self.sanitize:
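
Both hunks fix how the target database is passed to the neo4j Python driver: `execute_query` takes driver-level keyword arguments with a trailing underscore (`database_`, `parameters_`), while an explicit `Session` takes a plain `database=`. A small sketch under assumed connection details (the `Neo4jVector` diff further down repeats the same fix).

```python
from neo4j import GraphDatabase, Query

# Placeholder URI and credentials.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# Driver-level API: trailing-underscore keywords.
records, _, _ = driver.execute_query(
    Query(text="RETURN 1 AS n", timeout=10),
    database_="neo4j",
    parameters_={},
)

# Explicit-session fallback, as in the second hunk: plain keyword.
with driver.session(database="neo4j") as session:
    rows = [r.data() for r in session.run(Query(text="RETURN 1 AS n"), {})]
```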

View File

@@ -26,7 +26,7 @@ from langchain_core.utils import (
get_pydantic_field_names,
pre_init,
)
from langchain_core.utils.utils import build_extra_kwargs, convert_to_secret_str
from langchain_core.utils.utils import _build_model_kwargs, convert_to_secret_str
from pydantic import ConfigDict, Field, SecretStr, model_validator
@@ -69,11 +69,8 @@ class _AnthropicCommon(BaseLanguageModel):
@model_validator(mode="before")
@classmethod
def build_extra(cls, values: Dict) -> Any:
extra = values.get("model_kwargs", {})
all_required_field_names = get_pydantic_field_names(cls)
values["model_kwargs"] = build_extra_kwargs(
extra, values, all_required_field_names
)
values = _build_model_kwargs(values, all_required_field_names)
return values
@pre_init

View File

@@ -9,7 +9,7 @@ from langchain_core.callbacks.manager import CallbackManagerForLLMRun
from langchain_core.language_models.llms import BaseLLM
from langchain_core.outputs import Generation, LLMResult
from langchain_core.utils import convert_to_secret_str, get_from_dict_or_env
from pydantic import BaseModel, SecretStr, model_validator, validator
from pydantic import BaseModel, ConfigDict, SecretStr, model_validator, validator
DEFAULT_TIMEOUT = 50
@@ -382,6 +382,8 @@ class AzureMLBaseEndpoint(BaseModel):
model_kwargs: Optional[dict] = None
"""Keyword arguments to pass to the model."""
model_config = ConfigDict(protected_namespaces=())
@model_validator(mode="before")
@classmethod
def validate_environ(cls, values: Dict) -> Any:

View File

@@ -295,6 +295,8 @@ class LLMInputOutputAdapter:
class BedrockBase(BaseModel, ABC):
"""Base class for Bedrock models."""
model_config = ConfigDict(protected_namespaces=())
client: Any = Field(exclude=True) #: :meta private:
region_name: Optional[str] = None

View File

@@ -8,7 +8,7 @@ from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models.llms import LLM
from langchain_core.outputs import GenerationChunk
from langchain_core.utils import get_pydantic_field_names, pre_init
from langchain_core.utils.utils import build_extra_kwargs
from langchain_core.utils.utils import _build_model_kwargs
from pydantic import Field, model_validator
logger = logging.getLogger(__name__)
@@ -199,10 +199,7 @@ class LlamaCpp(LLM):
def build_model_kwargs(cls, values: Dict[str, Any]) -> Any:
"""Build extra kwargs from additional params that were passed in."""
all_required_field_names = get_pydantic_field_names(cls)
extra = values.get("model_kwargs", {})
values["model_kwargs"] = build_extra_kwargs(
extra, values, all_required_field_names
)
values = _build_model_kwargs(values, all_required_field_names)
return values
@property

View File

@@ -16,7 +16,7 @@ from langchain_core.callbacks import (
)
from langchain_core.language_models.llms import LLM
from langchain_core.utils import convert_to_secret_str, get_from_dict_or_env, pre_init
from pydantic import BaseModel, Field, SecretStr, model_validator
from pydantic import BaseModel, ConfigDict, Field, SecretStr, model_validator
from langchain_community.llms.utils import enforce_stop_tokens
@@ -58,6 +58,8 @@ class _MinimaxEndpointClient(BaseModel):
class MinimaxCommon(BaseModel):
"""Common parameters for Minimax large language models."""
model_config = ConfigDict(protected_namespaces=())
_client: _MinimaxEndpointClient
model: str = "abab5.5-chat"
"""Model name to use."""

View File

@@ -34,7 +34,7 @@ from langchain_core.utils import (
pre_init,
)
from langchain_core.utils.pydantic import get_fields
from langchain_core.utils.utils import build_extra_kwargs
from langchain_core.utils.utils import _build_model_kwargs
from pydantic import ConfigDict, Field, model_validator
from langchain_community.utils.openai import is_openai_v1
@@ -268,10 +268,7 @@ class BaseOpenAI(BaseLLM):
def build_extra(cls, values: Dict[str, Any]) -> Any:
"""Build extra kwargs from additional params that were passed in."""
all_required_field_names = get_pydantic_field_names(cls)
extra = values.get("model_kwargs", {})
values["model_kwargs"] = build_extra_kwargs(
extra, values, all_required_field_names
)
values = _build_model_kwargs(values, all_required_field_names)
return values
@pre_init

View File

@@ -11,7 +11,7 @@ from langchain_core.callbacks.manager import (
from langchain_core.language_models.llms import BaseLLM
from langchain_core.outputs import Generation, GenerationChunk, LLMResult
from langchain_core.utils import pre_init
from pydantic import BaseModel, Field
from pydantic import BaseModel, ConfigDict, Field
from langchain_community.utilities.vertexai import (
create_retry_decorator,
@@ -100,6 +100,8 @@ async def acompletion_with_retry(
class _VertexAIBase(BaseModel):
model_config = ConfigDict(protected_namespaces=())
project: Optional[str] = None
"The default GCP project to use when making Vertex API calls."
location: str = "us-central1"

View File

@@ -6,12 +6,14 @@ from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models.llms import LLM
from langchain_core.outputs import GenerationChunk
from langchain_core.utils import convert_to_secret_str, get_from_dict_or_env, pre_init
from pydantic import BaseModel, Field, SecretStr
from pydantic import BaseModel, ConfigDict, Field, SecretStr
class VolcEngineMaasBase(BaseModel):
"""Base class for VolcEngineMaas models."""
model_config = ConfigDict(protected_namespaces=())
client: Any = None
volc_engine_maas_ak: Optional[SecretStr] = None

View File

@@ -95,7 +95,7 @@ class SQLStore(BaseStore[str, bytes]):
.. code-block:: python
from langchain_rag.storage import SQLStore
from langchain_community.storage import SQLStore
# Instantiate the SQLStore with the root path
sql_store = SQLStore(namespace="test", db_url="sqlite://:memory:")

View File

@@ -1,5 +1,8 @@
import inspect
import json
import logging
import os
import time
from dataclasses import dataclass
from io import StringIO
from typing import TYPE_CHECKING, Any, Dict, List, Literal, Optional
@@ -7,7 +10,7 @@ from typing import TYPE_CHECKING, Any, Dict, List, Literal, Optional
if TYPE_CHECKING:
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.catalog import FunctionInfo
from databricks.sdk.service.sql import StatementParameterListItem
from databricks.sdk.service.sql import StatementParameterListItem, StatementState
EXECUTE_FUNCTION_ARG_NAME = "__execution_args__"
DEFAULT_EXECUTE_FUNCTION_ARGS = {
@@ -15,6 +18,9 @@ DEFAULT_EXECUTE_FUNCTION_ARGS = {
"row_limit": 100,
"byte_limit": 4096,
}
UC_TOOL_CLIENT_EXECUTION_TIMEOUT = "UC_TOOL_CLIENT_EXECUTION_TIMEOUT"
DEFAULT_UC_TOOL_CLIENT_EXECUTION_TIMEOUT = "120"
_logger = logging.getLogger(__name__)
def is_scalar(function: "FunctionInfo") -> bool:
@@ -174,13 +180,42 @@ def execute_function(
parameters=parametrized_statement.parameters,
**execute_statement_args, # type: ignore
)
status = response.status
assert status is not None, f"Statement execution failed: {response}"
if status.state != StatementState.SUCCEEDED:
error = status.error
if response.status and job_pending(response.status.state) and response.statement_id:
statement_id = response.statement_id
wait_time = 0
retry_cnt = 0
client_execution_timeout = int(
os.environ.get(
UC_TOOL_CLIENT_EXECUTION_TIMEOUT,
DEFAULT_UC_TOOL_CLIENT_EXECUTION_TIMEOUT,
)
)
while wait_time < client_execution_timeout:
wait = min(2**retry_cnt, client_execution_timeout - wait_time)
_logger.debug(
f"Retrying {retry_cnt} time to get statement execution "
f"status after {wait} seconds."
)
time.sleep(wait)
response = ws.statement_execution.get_statement(statement_id) # type: ignore
if response.status is None or not job_pending(response.status.state):
break
wait_time += wait
retry_cnt += 1
if response.status and job_pending(response.status.state):
return FunctionExecutionResult(
error=f"Statement execution is still pending after {wait_time} "
"seconds. Please increase the wait_timeout argument for executing "
f"the function or increase {UC_TOOL_CLIENT_EXECUTION_TIMEOUT} "
"environment variable for increasing retrying time, default is "
f"{DEFAULT_UC_TOOL_CLIENT_EXECUTION_TIMEOUT} seconds."
)
assert response.status is not None, f"Statement execution failed: {response}"
if response.status.state != StatementState.SUCCEEDED:
error = response.status.error
assert (
error is not None
), "Statement execution failed but no error message was provided."
), f"Statement execution failed but no error message was provided: {response}"
return FunctionExecutionResult(error=f"{error.error_code}: {error.message}")
manifest = response.manifest
assert manifest is not None
@@ -211,3 +246,9 @@ def execute_function(
return FunctionExecutionResult(
format="CSV", value=csv_buffer.getvalue(), truncated=truncated
)
def job_pending(state: Optional["StatementState"]) -> bool:
from databricks.sdk.service.sql import StatementState
return state in (StatementState.PENDING, StatementState.RUNNING)
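
The block above adds client-side polling when a UC statement comes back still PENDING/RUNNING: exponential backoff capped by a total budget read from `UC_TOOL_CLIENT_EXECUTION_TIMEOUT`. A stripped-down, standalone sketch of that loop; `fetch_state` stands in for `ws.statement_execution.get_statement`.

```python
import os
import time
from typing import Callable

def wait_until_not_pending(fetch_state: Callable[[], str]) -> str:
    timeout = int(os.environ.get("UC_TOOL_CLIENT_EXECUTION_TIMEOUT", "120"))
    wait_time = 0
    retry_cnt = 0
    state = fetch_state()
    while state in ("PENDING", "RUNNING") and wait_time < timeout:
        # Back off 1, 2, 4, ... seconds but never past the remaining budget.
        wait = min(2 ** retry_cnt, timeout - wait_time)
        time.sleep(wait)
        wait_time += wait
        retry_cnt += 1
        state = fetch_state()
    return state

# wait_until_not_pending(lambda: "SUCCEEDED") returns immediately.
```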

View File

@@ -1769,6 +1769,8 @@ def _reorder_results_with_maximal_marginal_relevance(
)
for result in results
]
if not docs:
return []
documents, scores, vectors = map(list, zip(*docs))
# Get the new order of results.
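
The early return guards the unpacking that follows: with an empty result list, `zip(*docs)` yields nothing to unpack into the three expected sequences. A two-line illustration of the failure it prevents:

```python
docs: list = []
try:
    documents, scores, vectors = map(list, zip(*docs))
except ValueError as err:
    print(err)  # not enough values to unpack (expected 3, got 0)
```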

View File

@@ -1,6 +1,7 @@
from __future__ import annotations
import asyncio
import importlib.metadata
import typing
import uuid
from typing import (
@@ -18,6 +19,7 @@ from typing import (
)
import numpy as np
from packaging.version import Version # this is a langchain-core dependency
if typing.TYPE_CHECKING:
from cassandra.cluster import Session
@@ -30,6 +32,7 @@ from langchain_community.utilities.cassandra import SetupMode
from langchain_community.vectorstores.utils import maximal_marginal_relevance
CVST = TypeVar("CVST", bound="Cassandra")
MIN_CASSIO_VERSION = Version("0.1.10")
class Cassandra(VectorStore):
@@ -110,6 +113,15 @@ class Cassandra(VectorStore):
"Could not import cassio python package. "
"Please install it with `pip install cassio`."
)
cassio_version = Version(importlib.metadata.version("cassio"))
if cassio_version is not None and cassio_version < MIN_CASSIO_VERSION:
msg = (
"Cassio version not supported. Please upgrade cassio "
f"to version {MIN_CASSIO_VERSION} or higher."
)
raise ImportError(msg)
if not table_name:
raise ValueError("Missing required parameter 'table_name'.")
self.embedding = embedding
@@ -143,6 +155,9 @@ class Cassandra(VectorStore):
**kwargs,
)
if self.session is None:
self.session = self.table.session
@property
def embeddings(self) -> Embeddings:
return self.embedding
@@ -231,6 +246,70 @@ class Cassandra(VectorStore):
await self.adelete_by_document_id(document_id)
return True
def delete_by_metadata_filter(
self,
filter: dict[str, Any],
*,
batch_size: int = 50,
) -> int:
"""Delete all documents matching a certain metadata filtering condition.
This operation does not use the vector embeddings in any way, it simply
removes all documents whose metadata match the provided condition.
Args:
filter: Filter on the metadata to apply. The filter cannot be empty.
batch_size: amount of deletions per each batch (until exhaustion of
the matching documents).
Returns:
A number expressing the amount of deleted documents.
"""
if not filter:
msg = (
"Method `delete_by_metadata_filter` does not accept an empty "
"filter. Use the `clear()` method if you really want to empty "
"the vector store."
)
raise ValueError(msg)
return self.table.find_and_delete_entries(
metadata=filter,
batch_size=batch_size,
)
async def adelete_by_metadata_filter(
self,
filter: dict[str, Any],
*,
batch_size: int = 50,
) -> int:
"""Delete all documents matching a certain metadata filtering condition.
This operation does not use the vector embeddings in any way, it simply
removes all documents whose metadata match the provided condition.
Args:
filter: Filter on the metadata to apply. The filter cannot be empty.
batch_size: amount of deletions per each batch (until exhaustion of
the matching documents).
Returns:
A number expressing the amount of deleted documents.
"""
if not filter:
msg = (
"Method `delete_by_metadata_filter` does not accept an empty "
"filter. Use the `clear()` method if you really want to empty "
"the vector store."
)
raise ValueError(msg)
return await self.table.afind_and_delete_entries(
metadata=filter,
batch_size=batch_size,
)
def add_texts(
self,
texts: Iterable[str],
@@ -333,6 +412,180 @@ class Cassandra(VectorStore):
await asyncio.gather(*tasks)
return ids
def replace_metadata(
self,
id_to_metadata: dict[str, dict],
*,
batch_size: int = 50,
) -> None:
"""Replace the metadata of documents.
For each document to update, identified by its ID, the new metadata
dictionary completely replaces what is on the store. This includes
passing empty metadata `{}` to erase the currently-stored information.
Args:
id_to_metadata: map from the Document IDs to modify to the
new metadata for updating.
Keys in this dictionary that do not correspond to an existing
document will not cause an error, rather will result in new
rows being written into the Cassandra table but without an
associated vector: hence unreachable through vector search.
batch_size: Number of concurrent requests to send to the server.
Returns:
None if the writes succeed (otherwise an error is raised).
"""
ids_and_metadatas = list(id_to_metadata.items())
for i in range(0, len(ids_and_metadatas), batch_size):
batch_i_m = ids_and_metadatas[i : i + batch_size]
futures = [
self.table.put_async(
row_id=doc_id,
metadata=doc_md,
)
for doc_id, doc_md in batch_i_m
]
for future in futures:
future.result()
return
async def areplace_metadata(
self,
id_to_metadata: dict[str, dict],
*,
concurrency: int = 50,
) -> None:
"""Replace the metadata of documents.
For each document to update, identified by its ID, the new metadata
dictionary completely replaces what is on the store. This includes
passing empty metadata `{}` to erase the currently-stored information.
Args:
id_to_metadata: map from the Document IDs to modify to the
new metadata for updating.
Keys in this dictionary that do not correspond to an existing
document will not cause an error, rather will result in new
rows being written into the Cassandra table but without an
associated vector: hence unreachable through vector search.
concurrency: Number of concurrent queries to the database.
Defaults to 50.
Returns:
None if the writes succeed (otherwise an error is raised).
"""
ids_and_metadatas = list(id_to_metadata.items())
sem = asyncio.Semaphore(concurrency)
async def send_concurrently(doc_id: str, doc_md: dict) -> None:
async with sem:
await self.table.aput(
row_id=doc_id,
metadata=doc_md,
)
for doc_id, doc_md in ids_and_metadatas:
tasks = [asyncio.create_task(send_concurrently(doc_id, doc_md))]
await asyncio.gather(*tasks)
return
@staticmethod
def _row_to_document(row: Dict[str, Any]) -> Document:
return Document(
id=row["row_id"],
page_content=row["body_blob"],
metadata=row["metadata"],
)
def get_by_document_id(self, document_id: str) -> Document | None:
"""Get by document ID.
Args:
document_id: the document ID to get.
"""
row = self.table.get(row_id=document_id)
if row is None:
return None
return self._row_to_document(row=row)
async def aget_by_document_id(self, document_id: str) -> Document | None:
"""Get by document ID.
Args:
document_id: the document ID to get.
"""
row = await self.table.aget(row_id=document_id)
if row is None:
return None
return self._row_to_document(row=row)
def metadata_search(
self,
metadata: dict[str, Any] = {}, # noqa: B006
n: int = 5,
) -> Iterable[Document]:
"""Get documents via a metadata search.
Args:
metadata: the metadata to query for.
"""
rows = self.table.find_entries(metadata=metadata, n=n)
return [self._row_to_document(row=row) for row in rows if row]
async def ametadata_search(
self,
metadata: dict[str, Any] = {}, # noqa: B006
n: int = 5,
) -> Iterable[Document]:
"""Get documents via a metadata search.
Args:
metadata: the metadata to query for.
"""
rows = await self.table.afind_entries(metadata=metadata, n=n)
return [self._row_to_document(row=row) for row in rows]
async def asimilarity_search_with_embedding_id_by_vector(
self,
embedding: List[float],
k: int = 4,
filter: Optional[Dict[str, str]] = None,
body_search: Optional[Union[str, List[str]]] = None,
) -> List[Tuple[Document, List[float], str]]:
"""Return docs most similar to embedding vector.
Args:
embedding: Embedding to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
filter: Filter on the metadata to apply.
body_search: Document textual search terms to apply.
Only supported by Astra DB at the moment.
Returns:
List of (Document, embedding, id), the most similar to the query vector.
"""
kwargs: Dict[str, Any] = {}
if filter is not None:
kwargs["metadata"] = filter
if body_search is not None:
kwargs["body_search"] = body_search
hits = await self.table.aann_search(
vector=embedding,
n=k,
**kwargs,
)
return [
(
self._row_to_document(row=hit),
hit["vector"],
hit["row_id"],
)
for hit in hits
]
@staticmethod
def _search_to_documents(
hits: Iterable[Dict[str, Any]],
@@ -341,10 +594,7 @@ class Cassandra(VectorStore):
# (1=most relevant), as required by this class' contract.
return [
(
Document(
page_content=hit["body_blob"],
metadata=hit["metadata"],
),
Cassandra._row_to_document(row=hit),
0.5 + 0.5 * hit["distance"],
hit["row_id"],
)
@@ -375,7 +625,6 @@ class Cassandra(VectorStore):
kwargs["metadata"] = filter
if body_search is not None:
kwargs["body_search"] = body_search
hits = self.table.metric_ann_search(
vector=embedding,
n=k,
@@ -712,13 +961,7 @@ class Cassandra(VectorStore):
for pf_index, pf_hit in enumerate(prefetch_hits)
if pf_index in mmr_chosen_indices
]
return [
Document(
page_content=hit["body_blob"],
metadata=hit["metadata"],
)
for hit in mmr_hits
]
return [Cassandra._row_to_document(row=hit) for hit in mmr_hits]
def max_marginal_relevance_search_by_vector(
self,
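
A hedged usage sketch of the metadata-centric methods this hunk introduces (`replace_metadata`, `metadata_search`, `delete_by_metadata_filter`); the contact point, keyspace and table names are placeholders, and a reachable Cassandra cluster plus `pip install cassio` are required.

```python
from cassandra.cluster import Cluster
from langchain_community.embeddings import FakeEmbeddings
from langchain_community.vectorstores import Cassandra

session = Cluster(["127.0.0.1"]).connect()
vstore = Cassandra(
    embedding=FakeEmbeddings(size=16),
    session=session,
    keyspace="demo_ks",
    table_name="demo_table",
)
vstore.add_texts(["foo", "bar"], metadatas=[{"tag": "a"}, {"tag": "b"}], ids=["d1", "d2"])

# Replace (not merge) the stored metadata for one document id:
vstore.replace_metadata({"d1": {"tag": "a", "reviewed": "yes"}})

# Pure metadata lookup, no vector search involved:
docs = vstore.metadata_search(metadata={"tag": "b"}, n=5)

# Bulk delete by metadata filter; returns how many rows were removed:
deleted = vstore.delete_by_metadata_filter({"tag": "a"})
```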

View File

@@ -5,9 +5,10 @@ from __future__ import annotations
import json
import logging
import uuid
from typing import Any, Iterable, List, Optional, Tuple, Type, cast
import warnings
from typing import Any, Iterable, List, Optional, Tuple, Type, Union, cast
import requests
from httpx import Response
from langchain_core.documents import Document
from langchain_core.embeddings import Embeddings
from langchain_core.vectorstores import VectorStore
@@ -49,7 +50,7 @@ class InfinispanVS(VectorStore):
embedding=RGBEmbeddings(),
output_fields: ["texture", "color"],
lambda_key: lambda text,meta: str(meta["_key"]),
lambda_content: lambda item: item["color"]})
lambda_content: lambda item: item["color"])
"""
def __init__(
@@ -58,13 +59,48 @@ class InfinispanVS(VectorStore):
ids: Optional[List[str]] = None,
**kwargs: Any,
):
"""
Parameters
----------
cache_name: str
Embeddings cache name. Default "vector"
entity_name: str
Protobuf entity name for the embeddings. Default "vector"
text_field: str
Protobuf field name for text. Default "text"
vector_field: str
Protobuf field name for vector. Default "vector"
lambda_content: lambda
Lambda returning the content part of an item. Default returns text_field
lambda_metadata: lambda
Lambda returning the metadata part of an item. Default returns items
fields excepts text_field, vector_field, _type
output_fields: List[str]
List of fields to be returned from item, if None return all fields.
Default None
kwargs: Any
Rest of arguments passed to Infinispan. See docs"""
self.ispn = Infinispan(**kwargs)
self._configuration = kwargs
self._cache_name = str(self._configuration.get("cache_name", "vector"))
self._entity_name = str(self._configuration.get("entity_name", "vector"))
self._embedding = embedding
self._textfield = self._configuration.get("textfield", "text")
self._vectorfield = self._configuration.get("vectorfield", "vector")
self._textfield = self._configuration.get("textfield", "")
if self._textfield == "":
self._textfield = self._configuration.get("text_field", "text")
else:
warnings.warn(
"`textfield` is deprecated. Please use `text_field` " "param.",
DeprecationWarning,
)
self._vectorfield = self._configuration.get("vectorfield", "")
if self._vectorfield == "":
self._vectorfield = self._configuration.get("vector_field", "vector")
else:
warnings.warn(
"`vectorfield` is deprecated. Please use `vector_field` " "param.",
DeprecationWarning,
)
self._to_content = self._configuration.get(
"lambda_content", lambda item: self._default_content(item)
)
@@ -121,7 +157,7 @@ repeated float %s = 1;
metadata_proto += "}\n"
return metadata_proto
def schema_create(self, proto: str) -> requests.Response:
def schema_create(self, proto: str) -> Response:
"""Deploy the schema for the vector db
Args:
proto(str): protobuf schema
@@ -130,14 +166,14 @@ repeated float %s = 1;
"""
return self.ispn.schema_post(self._entity_name + ".proto", proto)
def schema_delete(self) -> requests.Response:
def schema_delete(self) -> Response:
"""Delete the schema for the vector db
Returns:
An http Response containing the result of the operation
"""
return self.ispn.schema_delete(self._entity_name + ".proto")
def cache_create(self, config: str = "") -> requests.Response:
def cache_create(self, config: str = "") -> Response:
"""Create the cache for the vector db
Args:
config(str): configuration of the cache.
@@ -172,14 +208,14 @@ repeated float %s = 1;
)
return self.ispn.cache_post(self._cache_name, config)
def cache_delete(self) -> requests.Response:
def cache_delete(self) -> Response:
"""Delete the cache for the vector db
Returns:
An http Response containing the result of the operation
"""
return self.ispn.cache_delete(self._cache_name)
def cache_clear(self) -> requests.Response:
def cache_clear(self) -> Response:
"""Clear the cache for the vector db
Returns:
An http Response containing the result of the operation
@@ -193,14 +229,14 @@ repeated float %s = 1;
"""
return self.ispn.cache_exists(self._cache_name)
def cache_index_clear(self) -> requests.Response:
def cache_index_clear(self) -> Response:
"""Clear the index for the vector db
Returns:
An http Response containing the result of the operation
"""
return self.ispn.index_clear(self._cache_name)
def cache_index_reindex(self) -> requests.Response:
def cache_index_reindex(self) -> Response:
"""Rebuild the for the vector db
Returns:
An http Response containing the result of the operation
@@ -325,12 +361,16 @@ repeated float %s = 1;
def configure(self, metadata: dict, dimension: int) -> None:
schema = self.schema_builder(metadata, dimension)
output = self.schema_create(schema)
assert output.ok, "Unable to create schema. Already exists? "
assert (
output.status_code == self.ispn.Codes.OK
), "Unable to create schema. Already exists? "
"Consider using clear_old=True"
assert json.loads(output.text)["error"] is None
if not self.cache_exists():
output = self.cache_create()
assert output.ok, "Unable to create cache. Already exists? "
assert (
output.status_code == self.ispn.Codes.OK
), "Unable to create cache. Already exists? "
"Consider using clear_old=True"
# Ensure index is clean
self.cache_index_clear()
@@ -350,7 +390,24 @@ repeated float %s = 1;
auto_config: Optional[bool] = True,
**kwargs: Any,
) -> InfinispanVS:
"""Return VectorStore initialized from texts and embeddings."""
"""Return VectorStore initialized from texts and embeddings.
In addition to parameters described by the super method, this
implementation provides other configuration params if different
configuration from default is needed.
Parameters
----------
ids : List[str]
Additional list of keys associated to the embedding. If not
provided UUIDs will be generated
clear_old : bool
Whether old data must be deleted. Default True
auto_config: bool
Whether to do a complete server setup (caches,
protobuf definition...). Default True
kwargs: Any
Rest of arguments passed to InfinispanVS. See docs"""
infinispanvs = cls(embedding=embedding, ids=ids, **kwargs)
if auto_config and len(metadatas or []) > 0:
if clear_old:
@@ -381,20 +438,83 @@ class Infinispan:
https://github.com/rigazilla/infinispan-vector#run-infinispan
"""
def __init__(self, **kwargs: Any):
self._configuration = kwargs
self._schema = str(self._configuration.get("schema", "http"))
self._host = str(self._configuration.get("hosts", ["127.0.0.1:11222"])[0])
self._default_node = self._schema + "://" + self._host
self._cache_url = str(self._configuration.get("cache_url", "/rest/v2/caches"))
self._schema_url = str(self._configuration.get("cache_url", "/rest/v2/schemas"))
self._use_post_for_query = str(
self._configuration.get("use_post_for_query", True)
)
def __init__(
self,
schema: str = "http",
user: str = "",
password: str = "",
hosts: List[str] = ["127.0.0.1:11222"],
cache_url: str = "/rest/v2/caches",
schema_url: str = "/rest/v2/schemas",
use_post_for_query: bool = True,
http2: bool = True,
verify: bool = True,
**kwargs: Any,
):
"""
Parameters
----------
schema: str
Schema for HTTP request: "http" or "https". Default "http"
user, password: str
User and password if auth is required. Default None
hosts: List[str]
List of server addresses. Default ["127.0.0.1:11222"]
cache_url: str
URL endpoint for cache API. Default "/rest/v2/caches"
schema_url: str
URL endpoint for schema API. Default "/rest/v2/schemas"
use_post_for_query: bool
Whether POST method should be used for query. Default True
http2: bool
Whether HTTP/2 protocol should be used. `pip install "httpx[http2]"` is
needed for HTTP/2. Default True
verify: bool
Whether TLS certificate must be verified. Default True
"""
def req_query(
self, query: str, cache_name: str, local: bool = False
) -> requests.Response:
try:
import httpx
except ImportError:
raise ImportError(
"Could not import httpx python package. "
"Please install it with `pip install httpx`"
'or `pip install "httpx[http2]"` if you need HTTP/2.'
)
self.Codes = httpx.codes
self._configuration = kwargs
self._schema = schema
self._user = user
self._password = password
self._host = hosts[0]
self._default_node = self._schema + "://" + self._host
self._cache_url = cache_url
self._schema_url = schema_url
self._use_post_for_query = use_post_for_query
self._http2 = http2
if self._user and self._password:
if self._schema == "http":
auth: Union[Tuple[str, str], httpx.DigestAuth] = httpx.DigestAuth(
username=self._user, password=self._password
)
else:
auth = (self._user, self._password)
self._h2c = httpx.Client(
http2=self._http2,
http1=not self._http2,
auth=auth,
verify=verify,
)
else:
self._h2c = httpx.Client(
http2=self._http2,
http1=not self._http2,
verify=verify,
)
def req_query(self, query: str, cache_name: str, local: bool = False) -> Response:
"""Request a query
Args:
query(str): query requested
@@ -409,7 +529,7 @@ class Infinispan:
def _query_post(
self, query_str: str, cache_name: str, local: bool = False
) -> requests.Response:
) -> Response:
api_url = (
self._default_node
+ self._cache_url
@@ -420,9 +540,9 @@ class Infinispan:
)
data = {"query": query_str}
data_json = json.dumps(data)
response = requests.post(
response = self._h2c.post(
api_url,
data_json,
content=data_json,
headers={"Content-Type": "application/json"},
timeout=REST_TIMEOUT,
)
@@ -430,7 +550,7 @@ class Infinispan:
def _query_get(
self, query_str: str, cache_name: str, local: bool = False
) -> requests.Response:
) -> Response:
api_url = (
self._default_node
+ self._cache_url
@@ -441,10 +561,10 @@ class Infinispan:
+ "&local="
+ str(local)
)
response = requests.get(api_url, timeout=REST_TIMEOUT)
response = self._h2c.get(api_url, timeout=REST_TIMEOUT)
return response
def post(self, key: str, data: str, cache_name: str) -> requests.Response:
def post(self, key: str, data: str, cache_name: str) -> Response:
"""Post an entry
Args:
key(str): key of the entry
@@ -454,15 +574,15 @@ class Infinispan:
An http Response containing the result of the operation
"""
api_url = self._default_node + self._cache_url + "/" + cache_name + "/" + key
response = requests.post(
response = self._h2c.post(
api_url,
data,
content=data,
headers={"Content-Type": "application/json"},
timeout=REST_TIMEOUT,
)
return response
def put(self, key: str, data: str, cache_name: str) -> requests.Response:
def put(self, key: str, data: str, cache_name: str) -> Response:
"""Put an entry
Args:
key(str): key of the entry
@@ -472,15 +592,15 @@ class Infinispan:
An http Response containing the result of the operation
"""
api_url = self._default_node + self._cache_url + "/" + cache_name + "/" + key
response = requests.put(
response = self._h2c.put(
api_url,
data,
content=data,
headers={"Content-Type": "application/json"},
timeout=REST_TIMEOUT,
)
return response
def get(self, key: str, cache_name: str) -> requests.Response:
def get(self, key: str, cache_name: str) -> Response:
"""Get an entry
Args:
key(str): key of the entry
@@ -489,12 +609,12 @@ class Infinispan:
An http Response containing the entry or errors
"""
api_url = self._default_node + self._cache_url + "/" + cache_name + "/" + key
response = requests.get(
response = self._h2c.get(
api_url, headers={"Content-Type": "application/json"}, timeout=REST_TIMEOUT
)
return response
def schema_post(self, name: str, proto: str) -> requests.Response:
def schema_post(self, name: str, proto: str) -> Response:
"""Deploy a schema
Args:
name(str): name of the schema. Will be used as a key
@@ -503,10 +623,10 @@ class Infinispan:
An http Response containing the result of the operation
"""
api_url = self._default_node + self._schema_url + "/" + name
response = requests.post(api_url, proto, timeout=REST_TIMEOUT)
response = self._h2c.post(api_url, content=proto, timeout=REST_TIMEOUT)
return response
def cache_post(self, name: str, config: str) -> requests.Response:
def cache_post(self, name: str, config: str) -> Response:
"""Create a cache
Args:
name(str): name of the cache.
@@ -515,15 +635,15 @@ class Infinispan:
An http Response containing the result of the operation
"""
api_url = self._default_node + self._cache_url + "/" + name
response = requests.post(
response = self._h2c.post(
api_url,
config,
content=config,
headers={"Content-Type": "application/json"},
timeout=REST_TIMEOUT,
)
return response
def schema_delete(self, name: str) -> requests.Response:
def schema_delete(self, name: str) -> Response:
"""Delete a schema
Args:
name(str): name of the schema.
@@ -531,10 +651,10 @@ class Infinispan:
An http Response containing the result of the operation
"""
api_url = self._default_node + self._schema_url + "/" + name
response = requests.delete(api_url, timeout=REST_TIMEOUT)
response = self._h2c.delete(api_url, timeout=REST_TIMEOUT)
return response
def cache_delete(self, name: str) -> requests.Response:
def cache_delete(self, name: str) -> Response:
"""Delete a cache
Args:
name(str): name of the cache.
@@ -542,10 +662,10 @@ class Infinispan:
An http Response containing the result of the operation
"""
api_url = self._default_node + self._cache_url + "/" + name
response = requests.delete(api_url, timeout=REST_TIMEOUT)
response = self._h2c.delete(api_url, timeout=REST_TIMEOUT)
return response
def cache_clear(self, cache_name: str) -> requests.Response:
def cache_clear(self, cache_name: str) -> Response:
"""Clear a cache
Args:
cache_name(str): name of the cache.
@@ -555,7 +675,7 @@ class Infinispan:
api_url = (
self._default_node + self._cache_url + "/" + cache_name + "?action=clear"
)
response = requests.post(api_url, timeout=REST_TIMEOUT)
response = self._h2c.post(api_url, timeout=REST_TIMEOUT)
return response
def cache_exists(self, cache_name: str) -> bool:
@@ -570,18 +690,17 @@ class Infinispan:
)
return self.resource_exists(api_url)
@staticmethod
def resource_exists(api_url: str) -> bool:
def resource_exists(self, api_url: str) -> bool:
"""Check if a resource exists
Args:
api_url(str): url of the resource.
Returns:
true if resource exists
"""
response = requests.head(api_url, timeout=REST_TIMEOUT)
return response.ok
response = self._h2c.head(api_url, timeout=REST_TIMEOUT)
return response.status_code == self.Codes.OK
def index_clear(self, cache_name: str) -> requests.Response:
def index_clear(self, cache_name: str) -> Response:
"""Clear an index on a cache
Args:
cache_name(str): name of the cache.
@@ -595,9 +714,9 @@ class Infinispan:
+ cache_name
+ "/search/indexes?action=clear"
)
return requests.post(api_url, timeout=REST_TIMEOUT)
return self._h2c.post(api_url, timeout=REST_TIMEOUT)
def index_reindex(self, cache_name: str) -> requests.Response:
def index_reindex(self, cache_name: str) -> Response:
"""Rebuild index on a cache
Args:
cache_name(str): name of the cache.
@@ -611,4 +730,4 @@ class Infinispan:
+ cache_name
+ "/search/indexes?action=reindex"
)
return requests.post(api_url, timeout=REST_TIMEOUT)
return self._h2c.post(api_url, timeout=REST_TIMEOUT)
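
A hedged sketch of connecting through the new authentication/TLS options; the host, credentials and `verify=False` mirror the docker test setup further below and are placeholders, and HTTP/2 needs `pip install "httpx[http2]"`.

```python
from langchain_community.embeddings import FakeEmbeddings
from langchain_community.vectorstores import InfinispanVS

vectorstore = InfinispanVS.from_texts(
    texts=["foo", "bar"],
    embedding=FakeEmbeddings(size=16),
    schema="https",                 # TLS endpoint
    hosts=["localhost:11242"],
    user="user",
    password="password",
    verify=False,                   # e.g. the self-signed cert in the docker setup
    http2=True,
)
docs = vectorstore.similarity_search("foo", k=1)
```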

View File

@@ -623,7 +623,7 @@ class Neo4jVector(VectorStore):
params = params or {}
try:
data, _, _ = self._driver.execute_query(
query, database=self._database, parameters_=params
query, database_=self._database, parameters_=params
)
return [r.data() for r in data]
except Neo4jError as e:
@@ -646,7 +646,7 @@ class Neo4jVector(VectorStore):
):
raise
# Fallback to allow implicit transactions
with self._driver.session() as session:
with self._driver.session(database=self._database) as session:
data = session.run(Query(text=query), params)
return [r.data() for r in data]

View File

@@ -1,6 +1,6 @@
from __future__ import annotations
from typing import Any, Dict, Iterable, List, Optional, Tuple
from typing import Any, Dict, Iterable, List, Optional, Tuple, Union
import numpy as np
from langchain_core.documents import Document
@@ -42,7 +42,7 @@ class USearch(VectorStore):
self,
texts: Iterable[str],
metadatas: Optional[List[Dict]] = None,
ids: Optional[np.ndarray] = None,
ids: Optional[Union[np.ndarray, list[str]]] = None,
**kwargs: Any,
) -> List[str]:
"""Run more texts through the embeddings and add to the vectorstore.
@@ -69,6 +69,8 @@ class USearch(VectorStore):
last_id = int(self.ids[-1]) + 1
if ids is None:
ids = np.array([str(last_id + id) for id, _ in enumerate(texts)])
elif isinstance(ids, list):
ids = np.array(ids)
self.index.add(np.array(ids), np.array(embeddings))
self.docstore.add(dict(zip(ids, documents)))
@@ -134,7 +136,7 @@ class USearch(VectorStore):
texts: List[str],
embedding: Embeddings,
metadatas: Optional[List[Dict]] = None,
ids: Optional[np.ndarray] = None,
ids: Optional[Union[np.ndarray, list[str]]] = None,
metric: str = "cos",
**kwargs: Any,
) -> USearch:
@@ -159,6 +161,8 @@ class USearch(VectorStore):
documents: List[Document] = []
if ids is None:
ids = np.array([str(id) for id, _ in enumerate(texts)])
elif isinstance(ids, list):
ids = np.array(ids)
for i, text in enumerate(texts):
metadata = metadatas[i] if metadatas else {}
documents.append(Document(page_content=text, metadata=metadata))
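
The two `elif isinstance(ids, list)` branches above boil down to one normalization step; a small illustrative helper (not part of the module) showing that plain string lists and numpy arrays now converge on the same ndarray:

```python
from typing import List, Optional, Union

import numpy as np

def _normalize_ids(ids: Optional[Union[np.ndarray, List[str]]], n: int) -> np.ndarray:
    if ids is None:
        return np.array([str(i) for i in range(n)])
    if isinstance(ids, list):
        return np.array(ids)
    return ids

assert (_normalize_ids(["a", "b"], 2) == _normalize_ids(np.array(["a", "b"]), 2)).all()
```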

File diff suppressed because it is too large

View File

@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api"
[tool.poetry]
name = "langchain-community"
version = "0.3.1"
version = "0.3.2"
description = "Community contributed LangChain integrations."
authors = []
license = "MIT"
@@ -33,13 +33,13 @@ ignore-words-list = "momento,collison,ned,foor,reworkd,parth,whats,aapply,mysogy
[tool.poetry.dependencies]
python = ">=3.9,<4.0"
langchain-core = "^0.3.6"
langchain = "^0.3.1"
langchain-core = "^0.3.10"
langchain = "^0.3.3"
SQLAlchemy = ">=1.4,<3"
requests = "^2"
PyYAML = ">=5.3"
aiohttp = "^3.8.3"
tenacity = "^8.1.0,!=8.4.0"
tenacity = ">=8.1.0,!=8.4.0,<10"
dataclasses-json = ">= 0.5.7, < 0.7"
pydantic-settings = "^2.4.0"
langsmith = "^0.1.125"

View File

@@ -0,0 +1,19 @@
from pydantic import BaseModel, Field
from langchain_community.chat_models import ChatLlamaCpp
class Joke(BaseModel):
"""Joke to tell user."""
setup: str = Field(description="question to set up a joke")
punchline: str = Field(description="answer to resolve the joke")
# TODO: replace with standard integration tests
# See example in tests/integration_tests/chat_models/test_litellm.py
def test_structured_output() -> None:
llm = ChatLlamaCpp(model_path="/path/to/Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf")
structured_llm = llm.with_structured_output(Joke)
result = structured_llm.invoke("Tell me a short joke about cats.")
assert isinstance(result, Joke)

View File

@@ -1,6 +1,9 @@
from langchain_core.messages import AIMessage, HumanMessage
from langchain_community.chat_models.sambanova import ChatSambaNovaCloud
from langchain_community.chat_models.sambanova import (
ChatSambaNovaCloud,
ChatSambaStudio,
)
def test_chat_sambanova_cloud() -> None:
@@ -9,3 +12,11 @@ def test_chat_sambanova_cloud() -> None:
response = chat.invoke([message])
assert isinstance(response, AIMessage)
assert isinstance(response.content, str)
def test_chat_sambastudio() -> None:
chat = ChatSambaStudio()
message = HumanMessage(content="Hello")
response = chat.invoke([message])
assert isinstance(response, AIMessage)
assert isinstance(response.content, str)

View File

@@ -0,0 +1,4 @@
#!/bin/sh
cd infinispan
docker compose up

View File

@@ -0,0 +1,2 @@
#Fri May 03 10:19:58 CEST 2024
user=ADMIN,admin

View File

@@ -0,0 +1,62 @@
<infinispan
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:infinispan:config:15.0 https://infinispan.org/schemas/infinispan-config-15.0.xsd
urn:infinispan:server:15.0 https://infinispan.org/schemas/infinispan-server-15.0.xsd"
xmlns="urn:infinispan:config:15.0"
xmlns:server="urn:infinispan:server:15.0">
<cache-container name="default" statistics="true">
<transport cluster="${infinispan.cluster.name:cluster}" stack="${infinispan.cluster.stack:tcp}" node-name="${infinispan.node.name:}"/>
</cache-container>
<server xmlns="urn:infinispan:server:15.0">
<interfaces>
<interface name="public">
<inet-address value="${infinispan.bind.address:127.0.0.1}"/>
</interface>
</interfaces>
<socket-bindings default-interface="public" port-offset="${infinispan.socket.binding.port-offset:0}">
<socket-binding name="default" port="${infinispan.bind.port:11222}"/>
<socket-binding name="authenticated" port="11232"/>
<socket-binding name="auth-tls" port="11242"/>
</socket-bindings>
<security>
<credential-stores>
<credential-store name="credentials" path="credentials.pfx">
<clear-text-credential clear-text="secret"/>
</credential-store>
</credential-stores>
<security-realms>
<security-realm name="default">
<properties-realm groups-attribute="Roles">
<user-properties path="/user-config/users.properties"/>
<group-properties path="/user-config/groups.properties"/>
</properties-realm>
</security-realm>
<security-realm name="tls">
<!-- Uncomment to enable TLS on the realm -->
<server-identities>
<ssl>
<keystore path="application.keystore"
password="password" alias="server"
generate-self-signed-certificate-host="localhost"/>
</ssl>
</server-identities>
<properties-realm groups-attribute="Roles">
<user-properties path="/user-config/users.properties"/>
<group-properties path="/user-config/groups.properties"/>
</properties-realm>
</security-realm>
</security-realms>
</security>
<endpoints>
<endpoint socket-binding="default"/>
<endpoint socket-binding="authenticated" security-realm="default"/>
<endpoint socket-binding="auth-tls" security-realm="tls"/>
</endpoints>
</server>
</infinispan>

View File

@@ -0,0 +1,4 @@
#$REALM_NAME=default$
#$ALGORITHM=encrypted$
#Fri May 03 10:19:58 CEST 2024
user=scram-sha-1\:BYGcIAws2gznU/kpezoSb1VQNVd+YMX9r+9SAINFoZtPHaHTAQ\=\=;scram-sha-256\:BYGcIAwRiWiD+8f7dyQEs1Wsum/64MOcjGJ2UcmZFQB6DZJqwRDJ4NrvII4NttmxlA\=\=;scram-sha-384\:BYGcIAz+Eud65N8GWK4TMwhSCZpeE5EFSdynywdryQj3ZwBEgv+KF8hRUuGxiq3EyRxsby6w7DHK3CICGZLsPrM\=;scram-sha-512\:BYGcIAwWxVY9DHn42kHydivyU3s9LSPmyfPPJkIFYyt/XsMASFHGoy5rzk4ahX4HjpJgb+NjdCwhGfi33CY0azUIrn439s62Yg5mq9i+ISto;digest-md5\:AgR1c2VyB2RlZmF1bHSYYyzPjRDR7MhrsdFSK03P;digest-sha\:AgR1c2VyB2RlZmF1bHTga5gDNnNYh7/2HqhBVOdUHjBzhw\=\=;digest-sha-256\:AgR1c2VyB2RlZmF1bHTig5qZQIxqtJBTUp3EMh5UIFoS4qOhz9Uk5aOW9ZKCfw\=\=;digest-sha-384\:AgR1c2VyB2RlZmF1bHT01pAN/pRMLS5afm4Q9S0kuLlA0NokuP8F0AISTwXCb1E8RMsFHlBVPOa5rC6Nyso\=;digest-sha-512\:AgR1c2VyB2RlZmF1bHTi+cHn1Ez2Ze41CvPXb9eP/7JmRys7m1f5qPMQWhAmDOuuUXNWEG4yKSI9k2EZgQvMKTd5hDbR24ul1BsYP8X5;

View File

@@ -0,0 +1,16 @@
version: "3.7"
services:
infinispan:
image: quay.io/infinispan/server:15.0
ports:
- '11222:11222'
- '11232:11232'
- '11242:11242'
deploy:
resources:
limits:
memory: 25Gb
volumes:
- ./conf:/user-config
command: -c /user-config/infinispan.xml

View File

@@ -17,6 +17,17 @@ from tests.integration_tests.vectorstores.fake_embeddings import (
)
def _strip_docs(documents: List[Document]) -> List[Document]:
return [_strip_doc(doc) for doc in documents]
def _strip_doc(document: Document) -> Document:
return Document(
page_content=document.page_content,
metadata=document.metadata,
)
def _vectorstore_from_texts(
texts: List[str],
metadatas: Optional[List[dict]] = None,
@@ -110,9 +121,9 @@ async def test_cassandra() -> None:
texts = ["foo", "bar", "baz"]
docsearch = _vectorstore_from_texts(texts)
output = docsearch.similarity_search("foo", k=1)
assert output == [Document(page_content="foo")]
assert _strip_docs(output) == _strip_docs([Document(page_content="foo")])
output = await docsearch.asimilarity_search("foo", k=1)
assert output == [Document(page_content="foo")]
assert _strip_docs(output) == _strip_docs([Document(page_content="foo")])
async def test_cassandra_with_score() -> None:
@@ -130,13 +141,13 @@ async def test_cassandra_with_score() -> None:
output = docsearch.similarity_search_with_score("foo", k=3)
docs = [o[0] for o in output]
scores = [o[1] for o in output]
assert docs == expected_docs
assert _strip_docs(docs) == _strip_docs(expected_docs)
assert scores[0] > scores[1] > scores[2]
output = await docsearch.asimilarity_search_with_score("foo", k=3)
docs = [o[0] for o in output]
scores = [o[1] for o in output]
assert docs == expected_docs
assert _strip_docs(docs) == _strip_docs(expected_docs)
assert scores[0] > scores[1] > scores[2]
@@ -239,7 +250,7 @@ async def test_cassandra_no_drop_async() -> None:
def test_cassandra_delete() -> None:
"""Test delete methods from vector store."""
texts = ["foo", "bar", "baz", "gni"]
metadatas = [{"page": i} for i in range(len(texts))]
metadatas = [{"page": i, "mod2": i % 2} for i in range(len(texts))]
docsearch = _vectorstore_from_texts([], metadatas=metadatas)
ids = docsearch.add_texts(texts, metadatas)
@@ -263,11 +274,21 @@ def test_cassandra_delete() -> None:
output = docsearch.similarity_search("foo", k=10)
assert len(output) == 0
docsearch.add_texts(texts, metadatas)
num_deleted = docsearch.delete_by_metadata_filter({"mod2": 0}, batch_size=1)
assert num_deleted == 2
output = docsearch.similarity_search("foo", k=10)
assert len(output) == 2
docsearch.clear()
with pytest.raises(ValueError):
docsearch.delete_by_metadata_filter({})
async def test_cassandra_adelete() -> None:
"""Test delete methods from vector store."""
texts = ["foo", "bar", "baz", "gni"]
metadatas = [{"page": i} for i in range(len(texts))]
metadatas = [{"page": i, "mod2": i % 2} for i in range(len(texts))]
docsearch = await _vectorstore_from_texts_async([], metadatas=metadatas)
ids = await docsearch.aadd_texts(texts, metadatas)
@@ -291,6 +312,16 @@ async def test_cassandra_adelete() -> None:
output = docsearch.similarity_search("foo", k=10)
assert len(output) == 0
await docsearch.aadd_texts(texts, metadatas)
num_deleted = await docsearch.adelete_by_metadata_filter({"mod2": 0}, batch_size=1)
assert num_deleted == 2
output = await docsearch.asimilarity_search("foo", k=10)
assert len(output) == 2
await docsearch.aclear()
with pytest.raises(ValueError):
await docsearch.adelete_by_metadata_filter({})
def test_cassandra_metadata_indexing() -> None:
"""Test comparing metadata indexing policies."""
@@ -316,3 +347,107 @@ def test_cassandra_metadata_indexing() -> None:
with pytest.raises(ValueError):
# "Non-indexed metadata fields cannot be used in queries."
vstore_f1.similarity_search("bar", filter={"field2": "b"}, k=2)
def test_cassandra_replace_metadata() -> None:
"""Test of replacing metadata."""
N_DOCS = 100
REPLACE_RATIO = 2 # one in ... will have replaced metadata
BATCH_SIZE = 3
vstore_f1 = _vectorstore_from_texts(
texts=[],
metadata_indexing=("allowlist", ["field1", "field2"]),
table_name="vector_test_table_indexing",
)
orig_documents = [
Document(
page_content=f"doc_{doc_i}",
id=f"doc_id_{doc_i}",
metadata={"field1": f"f1_{doc_i}", "otherf": "pre"},
)
for doc_i in range(N_DOCS)
]
vstore_f1.add_documents(orig_documents)
ids_to_replace = [
f"doc_id_{doc_i}" for doc_i in range(N_DOCS) if doc_i % REPLACE_RATIO == 0
]
# various kinds of replacement at play here:
def _make_new_md(mode: int, doc_id: str) -> dict[str, str]:
if mode == 0:
return {}
elif mode == 1:
return {"field2": f"NEW_{doc_id}"}
elif mode == 2:
return {"field2": f"NEW_{doc_id}", "ofherf2": "post"}
else:
return {"ofherf2": "post"}
ids_to_new_md = {
doc_id: _make_new_md(rep_i % 4, doc_id)
for rep_i, doc_id in enumerate(ids_to_replace)
}
vstore_f1.replace_metadata(ids_to_new_md, batch_size=BATCH_SIZE)
# thorough check
expected_id_to_metadata: dict[str, dict] = {
**{(document.id or ""): document.metadata for document in orig_documents},
**ids_to_new_md,
}
for hit in vstore_f1.similarity_search("doc", k=N_DOCS + 1):
assert hit.id is not None
assert hit.metadata == expected_id_to_metadata[hit.id]
async def test_cassandra_areplace_metadata() -> None:
"""Test of replacing metadata."""
N_DOCS = 100
REPLACE_RATIO = 2 # one in ... will have replaced metadata
BATCH_SIZE = 3
vstore_f1 = _vectorstore_from_texts(
texts=[],
metadata_indexing=("allowlist", ["field1", "field2"]),
table_name="vector_test_table_indexing",
)
orig_documents = [
Document(
page_content=f"doc_{doc_i}",
id=f"doc_id_{doc_i}",
metadata={"field1": f"f1_{doc_i}", "otherf": "pre"},
)
for doc_i in range(N_DOCS)
]
await vstore_f1.aadd_documents(orig_documents)
ids_to_replace = [
f"doc_id_{doc_i}" for doc_i in range(N_DOCS) if doc_i % REPLACE_RATIO == 0
]
# various kinds of replacement at play here:
def _make_new_md(mode: int, doc_id: str) -> dict[str, str]:
if mode == 0:
return {}
elif mode == 1:
return {"field2": f"NEW_{doc_id}"}
elif mode == 2:
return {"field2": f"NEW_{doc_id}", "ofherf2": "post"}
else:
return {"ofherf2": "post"}
ids_to_new_md = {
doc_id: _make_new_md(rep_i % 4, doc_id)
for rep_i, doc_id in enumerate(ids_to_replace)
}
await vstore_f1.areplace_metadata(ids_to_new_md, concurrency=BATCH_SIZE)
# thorough check
expected_id_to_metadata: dict[str, dict] = {
**{(document.id or ""): document.metadata for document in orig_documents},
**ids_to_new_md,
}
for hit in await vstore_f1.asimilarity_search("doc", k=N_DOCS + 1):
assert hit.id is not None
assert hit.metadata == expected_id_to_metadata[hit.id]

View File

@@ -1,7 +1,9 @@
"""Test Infinispan functionality."""
import warnings
from typing import Any, List, Optional
import httpx
import pytest
from langchain_core.documents import Document
@@ -11,9 +13,18 @@ from tests.integration_tests.vectorstores.fake_embeddings import (
fake_texts,
)
"""
cd tests/integration_tests/vectorstores/docker-compose
./infinispan.sh
def _infinispan_setup_noautoconf() -> None:
ispnvs = InfinispanVS(auto_config=False)
Current Infinispan implementation relies on httpx: `pip install "httpx[http2]"`
if not installed. HTTP/2 is enabled by default; if it's not
wanted use `pip install "httpx"`.
"""
def _infinispan_setup_noautoconf(**kwargs: Any) -> None:
ispnvs = InfinispanVS(http2=_hasHttp2(), auto_config=False, **kwargs)
ispnvs.cache_delete()
ispnvs.schema_delete()
proto = """
@@ -54,64 +65,104 @@ def _infinispanvs_from_texts(
ids=ids,
clear_old=clear_old,
auto_config=auto_config,
http2=_hasHttp2(),
**kwargs,
)
def _hasHttp2() -> bool:
try:
httpx.Client(http2=True)
return True
except Exception:
return False
@pytest.mark.parametrize("autoconfig", [False, True])
@pytest.mark.parametrize(
"conn_opts",
[
{},
{
"user": "user",
"password": "password",
"hosts": ["localhost:11232"],
"schema": "http",
},
{
"user": "user",
"password": "password",
"hosts": ["localhost:11242"],
"schema": "https",
"verify": False,
},
],
)
class TestBasic:
def test_infinispan(self, autoconfig: bool) -> None:
def test_infinispan(self, autoconfig: bool, conn_opts: dict) -> None:
"""Test end to end construction and search."""
if not autoconfig:
_infinispan_setup_noautoconf()
docsearch = _infinispanvs_from_texts(auto_config=autoconfig)
_infinispan_setup_noautoconf(**conn_opts)
docsearch = _infinispanvs_from_texts(auto_config=autoconfig, **conn_opts)
output = docsearch.similarity_search("foo", k=1)
assert output == [Document(page_content="foo")]
def test_infinispan_with_metadata(self, autoconfig: bool) -> None:
def test_infinispan_with_auth(self, autoconfig: bool, conn_opts: dict) -> None:
"""Test end to end construction and search."""
if not autoconfig:
_infinispan_setup_noautoconf(**conn_opts)
docsearch = _infinispanvs_from_texts(auto_config=autoconfig, **conn_opts)
output = docsearch.similarity_search("foo", k=1)
assert output == [Document(page_content="foo")]
def test_infinispan_with_metadata(self, autoconfig: bool, conn_opts: dict) -> None:
"""Test with metadata"""
if not autoconfig:
_infinispan_setup_noautoconf()
_infinispan_setup_noautoconf(**conn_opts)
meta = []
for _ in range(len(fake_texts)):
meta.append({"label": "test"})
docsearch = _infinispanvs_from_texts(metadatas=meta, auto_config=autoconfig)
docsearch = _infinispanvs_from_texts(
metadatas=meta, auto_config=autoconfig, **conn_opts
)
output = docsearch.similarity_search("foo", k=1)
assert output == [Document(page_content="foo", metadata={"label": "test"})]
def test_infinispan_with_metadata_with_output_fields(
self, autoconfig: bool
self, autoconfig: bool, conn_opts: dict
) -> None:
"""Test with metadata"""
if not autoconfig:
_infinispan_setup_noautoconf()
_infinispan_setup_noautoconf(**conn_opts)
metadatas = [
{"page": i, "label": "label" + str(i)} for i in range(len(fake_texts))
]
c = {"output_fields": ["label", "page", "text"]}
docsearch = _infinispanvs_from_texts(
metadatas=metadatas, configuration=c, auto_config=autoconfig
metadatas=metadatas, configuration=c, auto_config=autoconfig, **conn_opts
)
output = docsearch.similarity_search("foo", k=1)
assert output == [
Document(page_content="foo", metadata={"label": "label0", "page": 0})
]
def test_infinispanvs_with_id(self, autoconfig: bool) -> None:
def test_infinispanvs_with_id(self, autoconfig: bool, conn_opts: dict) -> None:
"""Test with ids"""
ids = ["id_" + str(i) for i in range(len(fake_texts))]
docsearch = _infinispanvs_from_texts(ids=ids, auto_config=autoconfig)
docsearch = _infinispanvs_from_texts(
ids=ids, auto_config=autoconfig, **conn_opts
)
output = docsearch.similarity_search("foo", k=1)
assert output == [Document(page_content="foo")]
def test_infinispan_with_score(self, autoconfig: bool) -> None:
def test_infinispan_with_score(self, autoconfig: bool, conn_opts: dict) -> None:
"""Test end to end construction and search with scores and IDs."""
if not autoconfig:
_infinispan_setup_noautoconf()
_infinispan_setup_noautoconf(**conn_opts)
texts = ["foo", "bar", "baz"]
metadatas = [{"page": i} for i in range(len(texts))]
docsearch = _infinispanvs_from_texts(
metadatas=metadatas, auto_config=autoconfig
metadatas=metadatas, auto_config=autoconfig, **conn_opts
)
output = docsearch.similarity_search_with_score("foo", k=3)
docs = [o[0] for o in output]
@@ -123,14 +174,14 @@ class TestBasic:
]
assert scores[0] >= scores[1] >= scores[2]
def test_infinispan_add_texts(self, autoconfig: bool) -> None:
def test_infinispan_add_texts(self, autoconfig: bool, conn_opts: dict) -> None:
"""Test end to end construction and MRR search."""
if not autoconfig:
_infinispan_setup_noautoconf()
_infinispan_setup_noautoconf(**conn_opts)
texts = ["foo", "bar", "baz"]
metadatas = [{"page": i} for i in range(len(texts))]
docsearch = _infinispanvs_from_texts(
metadatas=metadatas, auto_config=autoconfig
metadatas=metadatas, auto_config=autoconfig, **conn_opts
)
docsearch.add_texts(texts, metadatas)
@@ -138,19 +189,22 @@ class TestBasic:
output = docsearch.similarity_search("foo", k=10)
assert len(output) == 6
def test_infinispan_no_clear_old(self, autoconfig: bool) -> None:
def test_infinispan_no_clear_old(self, autoconfig: bool, conn_opts: dict) -> None:
"""Test end to end construction and MRR search."""
if not autoconfig:
_infinispan_setup_noautoconf()
_infinispan_setup_noautoconf(**conn_opts)
texts = ["foo", "bar", "baz"]
metadatas = [{"page": i} for i in range(len(texts))]
docsearch = _infinispanvs_from_texts(
metadatas=metadatas, auto_config=autoconfig
metadatas=metadatas, auto_config=autoconfig, **conn_opts
)
del docsearch
try:
docsearch = _infinispanvs_from_texts(
metadatas=metadatas, clear_old=False, auto_config=autoconfig
metadatas=metadatas,
clear_old=False,
auto_config=autoconfig,
**conn_opts,
)
except AssertionError:
if autoconfig:
@@ -159,3 +213,12 @@ class TestBasic:
raise
output = docsearch.similarity_search("foo", k=10)
assert len(output) == 6
class TestHttp2:
def test_http2(self) -> None:
try:
httpx.Client(http2=True)
except Exception:
warnings.warn('pip install "httpx[http2]" if you need HTTP/2')
pass

View File

@@ -33,9 +33,12 @@ def test_anthropic_model_kwargs() -> None:
@pytest.mark.requires("anthropic")
def test_anthropic_invalid_model_kwargs() -> None:
with pytest.raises(ValueError):
ChatAnthropic(model_kwargs={"max_tokens_to_sample": 5})
def test_anthropic_fields_in_model_kwargs() -> None:
"""Test that for backwards compatibility fields can be passed in as model_kwargs."""
llm = ChatAnthropic(model_kwargs={"max_tokens_to_sample": 5})
assert llm.max_tokens_to_sample == 5
llm = ChatAnthropic(model_kwargs={"max_tokens": 5})
assert llm.max_tokens_to_sample == 5
@pytest.mark.requires("anthropic")

View File

@@ -35,6 +35,7 @@ EXPECTED_ALL = [
"ChatPerplexity",
"ChatPremAI",
"ChatSambaNovaCloud",
"ChatSambaStudio",
"ChatSparkLLM",
"ChatTongyi",
"ChatVertexAI",

View File

@@ -26,13 +26,12 @@ def test_openai_model_kwargs() -> None:
@pytest.mark.requires("openai")
def test_openai_invalid_model_kwargs() -> None:
with pytest.raises(ValueError):
OpenAI(model_kwargs={"model_name": "foo"})
# Test that "model" cannot be specified in kwargs
with pytest.raises(ValueError):
OpenAI(model_kwargs={"model": "gpt-3.5-turbo-instruct"})
def test_openai_fields_model_kwargs() -> None:
"""Test that for backwards compatibility fields can be passed in as model_kwargs."""
llm = OpenAI(model_kwargs={"model_name": "foo"}, api_key="foo")
assert llm.model_name == "foo"
llm = OpenAI(model_kwargs={"model": "foo"}, api_key="foo")
assert llm.model_name == "foo"
@pytest.mark.requires("openai")

View File

@@ -46,7 +46,7 @@ lint lint_diff lint_package lint_tests:
format format_diff:
[ "$(PYTHON_FILES)" = "" ] || poetry run ruff format $(PYTHON_FILES)
[ "$(PYTHON_FILES)" = "" ] || poetry run ruff check --select I --fix $(PYTHON_FILES)
[ "$(PYTHON_FILES)" = "" ] || poetry run ruff check --fix $(PYTHON_FILES)
spell_check:
poetry run codespell --toml pyproject.toml

View File

@@ -51,15 +51,18 @@ def _validate_deprecation_params(
) -> None:
"""Validate the deprecation parameters."""
if pending and removal:
raise ValueError("A pending deprecation cannot have a scheduled removal")
msg = "A pending deprecation cannot have a scheduled removal"
raise ValueError(msg)
if alternative and alternative_import:
raise ValueError("Cannot specify both alternative and alternative_import")
msg = "Cannot specify both alternative and alternative_import"
raise ValueError(msg)
if alternative_import and "." not in alternative_import:
raise ValueError(
msg = (
"alternative_import must be a fully qualified module path. Got "
f" {alternative_import}"
)
raise ValueError(msg)
def deprecated(
@@ -222,7 +225,8 @@ def deprecated(
if not _obj_type:
_obj_type = "attribute"
if not _name:
raise ValueError(f"Field {obj} must have a name to be deprecated.")
msg = f"Field {obj} must have a name to be deprecated."
raise ValueError(msg)
old_doc = obj.description
def finalize(wrapper: Callable[..., Any], new_doc: str) -> T:
@@ -241,7 +245,8 @@ def deprecated(
if not _obj_type:
_obj_type = "attribute"
if not _name:
raise ValueError(f"Field {obj} must have a name to be deprecated.")
msg = f"Field {obj} must have a name to be deprecated."
raise ValueError(msg)
old_doc = obj.description
def finalize(wrapper: Callable[..., Any], new_doc: str) -> T:
@@ -428,10 +433,11 @@ def warn_deprecated(
if not pending:
if not removal:
removal = f"in {removal}" if removal else "within ?? minor releases"
raise NotImplementedError(
msg = (
f"Need to determine which default deprecation schedule to use. "
f"{removal}"
)
raise NotImplementedError(msg)
else:
removal = f"in {removal}"
@@ -523,9 +529,8 @@ def rename_parameter(
@functools.wraps(f)
def wrapper(*args: _P.args, **kwargs: _P.kwargs) -> _R:
if new in kwargs and old in kwargs:
raise TypeError(
f"{f.__name__}() got multiple values for argument {new!r}"
)
msg = f"{f.__name__}() got multiple values for argument {new!r}"
raise TypeError(msg)
if old in kwargs:
warn_deprecated(
since,
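This hunk, and most of the hunks that follow, apply the same mechanical refactor: the exception message is bound to a local ``msg`` variable and the variable is raised, instead of building the string inside the ``raise`` expression. This is the style requested by ruff's flake8-errmsg checks (EM101/EM102), which keep long message literals out of the traceback line. A before/after sketch of the pattern on a throwaway function:

.. code-block:: python

    # Before: ruff EM102 flags the f-string built inside the raise statement.
    def set_age(age: int) -> int:
        if age < 0:
            raise ValueError(f"age must be non-negative, got {age}")
        return age


    # After: the message lives in a variable, so a traceback shows
    # ``raise ValueError(msg)`` instead of repeating the full literal.
    def set_age_checked(age: int) -> int:
        if age < 0:
            msg = f"age must be non-negative, got {age}"
            raise ValueError(msg)
        return age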

View File

@@ -59,7 +59,8 @@ def _key_from_id(id_: str) -> str:
elif wout_prefix.endswith(CONTEXT_CONFIG_SUFFIX_SET):
return wout_prefix[: -len(CONTEXT_CONFIG_SUFFIX_SET)]
else:
raise ValueError(f"Invalid context config id {id_}")
msg = f"Invalid context config id {id_}"
raise ValueError(msg)
def _config_with_context(
@@ -103,16 +104,15 @@ def _config_with_context(
for dep in deps_by_key[key]:
if key in deps_by_key[dep]:
raise ValueError(
f"Deadlock detected between context keys {key} and {dep}"
)
msg = f"Deadlock detected between context keys {key} and {dep}"
raise ValueError(msg)
if len(setters) != 1:
raise ValueError(f"Expected exactly one setter for context key {key}")
msg = f"Expected exactly one setter for context key {key}"
raise ValueError(msg)
setter_idx = setters[0][1]
if any(getter_idx < setter_idx for _, getter_idx in getters):
raise ValueError(
f"Context setter for key {key} must be defined after all getters."
)
msg = f"Context setter for key {key} must be defined after all getters."
raise ValueError(msg)
if getters:
context_funcs[getters[0][0].id] = partial(getter, events[key], values)
@@ -271,9 +271,8 @@ class ContextSet(RunnableSerializable):
if spec.id.endswith(CONTEXT_CONFIG_SUFFIX_GET):
getter_key = spec.id.split("/")[1]
if getter_key in self.keys:
raise ValueError(
f"Circular reference in context setter for key {getter_key}"
)
msg = f"Circular reference in context setter for key {getter_key}"
raise ValueError(msg)
return super().config_specs + [
ConfigurableFieldSpec(
id=id_,

View File

@@ -160,7 +160,8 @@ class InMemoryCache(BaseCache):
"""
self._cache: dict[tuple[str, str], RETURN_VAL_TYPE] = {}
if maxsize is not None and maxsize <= 0:
raise ValueError("maxsize must be greater than 0")
msg = "maxsize must be greater than 0"
raise ValueError(msg)
self._maxsize = maxsize
def lookup(self, prompt: str, llm_string: str) -> Optional[RETURN_VAL_TYPE]:

View File

@@ -275,9 +275,8 @@ class CallbackManagerMixin:
"""
# NotImplementedError is thrown intentionally
# Callback handler will fall back to on_llm_start if this exception is thrown
raise NotImplementedError(
f"{self.__class__.__name__} does not implement `on_chat_model_start`"
)
msg = f"{self.__class__.__name__} does not implement `on_chat_model_start`"
raise NotImplementedError(msg)
def on_retriever_start(
self,
@@ -523,9 +522,8 @@ class AsyncCallbackHandler(BaseCallbackHandler):
"""
# NotImplementedError is thrown intentionally
# Callback handler will fall back to on_llm_start if this exception is thrown
raise NotImplementedError(
f"{self.__class__.__name__} does not implement `on_chat_model_start`"
)
msg = f"{self.__class__.__name__} does not implement `on_chat_model_start`"
raise NotImplementedError(msg)
async def on_llm_new_token(
self,

View File

@@ -1510,11 +1510,12 @@ class CallbackManager(BaseCallbackManager):
.. versionadded:: 0.2.14
"""
if kwargs:
raise ValueError(
msg = (
"The dispatcher API does not accept additional keyword arguments."
"Please do not pass any additional keyword arguments, instead "
"include them in the data field."
)
raise ValueError(msg)
if run_id is None:
run_id = uuid.uuid4()
@@ -1729,7 +1730,12 @@ class AsyncCallbackManager(BaseCallbackManager):
to each prompt.
"""
tasks = []
inline_tasks = []
non_inline_tasks = []
inline_handlers = [handler for handler in self.handlers if handler.run_inline]
non_inline_handlers = [
handler for handler in self.handlers if not handler.run_inline
]
managers = []
for prompt in prompts:
@@ -1739,20 +1745,36 @@ class AsyncCallbackManager(BaseCallbackManager):
else:
run_id_ = uuid.uuid4()
tasks.append(
ahandle_event(
self.handlers,
"on_llm_start",
"ignore_llm",
serialized,
[prompt],
run_id=run_id_,
parent_run_id=self.parent_run_id,
tags=self.tags,
metadata=self.metadata,
**kwargs,
if inline_handlers:
inline_tasks.append(
ahandle_event(
inline_handlers,
"on_llm_start",
"ignore_llm",
serialized,
[prompt],
run_id=run_id_,
parent_run_id=self.parent_run_id,
tags=self.tags,
metadata=self.metadata,
**kwargs,
)
)
else:
non_inline_tasks.append(
ahandle_event(
non_inline_handlers,
"on_llm_start",
"ignore_llm",
serialized,
[prompt],
run_id=run_id_,
parent_run_id=self.parent_run_id,
tags=self.tags,
metadata=self.metadata,
**kwargs,
)
)
)
managers.append(
AsyncCallbackManagerForLLMRun(
@@ -1767,7 +1789,13 @@ class AsyncCallbackManager(BaseCallbackManager):
)
)
await asyncio.gather(*tasks)
# Run inline tasks sequentially
for inline_task in inline_tasks:
await inline_task
# Run non-inline tasks concurrently
if non_inline_tasks:
await asyncio.gather(*non_inline_tasks)
return managers
@@ -1791,7 +1819,8 @@ class AsyncCallbackManager(BaseCallbackManager):
async callback managers, one for each LLM Run
corresponding to each inner message list.
"""
tasks = []
inline_tasks = []
non_inline_tasks = []
managers = []
for message_list in messages:
@@ -1801,9 +1830,9 @@ class AsyncCallbackManager(BaseCallbackManager):
else:
run_id_ = uuid.uuid4()
tasks.append(
ahandle_event(
self.handlers,
for handler in self.handlers:
task = ahandle_event(
[handler],
"on_chat_model_start",
"ignore_chat_model",
serialized,
@@ -1814,7 +1843,10 @@ class AsyncCallbackManager(BaseCallbackManager):
metadata=self.metadata,
**kwargs,
)
)
if handler.run_inline:
inline_tasks.append(task)
else:
non_inline_tasks.append(task)
managers.append(
AsyncCallbackManagerForLLMRun(
@@ -1829,7 +1861,14 @@ class AsyncCallbackManager(BaseCallbackManager):
)
)
await asyncio.gather(*tasks)
# Run inline tasks sequentially
for task in inline_tasks:
await task
# Run non-inline tasks concurrently
if non_inline_tasks:
await asyncio.gather(*non_inline_tasks)
return managers
async def on_chain_start(
@@ -1951,11 +1990,12 @@ class AsyncCallbackManager(BaseCallbackManager):
run_id = uuid.uuid4()
if kwargs:
raise ValueError(
msg = (
"The dispatcher API does not accept additional keyword arguments."
"Please do not pass any additional keyword arguments, instead "
"include them in the data field."
)
raise ValueError(msg)
await ahandle_event(
self.handlers,
"on_custom_event",
@@ -2298,11 +2338,12 @@ def _configure(
if v1_tracing_enabled_ and not tracing_v2_enabled_:
# if both are enabled, can silently ignore the v1 tracer
raise RuntimeError(
msg = (
"Tracing using LangChainTracerV1 is no longer supported. "
"Please set the LANGCHAIN_TRACING_V2 environment variable to enable "
"tracing instead."
)
raise RuntimeError(msg)
tracer_project = _get_tracer_project()
debug = _get_debug()
@@ -2481,13 +2522,14 @@ async def adispatch_custom_event(
# within a tool or a lambda and have the metadata events associated
# with the parent run rather than have a new run id generated for each.
if callback_manager.parent_run_id is None:
raise RuntimeError(
msg = (
"Unable to dispatch an adhoc event without a parent run id."
"This function can only be called from within an existing run (e.g.,"
"inside a tool or a RunnableLambda or a RunnableGenerator.)"
"If you are doing that and still seeing this error, try explicitly"
"passing the config parameter to this function."
)
raise RuntimeError(msg)
await callback_manager.on_custom_event(
name,
@@ -2550,13 +2592,14 @@ def dispatch_custom_event(
# within a tool or a lambda and have the metadata events associated
# with the parent run rather than have a new run id generated for each.
if callback_manager.parent_run_id is None:
raise RuntimeError(
msg = (
"Unable to dispatch an adhoc event without a parent run id."
"This function can only be called from within an existing run (e.g.,"
"inside a tool or a RunnableLambda or a RunnableGenerator.)"
"If you are doing that and still seeing this error, try explicitly"
"passing the config parameter to this function."
)
raise RuntimeError(msg)
callback_manager.on_custom_event(
name,
data,
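The substantive change in this file is scheduling, not wording: handlers flagged ``run_inline`` now have their events awaited one at a time and in order, while the remaining handlers are still dispatched concurrently via ``asyncio.gather``. A self-contained sketch of that split (toy ``Handler`` objects standing in for real callback handlers and ``ahandle_event``):

.. code-block:: python

    import asyncio
    from dataclasses import dataclass


    @dataclass
    class Handler:
        name: str
        run_inline: bool

        async def on_event(self, payload: str) -> None:
            # Stand-in for dispatching one event to one handler.
            await asyncio.sleep(0)
            print(f"{self.name}: {payload}")


    async def dispatch(handlers: list[Handler], payload: str) -> None:
        inline_calls = [h.on_event(payload) for h in handlers if h.run_inline]
        other_calls = [h.on_event(payload) for h in handlers if not h.run_inline]
        # Inline handlers run sequentially, preserving order...
        for coro in inline_calls:
            await coro
        # ...everything else still runs concurrently.
        if other_calls:
            await asyncio.gather(*other_calls)


    asyncio.run(dispatch([Handler("a", True), Handler("b", False)], "on_llm_start"))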

View File

@@ -157,10 +157,11 @@ class BaseChatMessageHistory(ABC):
# method, so we should use it.
self.add_messages([message])
else:
raise NotImplementedError(
msg = (
"add_message is not implemented for this class. "
"Please implement add_message or add_messages."
)
raise NotImplementedError(msg)
def add_messages(self, messages: Sequence[BaseMessage]) -> None:
"""Add a list of messages.

View File

@@ -53,11 +53,12 @@ class BaseLoader(ABC): # noqa: B024
try:
from langchain_text_splitters import RecursiveCharacterTextSplitter
except ImportError as e:
raise ImportError(
msg = (
"Unable to import from langchain_text_splitters. Please specify "
"text_splitter or install langchain_text_splitters with "
"`pip install -U langchain-text-splitters`."
) from e
)
raise ImportError(msg) from e
_text_splitter: TextSplitter = RecursiveCharacterTextSplitter()
else:
@@ -71,9 +72,8 @@ class BaseLoader(ABC): # noqa: B024
"""A lazy loader for Documents."""
if type(self).load != BaseLoader.load:
return iter(self.load())
raise NotImplementedError(
f"{self.__class__.__name__} does not implement lazy_load()"
)
msg = f"{self.__class__.__name__} does not implement lazy_load()"
raise NotImplementedError(msg)
async def alazy_load(self) -> AsyncIterator[Document]:
"""A lazy loader for Documents."""

View File

@@ -142,7 +142,8 @@ class Blob(BaseMedia):
def check_blob_is_valid(cls, values: dict[str, Any]) -> Any:
"""Verify that either data or path is provided."""
if "data" not in values and "path" not in values:
raise ValueError("Either data or path must be provided")
msg = "Either data or path must be provided"
raise ValueError(msg)
return values
def as_string(self) -> str:
@@ -155,7 +156,8 @@ class Blob(BaseMedia):
elif isinstance(self.data, str):
return self.data
else:
raise ValueError(f"Unable to get string for blob {self}")
msg = f"Unable to get string for blob {self}"
raise ValueError(msg)
def as_bytes(self) -> bytes:
"""Read data as bytes."""
@@ -167,7 +169,8 @@ class Blob(BaseMedia):
with open(str(self.path), "rb") as f:
return f.read()
else:
raise ValueError(f"Unable to get bytes for blob {self}")
msg = f"Unable to get bytes for blob {self}"
raise ValueError(msg)
@contextlib.contextmanager
def as_bytes_io(self) -> Generator[Union[BytesIO, BufferedReader], None, None]:
@@ -178,7 +181,8 @@ class Blob(BaseMedia):
with open(str(self.path), "rb") as f:
yield f
else:
raise NotImplementedError(f"Unable to convert blob {self}")
msg = f"Unable to convert blob {self}"
raise NotImplementedError(msg)
@classmethod
def from_path(
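A short usage sketch of the data/path contract these checks enforce, assuming ``langchain_core`` is installed (``Blob.from_data`` builds an in-memory blob; the ``as_*`` accessors adapt between bytes, text and file-like views):

.. code-block:: python

    from langchain_core.documents.base import Blob

    blob = Blob.from_data(b"hello world", mime_type="text/plain")
    assert blob.as_string() == "hello world"
    assert blob.as_bytes() == b"hello world"

    with blob.as_bytes_io() as f:
        assert f.read() == b"hello world"

    # Constructing Blob() with neither ``data`` nor ``path`` trips the
    # check_blob_is_valid error shown above.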

View File

@@ -53,7 +53,7 @@ class FakeEmbeddings(Embeddings, BaseModel):
def _get_embedding(self) -> list[float]:
import numpy as np # type: ignore[import-not-found, import-untyped]
return list(np.random.normal(size=self.size))
return list(np.random.default_rng().normal(size=self.size))
def embed_documents(self, texts: list[str]) -> list[list[float]]:
return [self._get_embedding() for _ in texts]
@@ -109,8 +109,8 @@ class DeterministicFakeEmbedding(Embeddings, BaseModel):
import numpy as np # type: ignore[import-not-found, import-untyped]
# set the seed for the random generator
np.random.seed(seed)
return list(np.random.normal(size=self.size))
rng = np.random.default_rng(seed)
return list(rng.normal(size=self.size))
def _get_seed(self, text: str) -> int:
"""Get a seed for the random generator, using the hash of the text."""

View File

@@ -41,10 +41,11 @@ class OutputParserException(ValueError, LangChainException): # noqa: N818
):
super().__init__(error)
if send_to_llm and (observation is None or llm_output is None):
raise ValueError(
msg = (
"Arguments 'observation' & 'llm_output'"
" are required if 'send_to_llm' is True"
)
raise ValueError(msg)
self.observation = observation
self.llm_output = llm_output
self.send_to_llm = send_to_llm

View File

@@ -73,20 +73,22 @@ class _HashedDocument(Document):
for key in forbidden_keys:
if key in metadata:
raise ValueError(
msg = (
f"Metadata cannot contain key {key} as it "
f"is reserved for internal use."
)
raise ValueError(msg)
content_hash = str(_hash_string_to_uuid(content))
try:
metadata_hash = str(_hash_nested_dict_to_uuid(metadata))
except Exception as e:
raise ValueError(
msg = (
f"Failed to hash metadata: {e}. "
f"Please use a dict that can be serialized using json."
) from e
)
raise ValueError(msg) from e
values["content_hash"] = content_hash
values["metadata_hash"] = metadata_hash
@@ -154,10 +156,11 @@ def _get_source_id_assigner(
elif callable(source_id_key):
return source_id_key
else:
raise ValueError(
msg = (
f"source_id_key should be either None, a string or a callable. "
f"Got {source_id_key} of type {type(source_id_key)}."
)
raise ValueError(msg)
def _deduplicate_in_order(
@@ -198,6 +201,7 @@ def index(
source_id_key: Union[str, Callable[[Document], str], None] = None,
cleanup_batch_size: int = 1_000,
force_update: bool = False,
upsert_kwargs: Optional[dict[str, Any]] = None,
) -> IndexingResult:
"""Index data from the loader into the vector store.
@@ -249,6 +253,12 @@ def index(
force_update: Force update documents even if they are present in the
record manager. Useful if you are re-indexing with updated embeddings.
Default is False.
upsert_kwargs: Additional keyword arguments to pass to the add_documents
method of the VectorStore or the upsert method of the
DocumentIndex. For example, you can use this to
specify a custom vector_field:
upsert_kwargs={"vector_field": "embedding"}
.. versionadded:: 0.3.10
Returns:
Indexing result which contains information about how many documents
@@ -262,13 +272,15 @@ def index(
ValueError: If source_id_key is not None, but is not a string or callable.
"""
if cleanup not in {"incremental", "full", None}:
raise ValueError(
msg = (
f"cleanup should be one of 'incremental', 'full' or None. "
f"Got {cleanup}."
)
raise ValueError(msg)
if cleanup == "incremental" and source_id_key is None:
raise ValueError("Source id key is required when cleanup mode is incremental.")
msg = "Source id key is required when cleanup mode is incremental."
raise ValueError(msg)
destination = vector_store # Renaming internally for clarity
@@ -279,21 +291,24 @@ def index(
for method in methods:
if not hasattr(destination, method):
raise ValueError(
msg = (
f"Vectorstore {destination} does not have required method {method}"
)
raise ValueError(msg)
if type(destination).delete == VectorStore.delete:
# Checking if the vectorstore has overridden the default delete method
# implementation which just raises a NotImplementedError
raise ValueError("Vectorstore has not implemented the delete method")
msg = "Vectorstore has not implemented the delete method"
raise ValueError(msg)
elif isinstance(destination, DocumentIndex):
pass
else:
raise TypeError(
msg = (
f"Vectorstore should be either a VectorStore or a DocumentIndex. "
f"Got {type(destination)}."
)
raise TypeError(msg)
if isinstance(docs_source, BaseLoader):
try:
@@ -327,12 +342,13 @@ def index(
# If the cleanup mode is incremental, source ids are required.
for source_id, hashed_doc in zip(source_ids, hashed_docs):
if source_id is None:
raise ValueError(
msg = (
"Source ids are required when cleanup mode is incremental. "
f"Document that starts with "
f"content: {hashed_doc.page_content[:100]} was not assigned "
f"as source id."
)
raise ValueError(msg)
# source ids cannot be None after for loop above.
source_ids = cast(Sequence[str], source_ids) # type: ignore[assignment]
@@ -363,10 +379,16 @@ def index(
if docs_to_index:
if isinstance(destination, VectorStore):
destination.add_documents(
docs_to_index, ids=uids, batch_size=batch_size
docs_to_index,
ids=uids,
batch_size=batch_size,
**(upsert_kwargs or {}),
)
elif isinstance(destination, DocumentIndex):
destination.upsert(docs_to_index)
destination.upsert(
docs_to_index,
**(upsert_kwargs or {}),
)
num_added += len(docs_to_index) - len(seen_docs)
num_updated += len(seen_docs)
@@ -387,7 +409,8 @@ def index(
# mypy isn't good enough to determine that source ids cannot be None
# here due to a check that's happening above, so we check again.
if any(source_id is None for source_id in source_ids):
raise AssertionError("Source ids cannot be if cleanup=='incremental'.")
msg = "Source ids cannot be if cleanup=='incremental'."
raise AssertionError(msg)
indexed_source_ids = cast(
Sequence[str], [source_id_assigner(doc) for doc in docs_to_index]
@@ -438,6 +461,7 @@ async def aindex(
source_id_key: Union[str, Callable[[Document], str], None] = None,
cleanup_batch_size: int = 1_000,
force_update: bool = False,
upsert_kwargs: Optional[dict[str, Any]] = None,
) -> IndexingResult:
"""Async index data from the loader into the vector store.
@@ -480,6 +504,12 @@ async def aindex(
force_update: Force update documents even if they are present in the
record manager. Useful if you are re-indexing with updated embeddings.
Default is False.
upsert_kwargs: Additional keyword arguments to pass to the aadd_documents
method of the VectorStore or the aupsert method of the
DocumentIndex. For example, you can use this to
specify a custom vector_field:
upsert_kwargs={"vector_field": "embedding"}
.. versionadded:: 0.3.10
Returns:
Indexing result which contains information about how many documents
@@ -494,13 +524,15 @@ async def aindex(
"""
if cleanup not in {"incremental", "full", None}:
raise ValueError(
msg = (
f"cleanup should be one of 'incremental', 'full' or None. "
f"Got {cleanup}."
)
raise ValueError(msg)
if cleanup == "incremental" and source_id_key is None:
raise ValueError("Source id key is required when cleanup mode is incremental.")
msg = "Source id key is required when cleanup mode is incremental."
raise ValueError(msg)
destination = vector_store # Renaming internally for clarity
@@ -512,21 +544,24 @@ async def aindex(
for method in methods:
if not hasattr(destination, method):
raise ValueError(
msg = (
f"Vectorstore {destination} does not have required method {method}"
)
raise ValueError(msg)
if type(destination).adelete == VectorStore.adelete:
# Checking if the vectorstore has overridden the default delete method
# implementation which just raises a NotImplementedError
raise ValueError("Vectorstore has not implemented the delete method")
msg = "Vectorstore has not implemented the delete method"
raise ValueError(msg)
elif isinstance(destination, DocumentIndex):
pass
else:
raise TypeError(
msg = (
f"Vectorstore should be either a VectorStore or a DocumentIndex. "
f"Got {type(destination)}."
)
raise TypeError(msg)
async_doc_iterator: AsyncIterator[Document]
if isinstance(docs_source, BaseLoader):
try:
@@ -568,12 +603,13 @@ async def aindex(
# If the cleanup mode is incremental, source ids are required.
for source_id, hashed_doc in zip(source_ids, hashed_docs):
if source_id is None:
raise ValueError(
msg = (
"Source ids are required when cleanup mode is incremental. "
f"Document that starts with "
f"content: {hashed_doc.page_content[:100]} was not assigned "
f"as source id."
)
raise ValueError(msg)
# source ids cannot be None after for loop above.
source_ids = cast(Sequence[str], source_ids)
@@ -604,10 +640,16 @@ async def aindex(
if docs_to_index:
if isinstance(destination, VectorStore):
await destination.aadd_documents(
docs_to_index, ids=uids, batch_size=batch_size
docs_to_index,
ids=uids,
batch_size=batch_size,
**(upsert_kwargs or {}),
)
elif isinstance(destination, DocumentIndex):
await destination.aupsert(docs_to_index)
await destination.aupsert(
docs_to_index,
**(upsert_kwargs or {}),
)
num_added += len(docs_to_index) - len(seen_docs)
num_updated += len(seen_docs)
@@ -628,7 +670,8 @@ async def aindex(
# mypy isn't good enough to determine that source ids cannot be None
# here due to a check that's happening above, so we check again.
if any(source_id is None for source_id in source_ids):
raise AssertionError("Source ids cannot be if cleanup=='incremental'.")
msg = "Source ids cannot be if cleanup=='incremental'."
raise AssertionError(msg)
indexed_source_ids = cast(
Sequence[str], [source_id_assigner(doc) for doc in docs_to_index]
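The new ``upsert_kwargs`` parameter is forwarded verbatim to ``add_documents``/``upsert`` (and their async twins), which is how a store-specific option such as a custom vector field reaches the destination. A call-site sketch; ``docs``, ``record_manager`` and ``vector_store`` are placeholders for whatever loader, record manager and store you already use, and ``"embedding"`` is only an example field name:

.. code-block:: python

    # Call-site sketch only; docs, record_manager and vector_store are
    # placeholders, and "embedding" is an example field name your store
    # would have to accept via add_documents(**upsert_kwargs).
    from langchain_core.indexing import index

    result = index(
        docs,
        record_manager,
        vector_store,
        cleanup="incremental",
        source_id_key="source",
        upsert_kwargs={"vector_field": "embedding"},
    )
    print(result)  # IndexingResult: num_added / num_updated / num_skipped / num_deleted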

View File

@@ -290,11 +290,13 @@ class InMemoryRecordManager(RecordManager):
"""
if group_ids and len(keys) != len(group_ids):
raise ValueError("Length of keys must match length of group_ids")
msg = "Length of keys must match length of group_ids"
raise ValueError(msg)
for index, key in enumerate(keys):
group_id = group_ids[index] if group_ids else None
if time_at_least and time_at_least > self.get_time():
raise ValueError("time_at_least must be in the past")
msg = "time_at_least must be in the past"
raise ValueError(msg)
self.records[key] = {"group_id": group_id, "updated_at": self.get_time()}
async def aupdate(

View File

@@ -47,7 +47,8 @@ class InMemoryDocumentIndex(DocumentIndex):
def delete(self, ids: Optional[list[str]] = None, **kwargs: Any) -> DeleteResponse:
"""Delete by ID."""
if ids is None:
raise ValueError("IDs must be provided for deletion")
msg = "IDs must be provided for deletion"
raise ValueError(msg)
ok_ids = []

View File

@@ -60,11 +60,12 @@ def get_tokenizer() -> Any:
try:
from transformers import GPT2TokenizerFast # type: ignore[import]
except ImportError as e:
raise ImportError(
msg = (
"Could not import transformers python package. "
"This is needed in order to calculate get_token_ids. "
"Please install it with `pip install transformers`."
) from e
)
raise ImportError(msg) from e
# create a GPT-2 tokenizer instance
return GPT2TokenizerFast.from_pretrained("gpt2")
@@ -98,7 +99,7 @@ class BaseLanguageModel(
All language model wrappers inherited from BaseLanguageModel.
"""
cache: Union[BaseCache, bool, None] = None
cache: Union[BaseCache, bool, None] = Field(default=None, exclude=True)
"""Whether to cache the response.
* If true, will use the global cache.
@@ -236,7 +237,7 @@ class BaseLanguageModel(
"""Not implemented on this class."""
# Implement this on child class if there is a way of steering the model to
# generate responses that match a given schema.
raise NotImplementedError()
raise NotImplementedError
@deprecated("0.1.7", alternative="invoke", removal="1.0")
@abstractmethod

View File

@@ -89,7 +89,8 @@ def generate_from_stream(stream: Iterator[ChatGenerationChunk]) -> ChatResult:
if generation:
generation += list(stream)
if generation is None:
raise ValueError("No generations found in stream.")
msg = "No generations found in stream."
raise ValueError(msg)
return ChatResult(
generations=[
ChatGeneration(
@@ -265,10 +266,11 @@ class BaseChatModel(BaseLanguageModel[BaseMessage], ABC):
elif isinstance(input, Sequence):
return ChatPromptValue(messages=convert_to_messages(input))
else:
raise ValueError(
msg = (
f"Invalid input type {type(input)}. "
"Must be a PromptValue, str, or list of BaseMessages."
)
raise ValueError(msg)
def invoke(
self,
@@ -817,9 +819,8 @@ class BaseChatModel(BaseLanguageModel[BaseMessage], ABC):
elif self.cache is None:
pass
else:
raise ValueError(
"Asked to cache, but no cache found at `langchain.cache`."
)
msg = "Asked to cache, but no cache found at `langchain.cache`."
raise ValueError(msg)
# Apply the rate limiter after checking the cache, since
# we usually don't want to rate limit cache lookups, but
@@ -891,9 +892,8 @@ class BaseChatModel(BaseLanguageModel[BaseMessage], ABC):
elif self.cache is None:
pass
else:
raise ValueError(
"Asked to cache, but no cache found at `langchain.cache`."
)
msg = "Asked to cache, but no cache found at `langchain.cache`."
raise ValueError(msg)
# Apply the rate limiter after checking the cache, since
# we usually don't want to rate limit cache lookups, but
@@ -977,7 +977,7 @@ class BaseChatModel(BaseLanguageModel[BaseMessage], ABC):
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> Iterator[ChatGenerationChunk]:
raise NotImplementedError()
raise NotImplementedError
async def _astream(
self,
@@ -1020,7 +1020,8 @@ class BaseChatModel(BaseLanguageModel[BaseMessage], ABC):
if isinstance(generation, ChatGeneration):
return generation.message
else:
raise ValueError("Unexpected generation type")
msg = "Unexpected generation type"
raise ValueError(msg)
async def _call_async(
self,
@@ -1036,7 +1037,8 @@ class BaseChatModel(BaseLanguageModel[BaseMessage], ABC):
if isinstance(generation, ChatGeneration):
return generation.message
else:
raise ValueError("Unexpected generation type")
msg = "Unexpected generation type"
raise ValueError(msg)
@deprecated("0.1.7", alternative="invoke", removal="1.0")
def call_as_llm(
@@ -1053,7 +1055,8 @@ class BaseChatModel(BaseLanguageModel[BaseMessage], ABC):
if isinstance(result.content, str):
return result.content
else:
raise ValueError("Cannot use predict when output is not a string.")
msg = "Cannot use predict when output is not a string."
raise ValueError(msg)
@deprecated("0.1.7", alternative="invoke", removal="1.0")
def predict_messages(
@@ -1077,7 +1080,8 @@ class BaseChatModel(BaseLanguageModel[BaseMessage], ABC):
if isinstance(result.content, str):
return result.content
else:
raise ValueError("Cannot use predict when output is not a string.")
msg = "Cannot use predict when output is not a string."
raise ValueError(msg)
@deprecated("0.1.7", alternative="ainvoke", removal="1.0")
async def apredict_messages(
@@ -1108,7 +1112,7 @@ class BaseChatModel(BaseLanguageModel[BaseMessage], ABC):
],
**kwargs: Any,
) -> Runnable[LanguageModelInput, BaseMessage]:
raise NotImplementedError()
raise NotImplementedError
def with_structured_output(
self,
@@ -1220,7 +1224,8 @@ class BaseChatModel(BaseLanguageModel[BaseMessage], ABC):
# }
""" # noqa: E501
if kwargs:
raise ValueError(f"Received unsupported arguments {kwargs}")
msg = f"Received unsupported arguments {kwargs}"
raise ValueError(msg)
from langchain_core.output_parsers.openai_tools import (
JsonOutputKeyToolsParser,
@@ -1228,9 +1233,8 @@ class BaseChatModel(BaseLanguageModel[BaseMessage], ABC):
)
if self.bind_tools is BaseChatModel.bind_tools:
raise NotImplementedError(
"with_structured_output is not implemented for this model."
)
msg = "with_structured_output is not implemented for this model."
raise NotImplementedError(msg)
llm = self.bind_tools([schema], tool_choice="any")
if isinstance(schema, type) and is_basemodel_subclass(schema):
output_parser: OutputParserLike = PydanticToolsParser(
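The default ``with_structured_output`` shown above rejects stray kwargs, requires ``bind_tools`` to be overridden, then binds the schema as a single forced tool and parses the tool call back out. A usage sketch, assuming ``langchain-openai`` is installed and an API key is configured (``ChatOpenAI`` ships its own optimized override, but the calling pattern is the same):

.. code-block:: python

    from langchain_openai import ChatOpenAI  # assumes langchain-openai is installed
    from pydantic import BaseModel, Field


    class Person(BaseModel):
        """Information about a person."""

        name: str = Field(description="The person's name")
        age: int = Field(description="The person's age in years")


    llm = ChatOpenAI(model="gpt-4o-mini")
    structured_llm = llm.with_structured_output(Person)

    # Returns a Person instance parsed from the model's tool call.
    person = structured_llm.invoke("Anna is 31 years old.")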

View File

@@ -13,6 +13,7 @@ from langchain_core.callbacks import (
from langchain_core.language_models.chat_models import BaseChatModel, SimpleChatModel
from langchain_core.messages import AIMessage, AIMessageChunk, BaseMessage
from langchain_core.outputs import ChatGeneration, ChatGenerationChunk, ChatResult
from langchain_core.runnables import RunnableConfig
class FakeMessagesListChatModel(BaseChatModel):
@@ -128,6 +129,33 @@ class FakeListChatModel(SimpleChatModel):
def _identifying_params(self) -> dict[str, Any]:
return {"responses": self.responses}
# manually override batch to preserve batch ordering with no concurrency
def batch(
self,
inputs: list[Any],
config: Optional[Union[RunnableConfig, list[RunnableConfig]]] = None,
*,
return_exceptions: bool = False,
**kwargs: Any,
) -> list[BaseMessage]:
if isinstance(config, list):
return [self.invoke(m, c, **kwargs) for m, c in zip(inputs, config)]
return [self.invoke(m, config, **kwargs) for m in inputs]
async def abatch(
self,
inputs: list[Any],
config: Optional[Union[RunnableConfig, list[RunnableConfig]]] = None,
*,
return_exceptions: bool = False,
**kwargs: Any,
) -> list[BaseMessage]:
if isinstance(config, list):
# do not use an async iterator here because we need explicit ordering
return [await self.ainvoke(m, c, **kwargs) for m, c in zip(inputs, config)]
# do not use an async iterator here because we need explicit ordering
return [await self.ainvoke(m, config, **kwargs) for m in inputs]
class FakeChatModel(SimpleChatModel):
"""Fake Chat Model wrapper for testing purposes."""
@@ -210,18 +238,20 @@ class GenericFakeChatModel(BaseChatModel):
messages, stop=stop, run_manager=run_manager, **kwargs
)
if not isinstance(chat_result, ChatResult):
raise ValueError(
msg = (
f"Expected generate to return a ChatResult, "
f"but got {type(chat_result)} instead."
)
raise ValueError(msg)
message = chat_result.generations[0].message
if not isinstance(message, AIMessage):
raise ValueError(
msg = (
f"Expected invoke to return an AIMessage, "
f"but got {type(message)} instead."
)
raise ValueError(msg)
content = message.content
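The new ``batch``/``abatch`` overrides on ``FakeListChatModel`` deliberately skip the default Runnable concurrency so responses come back in input order, which is what tests built on the fake rely on. A quick usage sketch, assuming ``langchain_core`` is installed:

.. code-block:: python

    from langchain_core.language_models import FakeListChatModel

    fake = FakeListChatModel(responses=["first", "second", "third"])
    replies = fake.batch(["q1", "q2", "q3"])
    # The override invokes each input sequentially, so replies map to inputs in order.
    print([r.content for r in replies])  # ['first', 'second', 'third']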

View File

@@ -135,15 +135,17 @@ def _resolve_cache(cache: Union[BaseCache, bool, None]) -> Optional[BaseCache]:
elif cache is True:
llm_cache = get_llm_cache()
if llm_cache is None:
raise ValueError(
msg = (
"No global cache was configured. Use `set_llm_cache`."
"to set a global cache if you want to use a global cache."
"Otherwise either pass a cache object or set cache to False/None"
)
raise ValueError(msg)
elif cache is False:
llm_cache = None
else:
raise ValueError(f"Unsupported cache value {cache}")
msg = f"Unsupported cache value {cache}"
raise ValueError(msg)
return llm_cache
@@ -332,10 +334,11 @@ class BaseLLM(BaseLanguageModel[str], ABC):
elif isinstance(input, Sequence):
return ChatPromptValue(messages=convert_to_messages(input))
else:
raise ValueError(
msg = (
f"Invalid input type {type(input)}. "
"Must be a PromptValue, str, or list of BaseMessages."
)
raise ValueError(msg)
def _get_ls_params(
self,
@@ -695,7 +698,7 @@ class BaseLLM(BaseLanguageModel[str], ABC):
Returns:
An iterator of GenerationChunks.
"""
raise NotImplementedError()
raise NotImplementedError
async def _astream(
self,
@@ -842,10 +845,11 @@ class BaseLLM(BaseLanguageModel[str], ABC):
prompt and additional model provider-specific output.
"""
if not isinstance(prompts, list):
raise ValueError(
msg = (
"Argument 'prompts' is expected to be of type List[str], received"
f" argument of type {type(prompts)}."
)
raise ValueError(msg)
# Create callback managers
if isinstance(metadata, list):
metadata = [
@@ -989,10 +993,11 @@ class BaseLLM(BaseLanguageModel[str], ABC):
return [None] * len(prompts)
if isinstance(run_id, list):
if len(run_id) != len(prompts):
raise ValueError(
msg = (
"Number of manually provided run_id's does not match batch length."
f" {len(run_id)} != {len(prompts)}"
)
raise ValueError(msg)
return run_id
return [run_id] + [None] * (len(prompts) - 1)
@@ -1262,11 +1267,12 @@ class BaseLLM(BaseLanguageModel[str], ABC):
ValueError: If the prompt is not a string.
"""
if not isinstance(prompt, str):
raise ValueError(
msg = (
"Argument `prompt` is expected to be a string. Instead found "
f"{type(prompt)}. If you want to run the LLM on multiple prompts, use "
"`generate` instead."
)
raise ValueError(msg)
return (
self.generate(
[prompt],
@@ -1387,7 +1393,8 @@ class BaseLLM(BaseLanguageModel[str], ABC):
with open(file_path, "w") as f:
yaml.dump(prompt_dict, f, default_flow_style=False)
else:
raise ValueError(f"{save_path} must be json or yaml")
msg = f"{save_path} must be json or yaml"
raise ValueError(msg)
class LLM(BaseLLM):

View File

@@ -37,7 +37,8 @@ def dumps(obj: Any, *, pretty: bool = False, **kwargs: Any) -> str:
ValueError: If `default` is passed as a kwarg.
"""
if "default" in kwargs:
raise ValueError("`default` should not be passed to dumps")
msg = "`default` should not be passed to dumps"
raise ValueError(msg)
try:
if pretty:
indent = kwargs.pop("indent", 2)

View File

@@ -96,17 +96,19 @@ class Reviver:
else:
if self.secrets_from_env and key in os.environ and os.environ[key]:
return os.environ[key]
raise KeyError(f'Missing key "{key}" in load(secrets_map)')
msg = f'Missing key "{key}" in load(secrets_map)'
raise KeyError(msg)
if (
value.get("lc") == 1
and value.get("type") == "not_implemented"
and value.get("id") is not None
):
raise NotImplementedError(
msg = (
"Trying to load an object that doesn't implement "
f"serialization: {value}"
)
raise NotImplementedError(msg)
if (
value.get("lc") == 1
@@ -121,7 +123,8 @@ class Reviver:
# The root namespace ["langchain"] is not a valid identifier.
or namespace == ["langchain"]
):
raise ValueError(f"Invalid namespace: {value}")
msg = f"Invalid namespace: {value}"
raise ValueError(msg)
# Has explicit import path.
elif mapping_key in self.import_mappings:
import_path = self.import_mappings[mapping_key]
@@ -130,11 +133,12 @@ class Reviver:
# Import module
mod = importlib.import_module(".".join(import_dir))
elif namespace[0] in DISALLOW_LOAD_FROM_PATH:
raise ValueError(
msg = (
"Trying to deserialize something that cannot "
"be deserialized in current version of langchain-core: "
f"{mapping_key}."
)
raise ValueError(msg)
# Otherwise, treat namespace as path.
else:
mod = importlib.import_module(".".join(namespace))
@@ -143,7 +147,8 @@ class Reviver:
# The class must be a subclass of Serializable.
if not issubclass(cls, Serializable):
raise ValueError(f"Invalid namespace: {value}")
msg = f"Invalid namespace: {value}"
raise ValueError(msg)
# We don't need to recurse on kwargs
# as json.loads will do that for us.

View File

@@ -215,11 +215,12 @@ class Serializable(BaseModel, ABC):
for attr in deprecated_attributes:
if hasattr(cls, attr):
raise ValueError(
msg = (
f"Class {self.__class__} has a deprecated "
f"attribute {attr}. Please use the corresponding "
f"classmethod instead."
)
raise ValueError(msg)
# Get a reference to self bound to each class in the MRO
this = cast(Serializable, self if cls is None else super(cls, self))

View File

@@ -1,8 +1,9 @@
import json
from typing import Any, Literal, Optional, Union
import operator
from typing import Any, Literal, Optional, Union, cast
from pydantic import model_validator
from typing_extensions import Self, TypedDict
from typing_extensions import NotRequired, Self, TypedDict
from langchain_core.messages.base import (
BaseMessage,
@@ -27,6 +28,67 @@ from langchain_core.messages.tool import (
)
from langchain_core.utils._merge import merge_dicts, merge_lists
from langchain_core.utils.json import parse_partial_json
from langchain_core.utils.usage import _dict_int_op
class InputTokenDetails(TypedDict, total=False):
"""Breakdown of input token counts.
Does *not* need to sum to full input token count. Does *not* need to have all keys.
Example:
.. code-block:: python
{
"audio": 10,
"cache_creation": 200,
"cache_read": 100,
}
.. versionadded:: 0.3.9
"""
audio: int
"""Audio input tokens."""
cache_creation: int
"""Input tokens that were cached and there was a cache miss.
Since there was a cache miss, the cache was created from these tokens.
"""
cache_read: int
"""Input tokens that were cached and there was a cache hit.
Since there was a cache hit, the tokens were read from the cache. More precisely,
the model state given these tokens was read from the cache.
"""
class OutputTokenDetails(TypedDict, total=False):
"""Breakdown of output token counts.
Does *not* need to sum to full output token count. Does *not* need to have all keys.
Example:
.. code-block:: python
{
"audio": 10,
"reasoning": 200,
}
.. versionadded:: 0.3.9
"""
audio: int
"""Audio output tokens."""
reasoning: int
"""Reasoning output tokens.
Tokens generated by the model in a chain of thought process (i.e. by OpenAI's o1
models) that are not returned as part of model output.
"""
class UsageMetadata(TypedDict):
@@ -39,18 +101,41 @@ class UsageMetadata(TypedDict):
.. code-block:: python
{
"input_tokens": 10,
"output_tokens": 20,
"total_tokens": 30
"input_tokens": 350,
"output_tokens": 240,
"total_tokens": 590,
"input_token_details": {
"audio": 10,
"cache_creation": 200,
"cache_read": 100,
},
"output_token_details": {
"audio": 10,
"reasoning": 200,
}
}
.. versionchanged:: 0.3.9
Added ``input_token_details`` and ``output_token_details``.
"""
input_tokens: int
"""Count of input (or prompt) tokens."""
"""Count of input (or prompt) tokens. Sum of all input token types."""
output_tokens: int
"""Count of output (or completion) tokens."""
"""Count of output (or completion) tokens. Sum of all output token types."""
total_tokens: int
"""Total token count."""
"""Total token count. Sum of input_tokens + output_tokens."""
input_token_details: NotRequired[InputTokenDetails]
"""Breakdown of input token counts.
Does *not* need to sum to full input token count. Does *not* need to have all keys.
"""
output_token_details: NotRequired[OutputTokenDetails]
"""Breakdown of output token counts.
Does *not* need to sum to full output token count. Does *not* need to have all keys.
"""
class AIMessage(BaseMessage):
@@ -290,7 +375,8 @@ class AIMessageChunk(AIMessage, BaseMessageChunk):
)
)
else:
raise ValueError("Malformed args.")
msg = "Malformed args."
raise ValueError(msg)
except Exception:
invalid_tool_calls.append(
create_invalid_tool_call(
@@ -319,9 +405,8 @@ def add_ai_message_chunks(
) -> AIMessageChunk:
"""Add multiple AIMessageChunks together."""
if any(left.example != o.example for o in others):
raise ValueError(
"Cannot concatenate AIMessageChunks with different example values."
)
msg = "Cannot concatenate AIMessageChunks with different example values."
raise ValueError(msg)
content = merge_content(left.content, *(o.content for o in others))
additional_kwargs = merge_dicts(
@@ -349,17 +434,9 @@ def add_ai_message_chunks(
# Token usage
if left.usage_metadata or any(o.usage_metadata is not None for o in others):
usage_metadata_: UsageMetadata = left.usage_metadata or UsageMetadata(
input_tokens=0, output_tokens=0, total_tokens=0
)
usage_metadata: Optional[UsageMetadata] = left.usage_metadata
for other in others:
if other.usage_metadata is not None:
usage_metadata_["input_tokens"] += other.usage_metadata["input_tokens"]
usage_metadata_["output_tokens"] += other.usage_metadata[
"output_tokens"
]
usage_metadata_["total_tokens"] += other.usage_metadata["total_tokens"]
usage_metadata: Optional[UsageMetadata] = usage_metadata_
usage_metadata = add_usage(usage_metadata, other.usage_metadata)
else:
usage_metadata = None
@@ -372,3 +449,115 @@ def add_ai_message_chunks(
usage_metadata=usage_metadata,
id=left.id,
)
def add_usage(
left: Optional[UsageMetadata], right: Optional[UsageMetadata]
) -> UsageMetadata:
"""Recursively add two UsageMetadata objects.
Example:
.. code-block:: python
from langchain_core.messages.ai import add_usage
left = UsageMetadata(
input_tokens=5,
output_tokens=0,
total_tokens=5,
input_token_details=InputTokenDetails(cache_read=3)
)
right = UsageMetadata(
input_tokens=0,
output_tokens=10,
total_tokens=10,
output_token_details=OutputTokenDetails(reasoning=4)
)
add_usage(left, right)
results in
.. code-block:: python
UsageMetadata(
input_tokens=5,
output_tokens=10,
total_tokens=15,
input_token_details=InputTokenDetails(cache_read=3),
output_token_details=OutputTokenDetails(reasoning=4)
)
"""
if not (left or right):
return UsageMetadata(input_tokens=0, output_tokens=0, total_tokens=0)
if not (left and right):
return cast(UsageMetadata, left or right)
return UsageMetadata(
**cast(
UsageMetadata,
_dict_int_op(
cast(dict, left),
cast(dict, right),
operator.add,
),
)
)
def subtract_usage(
left: Optional[UsageMetadata], right: Optional[UsageMetadata]
) -> UsageMetadata:
"""Recursively subtract two UsageMetadata objects.
Token counts cannot be negative so the actual operation is max(left - right, 0).
Example:
.. code-block:: python
from langchain_core.messages.ai import subtract_usage
left = UsageMetadata(
input_tokens=5,
output_tokens=10,
total_tokens=15,
input_token_details=InputTokenDetails(cache_read=4)
)
right = UsageMetadata(
input_tokens=3,
output_tokens=8,
total_tokens=11,
output_token_details=OutputTokenDetails(reasoning=4)
)
subtract_usage(left, right)
results in
.. code-block:: python
UsageMetadata(
input_tokens=2,
output_tokens=2,
total_tokens=4,
input_token_details=InputTokenDetails(cache_read=4),
output_token_details=OutputTokenDetails(reasoning=0)
)
"""
if not (left or right):
return UsageMetadata(input_tokens=0, output_tokens=0, total_tokens=0)
if not (left and right):
return cast(UsageMetadata, left or right)
return UsageMetadata(
**cast(
UsageMetadata,
_dict_int_op(
cast(dict, left),
cast(dict, right),
(lambda le, ri: max(le - ri, 0)),
),
)
)
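Both helpers lean on ``_dict_int_op``, which, judging from its use here, applies an integer operation key-wise across two usage dicts, recurses into the nested ``*_token_details`` dicts, and treats missing keys as 0. An illustrative reimplementation of that merge, my own sketch rather than the actual ``langchain_core.utils.usage`` code, reproducing the ``add_usage`` docstring example:

.. code-block:: python

    import operator
    from typing import Callable


    def dict_int_op(left: dict, right: dict, op: Callable[[int, int], int]) -> dict:
        """Key-wise integer op over two (possibly nested) usage dicts."""
        result: dict = {}
        for key in set(left) | set(right):
            lv, rv = left.get(key, 0), right.get(key, 0)
            if isinstance(lv, int) and isinstance(rv, int):
                result[key] = op(lv, rv)
            else:  # nested detail dicts, e.g. input_token_details
                result[key] = dict_int_op(
                    lv if isinstance(lv, dict) else {},
                    rv if isinstance(rv, dict) else {},
                    op,
                )
        return result


    left = {"input_tokens": 5, "output_tokens": 0, "total_tokens": 5,
            "input_token_details": {"cache_read": 3}}
    right = {"input_tokens": 0, "output_tokens": 10, "total_tokens": 10,
             "output_token_details": {"reasoning": 4}}
    # Totals add up key-wise and the nested detail dicts are preserved.
    print(dict_int_op(left, right, operator.add))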

View File

@@ -223,11 +223,12 @@ class BaseMessageChunk(BaseMessage):
response_metadata=response_metadata,
)
else:
raise TypeError(
msg = (
'unsupported operand type(s) for +: "'
f"{self.__class__.__name__}"
f'" and "{other.__class__.__name__}"'
)
raise TypeError(msg)
def message_to_dict(message: BaseMessage) -> dict:

View File

@@ -48,9 +48,8 @@ class ChatMessageChunk(ChatMessage, BaseMessageChunk):
def __add__(self, other: Any) -> BaseMessageChunk: # type: ignore
if isinstance(other, ChatMessageChunk):
if self.role != other.role:
raise ValueError(
"Cannot concatenate ChatMessageChunks with different roles."
)
msg = "Cannot concatenate ChatMessageChunks with different roles."
raise ValueError(msg)
return self.__class__(
role=self.role,

View File

@@ -54,9 +54,8 @@ class FunctionMessageChunk(FunctionMessage, BaseMessageChunk):
def __add__(self, other: Any) -> BaseMessageChunk: # type: ignore
if isinstance(other, FunctionMessageChunk):
if self.name != other.name:
raise ValueError(
"Cannot concatenate FunctionMessageChunks with different names."
)
msg = "Cannot concatenate FunctionMessageChunks with different names."
raise ValueError(msg)
return self.__class__(
name=self.name,

View File

@@ -20,7 +20,8 @@ class RemoveMessage(BaseMessage):
ValueError: If the 'content' field is passed in kwargs.
"""
if kwargs.pop("content", None):
raise ValueError("RemoveMessage does not support 'content' field.")
msg = "RemoveMessage does not support 'content' field."
raise ValueError(msg)
return super().__init__("", id=id, **kwargs)

View File

@@ -94,11 +94,12 @@ class ToolMessage(BaseMessage):
try:
values["content"] = str(content)
except ValueError as e:
raise ValueError(
msg = (
"ToolMessage content should be a string or a list of string/dicts. "
f"Received:\n\n{content=}\n\n which could not be coerced into a "
"string."
) from e
)
raise ValueError(msg) from e
elif isinstance(content, list):
values["content"] = []
for i, x in enumerate(content):
@@ -106,12 +107,13 @@ class ToolMessage(BaseMessage):
try:
values["content"].append(str(x))
except ValueError as e:
raise ValueError(
msg = (
"ToolMessage content should be a string or a list of "
"string/dicts. Received a list but "
f"element ToolMessage.content[{i}] is not a dict and could "
f"not be coerced to a string.:\n\n{x}"
) from e
)
raise ValueError(msg) from e
else:
values["content"].append(x)
else:
@@ -147,9 +149,8 @@ class ToolMessageChunk(ToolMessage, BaseMessageChunk):
def __add__(self, other: Any) -> BaseMessageChunk: # type: ignore
if isinstance(other, ToolMessageChunk):
if self.tool_call_id != other.tool_call_id:
raise ValueError(
"Cannot concatenate ToolMessageChunks with different names."
)
msg = "Cannot concatenate ToolMessageChunks with different names."
raise ValueError(msg)
return self.__class__(
tool_call_id=self.tool_call_id,

View File

@@ -51,10 +51,11 @@ def _get_type(v: Any) -> str:
elif hasattr(v, "type"):
return v.type
else:
raise TypeError(
msg = (
f"Expected either a dictionary with a 'type' key or an object "
f"with a 'type' attribute. Instead got type {type(v)}."
)
raise TypeError(msg)
AnyMessage = Annotated[
@@ -120,7 +121,8 @@ def get_buffer_string(
elif isinstance(m, ChatMessage):
role = m.role
else:
raise ValueError(f"Got unsupported message type: {m}")
msg = f"Got unsupported message type: {m}"
raise ValueError(msg)
message = f"{role}: {m.content}"
if isinstance(m, AIMessage) and "function_call" in m.additional_kwargs:
message += f"{m.additional_kwargs['function_call']}"
@@ -158,7 +160,8 @@ def _message_from_dict(message: dict) -> BaseMessage:
elif _type == "ChatMessageChunk":
return ChatMessageChunk(**message["data"])
else:
raise ValueError(f"Got unexpected message type: {_type}")
msg = f"Got unexpected message type: {_type}"
raise ValueError(msg)
def messages_from_dict(messages: Sequence[dict]) -> list[BaseMessage]:
@@ -266,10 +269,11 @@ def _create_message_from_message_type(
elif message_type == "remove":
message = RemoveMessage(**kwargs)
else:
raise ValueError(
msg = (
f"Unexpected message type: '{message_type}'. Use one of 'human',"
f" 'user', 'ai', 'assistant', 'function', 'tool', or 'system'."
)
raise ValueError(msg)
return message
@@ -312,14 +316,14 @@ def _convert_to_message(message: MessageLikeRepresentation) -> BaseMessage:
# None msg content is not allowed
msg_content = msg_kwargs.pop("content") or ""
except KeyError as e:
raise ValueError(
f"Message dict must contain 'role' and 'content' keys, got {message}"
) from e
msg = f"Message dict must contain 'role' and 'content' keys, got {message}"
raise ValueError(msg) from e
_message = _create_message_from_message_type(
msg_type, msg_content, **msg_kwargs
)
else:
raise NotImplementedError(f"Unsupported message type: {type(message)}")
msg = f"Unsupported message type: {type(message)}"
raise NotImplementedError(msg)
return _message
@@ -820,11 +824,12 @@ def trim_messages(
else:
list_token_counter = token_counter # type: ignore[assignment]
else:
raise ValueError(
msg = (
f"'token_counter' expected to be a model that implements "
f"'get_num_tokens_from_messages()' or a function. Received object of type "
f"{type(token_counter)}."
)
raise ValueError(msg)
try:
from langchain_text_splitters import TextSplitter
@@ -859,9 +864,8 @@ def trim_messages(
text_splitter=text_splitter_fn,
)
else:
raise ValueError(
f"Unrecognized {strategy=}. Supported strategies are 'last' and 'first'."
)
msg = f"Unrecognized {strategy=}. Supported strategies are 'last' and 'first'."
raise ValueError(msg)
def _first_max_tokens(
@@ -995,10 +999,11 @@ def _msg_to_chunk(message: BaseMessage) -> BaseMessageChunk:
if isinstance(message, msg_cls):
return chunk_cls(**message.model_dump(exclude={"type"}))
raise ValueError(
msg = (
f"Unrecognized message class {message.__class__}. Supported classes are "
f"{list(_MSG_CHUNK_MAP.keys())}"
)
raise ValueError(msg)
def _chunk_to_msg(chunk: BaseMessageChunk) -> BaseMessage:
@@ -1010,10 +1015,11 @@ def _chunk_to_msg(chunk: BaseMessageChunk) -> BaseMessage:
if isinstance(chunk, chunk_cls):
return msg_cls(**chunk.model_dump(exclude={"type", "tool_call_chunks"}))
raise ValueError(
msg = (
f"Unrecognized message chunk class {chunk.__class__}. Supported classes are "
f"{list(_CHUNK_MSG_MAP.keys())}"
)
raise ValueError(msg)
def _default_text_splitter(text: str) -> list[str]:
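The ``token_counter`` validation above accepts either a model exposing ``get_num_tokens_from_messages`` or a plain callable. A usage sketch with the callable form, assuming ``langchain_core`` is installed; ``len`` simply counts one unit per message:

.. code-block:: python

    from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, trim_messages

    messages = [
        SystemMessage("you are a helpful assistant"),
        HumanMessage("hi"),
        AIMessage("hello!"),
        HumanMessage("how are you?"),
    ]

    # strategy="last" keeps the most recent messages that fit the budget;
    # token_counter=len counts one unit per message instead of real tokens.
    trimmed = trim_messages(
        messages,
        max_tokens=2,
        strategy="last",
        token_counter=len,
    )
    print([m.content for m in trimmed])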

View File

@@ -177,10 +177,11 @@ class BaseOutputParser(
if "args" in metadata and len(metadata["args"]) > 0:
return metadata["args"][0]
raise TypeError(
msg = (
f"Runnable {self.__class__.__name__} doesn't have an inferable OutputType. "
"Override the OutputType property to specify the output type."
)
raise TypeError(msg)
def invoke(
self,
@@ -310,10 +311,11 @@ class BaseOutputParser(
@property
def _type(self) -> str:
"""Return the output parser type for serialization."""
raise NotImplementedError(
msg = (
f"_type property is not implemented in class {self.__class__.__name__}."
" This is required for serialization."
)
raise NotImplementedError(msg)
def dict(self, **kwargs: Any) -> dict:
"""Return dictionary representation of output parser."""

View File

@@ -36,16 +36,14 @@ class OutputFunctionsParser(BaseGenerationOutputParser[Any]):
"""
generation = result[0]
if not isinstance(generation, ChatGeneration):
raise OutputParserException(
"This output parser can only be used with a chat generation."
)
msg = "This output parser can only be used with a chat generation."
raise OutputParserException(msg)
message = generation.message
try:
func_call = copy.deepcopy(message.additional_kwargs["function_call"])
except KeyError as exc:
raise OutputParserException(
f"Could not parse function call: {exc}"
) from exc
msg = f"Could not parse function call: {exc}"
raise OutputParserException(msg) from exc
if self.args_only:
return func_call["arguments"]
@@ -88,14 +86,12 @@ class JsonOutputFunctionsParser(BaseCumulativeTransformOutputParser[Any]):
"""
if len(result) != 1:
raise OutputParserException(
f"Expected exactly one result, but got {len(result)}"
)
msg = f"Expected exactly one result, but got {len(result)}"
raise OutputParserException(msg)
generation = result[0]
if not isinstance(generation, ChatGeneration):
raise OutputParserException(
"This output parser can only be used with a chat generation."
)
msg = "This output parser can only be used with a chat generation."
raise OutputParserException(msg)
message = generation.message
try:
function_call = message.additional_kwargs["function_call"]
@@ -103,9 +99,8 @@ class JsonOutputFunctionsParser(BaseCumulativeTransformOutputParser[Any]):
if partial:
return None
else:
raise OutputParserException(
f"Could not parse function call: {exc}"
) from exc
msg = f"Could not parse function call: {exc}"
raise OutputParserException(msg) from exc
try:
if partial:
try:
@@ -129,9 +124,8 @@ class JsonOutputFunctionsParser(BaseCumulativeTransformOutputParser[Any]):
function_call["arguments"], strict=self.strict
)
except (json.JSONDecodeError, TypeError) as exc:
raise OutputParserException(
f"Could not parse function call data: {exc}"
) from exc
msg = f"Could not parse function call data: {exc}"
raise OutputParserException(msg) from exc
else:
try:
return {
@@ -141,9 +135,8 @@ class JsonOutputFunctionsParser(BaseCumulativeTransformOutputParser[Any]):
),
}
except (json.JSONDecodeError, TypeError) as exc:
raise OutputParserException(
f"Could not parse function call data: {exc}"
) from exc
msg = f"Could not parse function call data: {exc}"
raise OutputParserException(msg) from exc
except KeyError:
return None
@@ -158,7 +151,7 @@ class JsonOutputFunctionsParser(BaseCumulativeTransformOutputParser[Any]):
Returns:
The parsed JSON object.
"""
raise NotImplementedError()
raise NotImplementedError
class JsonKeyOutputFunctionsParser(JsonOutputFunctionsParser):
@@ -253,10 +246,11 @@ class PydanticOutputFunctionsParser(OutputFunctionsParser):
and issubclass(schema, BaseModel)
)
elif values["args_only"] and isinstance(schema, dict):
raise ValueError(
msg = (
"If multiple pydantic schemas are provided then args_only should be"
" False."
)
raise ValueError(msg)
return values
def parse_result(self, result: list[Generation], *, partial: bool = False) -> Any:

View File

@@ -52,11 +52,12 @@ def parse_tool_call(
raw_tool_call["function"]["arguments"], strict=strict
)
except JSONDecodeError as e:
raise OutputParserException(
msg = (
f"Function {raw_tool_call['function']['name']} arguments:\n\n"
f"{raw_tool_call['function']['arguments']}\n\nare not valid JSON. "
f"Received JSONDecodeError {e}"
) from e
)
raise OutputParserException(msg) from e
parsed = {
"name": raw_tool_call["function"]["name"] or "",
"args": function_args or {},
@@ -170,9 +171,8 @@ class JsonOutputToolsParser(BaseCumulativeTransformOutputParser[Any]):
generation = result[0]
if not isinstance(generation, ChatGeneration):
raise OutputParserException(
"This output parser can only be used with a chat generation."
)
msg = "This output parser can only be used with a chat generation."
raise OutputParserException(msg)
message = generation.message
if isinstance(message, AIMessage) and message.tool_calls:
tool_calls = [dict(tc) for tc in message.tool_calls]
@@ -207,7 +207,7 @@ class JsonOutputToolsParser(BaseCumulativeTransformOutputParser[Any]):
Returns:
The parsed tool calls.
"""
raise NotImplementedError()
raise NotImplementedError
class JsonOutputKeyToolsParser(JsonOutputToolsParser):
@@ -285,10 +285,11 @@ class PydanticToolsParser(JsonOutputToolsParser):
for res in json_results:
try:
if not isinstance(res["args"], dict):
raise ValueError(
msg = (
f"Tool arguments must be specified as a dict, received: "
f"{res['args']}"
)
raise ValueError(msg)
pydantic_objects.append(name_dict[res["type"]](**res["args"]))
except (ValidationError, ValueError) as e:
if partial:

View File

@@ -29,10 +29,9 @@ class PydanticOutputParser(JsonOutputParser, Generic[TBaseModel]):
elif issubclass(self.pydantic_object, pydantic.v1.BaseModel):
return self.pydantic_object.parse_obj(obj)
else:
raise OutputParserException(
f"Unsupported model version for PydanticOutputParser: \
msg = f"Unsupported model version for PydanticOutputParser: \
{self.pydantic_object.__class__}"
)
raise OutputParserException(msg)
except (pydantic.ValidationError, pydantic.v1.ValidationError) as e:
raise self._parser_exception(e, obj) from e
else: # pydantic v1
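For context, this branch is the pydantic v1 side of a version dispatch: ``pydantic.v1`` models fall back to ``parse_obj``, while v2 models go through the v2 validation path in the branch just above the hunk. A usage sketch with a v2 model, assuming ``langchain_core`` is installed:

.. code-block:: python

    from langchain_core.output_parsers import PydanticOutputParser
    from pydantic import BaseModel


    class Person(BaseModel):
        name: str
        age: int


    parser = PydanticOutputParser(pydantic_object=Person)
    person = parser.parse('{"name": "Anna", "age": 31}')
    assert person == Person(name="Anna", age=31)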

View File

@@ -106,7 +106,7 @@ class BaseCumulativeTransformOutputParser(BaseTransformOutputParser[T]):
Returns:
The diff between the previous and current parsed output.
"""
raise NotImplementedError()
raise NotImplementedError
def _transform(self, input: Iterator[Union[str, BaseMessage]]) -> Iterator[Any]:
prev_parsed = None

Some files were not shown because too many files have changed in this diff.