[docs]: updating mistral and hugging face chat model pages (#24731)

Isaac Francisco 2024-08-02 15:21:25 -07:00 committed by GitHub
parent 4305f78e40
commit 2ae76cecde
2 changed files with 354 additions and 518 deletions


@ -1,29 +1,15 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Hugging Face\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# ChatHuggingFace\n",
"\n",
"This will help you getting started with `langchain_huggingface` [chat models](/docs/concepts/#chat-models). For detailed documentation of all `ChatHuggingFace` features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/langchain_huggingface.chat_models.huggingface.ChatHuggingFace.html). For a list of models supported by Hugging Face check out [this page](https://huggingface.co/models).\n",
"\n",
"## Overview\n",
"\n",
"This notebook shows how to get started using Hugging Face LLMs as chat models.\n",
"\n",
"In particular, we will:\n",
"1. Utilize the [HuggingFaceEndpoint](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/llms/huggingface_endpoint.py) integrations to instantiate an LLM.\n",
"2. Utilize the `ChatHuggingFace` class to enable any of these LLMs to interface with LangChain's [Chat Messages](/docs/concepts/#message-types) abstraction.\n",
"3. Explore tool calling with the `ChatHuggingFace`.\n",
"4. Demonstrate how to use an open-source LLM to power an `ChatAgent` pipeline\n",
"### Integration details\n",
"\n",
"### Integration details\n",
"\n",
@ -64,7 +50,22 @@
"source": [
"### Installation\n",
"\n",
"Below we install additional packages as well for demonstration purposes:"
"| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
"| [ChatHuggingFace](https://api.python.langchain.com/en/latest/chat_models/langchain_huggingface.chat_models.huggingface.ChatHuggingFace.html) | [langchain_huggingface](https://api.python.langchain.com/en/latest/huggingface_api_reference.html) | ✅ | ❌ | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain_huggingface?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain_huggingface?style=flat-square&label=%20) |\n",
"\n",
"### Model features\n",
"| [Tool calling](/docs/how_to/tool_calling) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",
"| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n",
"| ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | \n",
"\n",
"## Setup\n",
"\n",
"To access `langchain_huggingface` models you'll need to create a/an `Hugging Face` account, get an API key, and install the `langchain_huggingface` integration package.\n",
"\n",
"### Credentials\n",
"\n",
"You'll need to have a [Hugging Face Access Token](https://huggingface.co/docs/hub/security-tokens) saved as an environment variable: `HUGGINGFACEHUB_API_TOKEN`."
]
},
{
@ -73,14 +74,41 @@
"metadata": {},
"outputs": [],
"source": [
"%pip install --upgrade --quiet langchain-huggingface text-generation transformers google-search-results numexpr langchainhub sentencepiece jinja2"
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"HUGGINGFACEHUB_API_TOKEN\"] = getpass.getpass(\n",
" \"Enter your Hugging Face API key: \"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.1.2\u001b[0m\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install --upgrade --quiet langchain-huggingface text-generation transformers google-search-results numexpr langchainhub sentencepiece jinja2 bitsandbytes accelerate"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Instantiation"
"## Instantiation\n",
"\n",
"You can instantiate a `ChatHuggingFace` model in two different ways, either from a `HuggingFaceEndpoint` or from a `HuggingFacePipeline`."
]
},
{
@ -92,19 +120,32 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 10,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.\n",
"Token is valid (permission: fineGrained).\n",
"Your token has been saved to /Users/isaachershenson/.cache/huggingface/token\n",
"Login successful\n"
]
}
],
"source": [
"from langchain_huggingface import HuggingFaceEndpoint\n",
"from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint\n",
"\n",
"llm = HuggingFaceEndpoint(\n",
" repo_id=\"meta-llama/Meta-Llama-3-70B-Instruct\",\n",
" repo_id=\"HuggingFaceH4/zephyr-7b-beta\",\n",
" task=\"text-generation\",\n",
" max_new_tokens=512,\n",
" do_sample=False,\n",
" repetition_penalty=1.03,\n",
")"
")\n",
"\n",
"chat_model = ChatHuggingFace(llm=llm)"
]
},
{
@ -116,11 +157,194 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 9,
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "da32ae8ec8864ccfb480044fe2eec065",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"config.json: 0%| | 0.00/638 [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "ee1891b7e5f64fba88ba35f444e598fb",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"model.safetensors.index.json: 0%| | 0.00/23.9k [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "9ff1ec7f575b42adb608c15955de7888",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading shards: 0%| | 0/8 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "5214696698814b919f561647a684d1e4",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"model-00001-of-00008.safetensors: 0%| | 0.00/1.89G [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "9ac334c69a2048a0a77340cca44d8c80",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"model-00002-of-00008.safetensors: 0%| | 0.00/1.95G [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "465ad1a51d414e0daf1cd9308455be94",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"model-00003-of-00008.safetensors: 0%| | 0.00/1.98G [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "a329c43c3d574df0afd38c7457cc639c",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"model-00004-of-00008.safetensors: 0%| | 0.00/1.95G [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "a736a6c4023542af8c6ecc232b823d18",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"model-00005-of-00008.safetensors: 0%| | 0.00/1.98G [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "8bdee70b843d433e8236fff83ecda022",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"model-00006-of-00008.safetensors: 0%| | 0.00/1.95G [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "5ecb6103e0304ae188a14d598119a361",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"model-00007-of-00008.safetensors: 0%| | 0.00/1.98G [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "174e3cb487bd453c9c70d7614254a35e",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"model-00008-of-00008.safetensors: 0%| | 0.00/816M [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "28f8c233b04b45d7800e12c785a8c4bc",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Loading checkpoint shards: 0%| | 0/8 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "449dfa023dc8430fbcde94544ba01c4f",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"generation_config.json: 0%| | 0.00/111 [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from langchain_huggingface import HuggingFacePipeline\n",
"from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline\n",
"\n",
"llm = HuggingFacePipeline.from_model_id(\n",
" model_id=\"HuggingFaceH4/zephyr-7b-beta\",\n",
@ -129,8 +353,34 @@
" max_new_tokens=512,\n",
" do_sample=False,\n",
" repetition_penalty=1.03,\n",
" return_full_text=False,\n",
" ),\n",
")\n",
"\n",
"chat_model = ChatHuggingFace(llm=llm)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Instatiating with Quantization\n",
"\n",
"To run a quantized version of your model, you can specify a `bitsandbytes` quantization config as follows:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"from transformers import BitsAndBytesConfig\n",
"\n",
"quantization_config = BitsAndBytesConfig(\n",
" load_in_4bit=True,\n",
" bnb_4bit_quant_type=\"nf4\",\n",
" bnb_4bit_compute_dtype=\"float16\",\n",
" bnb_4bit_use_double_quant=True,\n",
")"
]
},
@ -138,30 +388,27 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"To run a quantized version, you might specify a `bitsandbytes` quantization config as follows:\n",
"\n",
"```python\n",
"from transformers import BitsAndBytesConfig\n",
"\n",
"quantization_config = BitsAndBytesConfig(\n",
" load_in_4bit=True,\n",
" bnb_4bit_quant_type=\"nf4\",\n",
" bnb_4bit_compute_dtype=\"float16\",\n",
" bnb_4bit_use_double_quant=True\n",
")\n",
"```\n",
"\n",
"and pass it to the `HuggingFacePipeline` as a part of its `model_kwargs`:\n",
"\n",
"```python\n",
"pipeline = HuggingFacePipeline(\n",
" ...\n",
"\n",
"and pass it to the `HuggingFacePipeline` as a part of its `model_kwargs`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm = HuggingFacePipeline.from_model_id(\n",
" model_id=\"HuggingFaceH4/zephyr-7b-beta\",\n",
" task=\"text-generation\",\n",
" pipeline_kwargs=dict(\n",
" max_new_tokens=512,\n",
" do_sample=False,\n",
" repetition_penalty=1.03,\n",
" ),\n",
" model_kwargs={\"quantization_config\": quantization_config},\n",
" \n",
" ...\n",
")\n",
"```"
"\n",
"chat_model = ChatHuggingFace(llm=llm)"
]
},
{
@ -171,34 +418,16 @@
"## Invocation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Instantiate the chat model and some messages to pass. \n",
"\n",
"**Note**: you need to pass the `model_id` explicitly if you are using self-hosted `text-generation-inference`"
]
},
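{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, here is a minimal sketch of passing `model_id` explicitly; the endpoint URL below is a hypothetical placeholder for your own TGI deployment:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical self-hosted text-generation-inference endpoint;\n",
"# substitute your own URL and the model_id the server is hosting.\n",
"llm = HuggingFaceEndpoint(\n",
"    endpoint_url=\"http://localhost:8080/\",\n",
"    task=\"text-generation\",\n",
"    max_new_tokens=512,\n",
")\n",
"\n",
"chat_model = ChatHuggingFace(llm=llm, model_id=\"HuggingFaceH4/zephyr-7b-beta\")"
]
},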
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.\n"
]
}
],
"outputs": [],
"source": [
"from langchain_core.messages import (\n",
" HumanMessage,\n",
" SystemMessage,\n",
")\n",
"from langchain_huggingface import ChatHuggingFace\n",
"\n",
"messages = [\n",
" SystemMessage(content=\"You're a helpful assistant\"),\n",
@ -207,343 +436,35 @@
" ),\n",
"]\n",
"\n",
"chat_model = ChatHuggingFace(llm=llm)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Check the `model_id`"
"ai_msg = chat_model.invoke(messages)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'meta-llama/Meta-Llama-3-70B-Instruct'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat_model.model_id"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Inspect how the chat messages are formatted for the LLM call."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\\n\\nYou're a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>\\n\\nWhat happens when an unstoppable force meets an immovable object?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\\n\\n\""
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chat_model._to_chat_prompt(messages)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Call the model."
]
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"One of the classic thought experiments in physics!\n",
"According to the popular phrase and hypothetical scenario, when an unstoppable force meets an immovable object, a paradoxical situation arises as both forces are seemingly contradictory. On one hand, an unstoppable force is an entity that cannot be stopped or prevented from moving forward, while on the other hand, an immovable object is something that cannot be moved or displaced from its position. \n",
"\n",
"The concept of an unstoppable force meeting an immovable object is a paradox that has puzzled philosophers and physicists for centuries. It's a mind-bending scenario that challenges our understanding of the fundamental laws of physics.\n",
"\n",
"In essence, an unstoppable force is something that cannot be halted or slowed down, while an immovable object is something that cannot be moved or displaced. If we assume that both entities exist in the same universe, we run into a logical contradiction.\n",
"\n",
"Here\n"
"In this scenario, it is un\n"
]
}
],
"source": [
"res = chat_model.invoke(messages)\n",
"print(res.content)"
"print(ai_msg.content)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Chaining\n",
"## API reference\n",
"\n",
"We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"\n",
"prompt = ChatPromptTemplate(\n",
" [\n",
" (\n",
" \"system\",\n",
" \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n",
" ),\n",
" (\"human\", \"{input}\"),\n",
" ]\n",
")\n",
"\n",
"chain = prompt | llm\n",
"chain.invoke(\n",
" {\n",
" \"input_language\": \"English\",\n",
" \"output_language\": \"German\",\n",
" \"input\": \"I love programming.\",\n",
" }\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tool calling with `ChatHuggingFace`\n",
"\n",
"`text-generation-inference` supports tool with open source LLMs starting from v2.0.1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a basic tool (`Calculator`):"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"class Calculator(BaseModel):\n",
" \"\"\"Multiply two integers together.\"\"\"\n",
"\n",
" a: int = Field(..., description=\"First integer\")\n",
" b: int = Field(..., description=\"Second integer\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Bind the tool to the `chat_model` and give it a try:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Calculator(a=3, b=12)]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_core.output_parsers.openai_tools import PydanticToolsParser\n",
"\n",
"llm_with_multiply = chat_model.bind_tools([Calculator], tool_choice=\"auto\")\n",
"parser = PydanticToolsParser(tools=[Calculator])\n",
"tool_chain = llm_with_multiply | parser\n",
"tool_chain.invoke(\"How much is 3 multiplied by 12?\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Use with agents\n",
"\n",
"Here we'll test out `Zephyr-7B-beta` as a zero-shot `ReAct` Agent. \n",
"\n",
"The agent is based on the paper [ReAct: Synergizing Reasoning and Acting in Language Models](https://arxiv.org/abs/2210.03629)\n",
"\n",
"The example below is taken from [here](https://python.langchain.com/v0.1/docs/modules/agents/agent_types/react/#using-chat-models).\n",
"\n",
 Note: To run this section,">
"> Note: To run this section, you'll need to have a [SerpAPI Token](https://serpapi.com/) saved as an environment variable: `SERPAPI_API_KEY`. You can set it as shown in the next cell."
]
},
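{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch for setting the key interactively, assuming it isn't already exported in your shell:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"# Prompt for the SerpAPI key only if it isn't already set in the environment\n",
"if not os.getenv(\"SERPAPI_API_KEY\"):\n",
"    os.environ[\"SERPAPI_API_KEY\"] = getpass.getpass(\"Enter your SerpAPI key: \")"
]
},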
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain import hub\n",
"from langchain.agents import AgentExecutor, load_tools\n",
"from langchain.agents.format_scratchpad import format_log_to_str\n",
"from langchain.agents.output_parsers import (\n",
" ReActJsonSingleInputOutputParser,\n",
")\n",
"from langchain.tools.render import render_text_description\n",
"from langchain_community.utilities import SerpAPIWrapper"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Configure the agent with a `react-json` style prompt and access to a search engine and calculator."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# setup tools\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm)\n",
"\n",
"# setup ReAct style prompt\n",
"prompt = hub.pull(\"hwchase17/react-json\")\n",
"prompt = prompt.partial(\n",
" tools=render_text_description(tools),\n",
" tool_names=\", \".join([t.name for t in tools]),\n",
")\n",
"\n",
"# define the agent\n",
"chat_model_with_stop = chat_model.bind(stop=[\"\\nObservation\"])\n",
"agent = (\n",
" {\n",
" \"input\": lambda x: x[\"input\"],\n",
" \"agent_scratchpad\": lambda x: format_log_to_str(x[\"intermediate_steps\"]),\n",
" }\n",
" | prompt\n",
" | chat_model_with_stop\n",
" | ReActJsonSingleInputOutputParser()\n",
")\n",
"\n",
"# instantiate AgentExecutor\n",
"agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mQuestion: Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?\n",
"\n",
"Thought: I need to use the Search tool to find out who Leo DiCaprio's current girlfriend is. Then, I can use the Calculator tool to raise her current age to the power of 0.43.\n",
"\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"Search\",\n",
" \"action_input\": \"leo dicaprio girlfriend\"\n",
"}\n",
"```\n",
"\u001b[0m\u001b[36;1m\u001b[1;3mLeonardo DiCaprio may have found The One in Vittoria Ceretti. “They are in love,” a source exclusively reveals in the latest issue of Us Weekly. “Leo was clearly very proud to be showing Vittoria off and letting everyone see how happy they are together.”\u001b[0m\u001b[32;1m\u001b[1;3mNow that we know Leo DiCaprio's current girlfriend is Vittoria Ceretti, let's find out her current age.\n",
"\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"Search\",\n",
" \"action_input\": \"vittoria ceretti age\"\n",
"}\n",
"```\n",
"\u001b[0m\u001b[36;1m\u001b[1;3m25 years\u001b[0m\u001b[32;1m\u001b[1;3mNow that we know Vittoria Ceretti's current age is 25, let's use the Calculator tool to raise it to the power of 0.43.\n",
"\n",
"Action:\n",
"```\n",
"{\n",
" \"action\": \"Calculator\",\n",
" \"action_input\": \"25^0.43\"\n",
"}\n",
"```\n",
"\u001b[0m\u001b[33;1m\u001b[1;3mAnswer: 3.991298452658078\u001b[0m\u001b[32;1m\u001b[1;3mFinal Answer: Vittoria Ceretti, Leo DiCaprio's current girlfriend, when raised to the power of 0.43 is approximately 4.0 rounded to two decimal places. Her current age is 25 years old.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"{'input': \"Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?\",\n",
" 'output': \"Vittoria Ceretti, Leo DiCaprio's current girlfriend, when raised to the power of 0.43 is approximately 4.0 rounded to two decimal places. Her current age is 25 years old.\"}"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent_executor.invoke(\n",
" {\n",
" \"input\": \"Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?\"\n",
" }\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Wahoo! Our open-source 7b parameter Zephyr model was able to:\n",
"\n",
"1. Plan out a series of actions: `I need to use the Search tool to find out who Leo DiCaprio's current girlfriend is. Then, I can use the Calculator tool to raise her current age to the power of 0.43.`\n",
"2. Then execute a search using the SerpAPI tool to find who Leo DiCaprio's current girlfriend is\n",
"3. Execute another search to find her age\n",
"4. And finally use a calculator tool to calculate her age raised to the power of 0.43\n",
"\n",
"It's exciting to see how far open-source LLM's can go as general purpose reasoning agents. Give it a try yourself!"
"For detailed documentation of all `ChatHuggingFace` features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/langchain_huggingface.chat_models.huggingface.ChatHuggingFace.html"
]
},
{
@ -572,7 +493,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"version": "3.11.9"
}
},
"nbformat": 4,


@ -12,12 +12,12 @@
},
{
"cell_type": "markdown",
"id": "a14c83bf-af26-4f22-8c1a-d632c5795ecf",
"id": "d295c2a2",
"metadata": {},
"source": [
"# MistralAI\n",
"# ChatMistralAI\n",
"\n",
"This will help you getting started with Mistral [chat models](/docs/concepts/#chat-models), accessed via their [API](https://docs.mistral.ai/api/). For detailed documentation of all ChatMistralAI features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/langchain_mistralai.chat_models.ChatMistralAI.html).\n",
"This will help you getting started with Mistral [chat models](/docs/concepts/#chat-models). For detailed documentation of all `ChatMistralAI` features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/langchain_mistralai.chat_models.ChatMistralAI.html). The `ChatMistralAI` class is built on top of the [Mistral API](https://docs.mistral.ai/api/). For a list of all the models supported by Mistral, check out [this page](https://docs.mistral.ai/getting-started/models/).\n",
"\n",
"## Overview\n",
"### Integration details\n",
@ -29,36 +29,35 @@
"### Model features\n",
"| [Tool calling](/docs/how_to/tool_calling) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",
"| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n",
"| ✅ | ✅ | | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | \n",
"| ✅ | ✅ | | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | \n",
"\n",
"## Setup\n",
"\n",
"To access Mistral models you'll need to create a Mistral account, get an API key, and install the `langchain-mistralai` integration package.\n",
"\n",
"To access `ChatMistralAI` models you'll need to create a Mistral account, get an API key, and install the `langchain_mistralai` integration package.\n",
"\n",
"### Credentials\n",
"\n",
"A valid [API key](https://console.mistral.ai/users/api-keys/) is needed to communicate with the API. Once you've obtained an API key, store it in the `MISTRAL_API_KEY` environment variable:"
"\n",
"A valid [API key](https://console.mistral.ai/users/api-keys/) is needed to communicate with the API. Once you've done this set the MISTRAL_API_KEY environment variable:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9acd8340-09d4-4ece-871a-a35b0732c7d8",
"id": "2461605e",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"if not os.getenv(\"__MODULE_NAME___API_KEY\"):\n",
" os.environ[\"__MODULE_NAME___API_KEY\"] = getpass.getpass(\n",
" \"Enter your __ModuleName__ API key: \"\n",
" )"
"os.environ[\"MISTRAL_API_KEY\"] = getpass.getpass(\"Enter your Mistral API key: \")"
]
},
{
"cell_type": "markdown",
"id": "42c979b1-df49-4f6c-9fe6-d9dbf3ea8c2a",
"id": "788f37ac",
"metadata": {},
"source": [
"If you want to get automated tracing of your model calls you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:"
@ -67,37 +66,37 @@
{
"cell_type": "code",
"execution_count": null,
"id": "cc4f11ec-5cb3-4caf-b3cd-7a20c41b0cfe",
"id": "007209d5",
"metadata": {},
"outputs": [],
"source": [
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")"
"# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")\n",
"# os.environ[\"LANGSMITH_TRACING\"] = \"true\""
]
},
{
"cell_type": "markdown",
"id": "0fc42221-97b2-466b-95db-10368e17ca56",
"id": "0f5c74f9",
"metadata": {},
"source": [
"### Installation\n",
"\n",
"The LangChain MistralAI integration lives in the `langchain-mistralai` package:"
"The LangChain Mistral integration lives in the `langchain_mistralai` package:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "85cb1ab8-9f2c-4b93-8415-ad65819dcb38",
"id": "1ab11a65",
"metadata": {},
"outputs": [],
"source": [
"%pip install -qU langchain-mistralai"
"%pip install -qU langchain_mistralai"
]
},
{
"cell_type": "markdown",
"id": "502127fd",
"id": "fb1a335e",
"metadata": {},
"source": [
"## Instantiation\n",
@ -107,19 +106,24 @@
},
{
"cell_type": "code",
"execution_count": 1,
"id": "2dfa801a-d040-4c09-9634-58604e8eaf16",
"execution_count": 5,
"id": "e6c38580",
"metadata": {},
"outputs": [],
"source": [
"from langchain_mistralai.chat_models import ChatMistralAI\n",
"from langchain_mistralai import ChatMistralAI\n",
"\n",
"llm = ChatMistralAI(model=\"mistral-large-latest\")"
"llm = ChatMistralAI(\n",
" model=\"mistral-large-latest\",\n",
" temperature=0,\n",
" max_retries=2,\n",
" # other params...\n",
")"
]
},
{
"cell_type": "markdown",
"id": "f668acff-eb14-4b3a-959a-df5bfc02968b",
"id": "aec79099",
"metadata": {},
"source": [
"## Invocation"
@ -127,17 +131,17 @@
},
{
"cell_type": "code",
"execution_count": 2,
"id": "86e3f9e6-67ec-4fbf-8ff1-85331200f412",
"execution_count": 6,
"id": "8838c3cc",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"J'adore la programmation.\", response_metadata={'token_usage': {'prompt_tokens': 27, 'total_tokens': 36, 'completion_tokens': 9}, 'model': 'mistral-large-latest', 'finish_reason': 'stop'}, id='run-d6196c33-9410-413b-b454-4ed0bec1f0c7-0', usage_metadata={'input_tokens': 27, 'output_tokens': 9, 'total_tokens': 36})"
"AIMessage(content='Sure, I\\'d be happy to help you translate that sentence into French! The English sentence \"I love programming\" translates to \"J\\'aime programmer\" in French. Let me know if you have any other questions or need further assistance!', response_metadata={'token_usage': {'prompt_tokens': 32, 'total_tokens': 84, 'completion_tokens': 52}, 'model': 'mistral-small', 'finish_reason': 'stop'}, id='run-64bac156-7160-4b68-b67e-4161f63e021f-0', usage_metadata={'input_tokens': 32, 'output_tokens': 52, 'total_tokens': 84})"
]
},
"execution_count": 2,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
@ -156,15 +160,15 @@
},
{
"cell_type": "code",
"execution_count": 3,
"id": "8f8a24bc-b7f0-4d3a-b310-8a4e0ba125dd",
"execution_count": 7,
"id": "bbf6a048",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"J'adore la programmation.\n"
"Sure, I'd be happy to help you translate that sentence into French! The English sentence \"I love programming\" translates to \"J'aime programmer\" in French. Let me know if you have any other questions or need further assistance!\n"
]
}
],
@ -174,116 +178,27 @@
},
{
"cell_type": "markdown",
"id": "c361ab1e-8c0c-4206-9e3c-9d1424a12b9c",
"metadata": {},
"source": [
"### Async"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "c5fac0e9-05a4-4fc1-a3b3-e5bbb24b971b",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"J'aime programmer.\", response_metadata={'token_usage': {'prompt_tokens': 27, 'total_tokens': 34, 'completion_tokens': 7}, 'model': 'mistral-large-latest', 'finish_reason': 'stop'}, id='run-1873888a-186f-49a8-ab81-24335bd3099b-0', usage_metadata={'input_tokens': 27, 'output_tokens': 7, 'total_tokens': 34})"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"await llm.ainvoke(messages)"
]
},
{
"cell_type": "markdown",
"id": "86ccef97",
"metadata": {},
"source": [
"### Streaming\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "025be980-e50d-4a68-93dc-c9c7b500ce34",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"J'adore programmer."
]
}
],
"source": [
"for chunk in llm.stream(messages):\n",
" print(chunk.content, end=\"\")"
]
},
{
"cell_type": "markdown",
"id": "f6189577",
"metadata": {},
"source": [
"### Batch"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "e63aebcb",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[AIMessage(content=\"J'adore la programmation.\", response_metadata={'token_usage': {'prompt_tokens': 27, 'total_tokens': 36, 'completion_tokens': 9}, 'model': 'mistral-large-latest', 'finish_reason': 'stop'}, id='run-2aa2a189-c405-4cf5-bd31-e9025e4c8536-0', usage_metadata={'input_tokens': 27, 'output_tokens': 9, 'total_tokens': 36})]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm.batch([messages])"
]
},
{
"cell_type": "markdown",
"id": "38e39e71",
"id": "32b87f87",
"metadata": {},
"source": [
"## Chaining\n",
"\n",
"You can also easily combine with a prompt template for easy structuring of user input. We can do this using [LCEL](/docs/concepts#langchain-expression-language-lcel)"
"We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "ee43a1ae",
"execution_count": 8,
"id": "24e2c51c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='Ich liebe Programmieren.', response_metadata={'token_usage': {'prompt_tokens': 21, 'total_tokens': 28, 'completion_tokens': 7}, 'model': 'mistral-large-latest', 'finish_reason': 'stop'}, id='run-409ebc9a-b4a0-4734-ab6f-e11f6b4f808f-0', usage_metadata={'input_tokens': 21, 'output_tokens': 7, 'total_tokens': 28})"
"AIMessage(content='Ich liebe Programmierung. (German translation)', response_metadata={'token_usage': {'prompt_tokens': 26, 'total_tokens': 38, 'completion_tokens': 12}, 'model': 'mistral-small', 'finish_reason': 'stop'}, id='run-dfd4094f-e347-47b0-9056-8ebd7ea35fe7-0', usage_metadata={'input_tokens': 26, 'output_tokens': 12, 'total_tokens': 38})"
]
},
"execution_count": 7,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
@ -291,7 +206,7 @@
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"\n",
"prompt = ChatPromptTemplate(\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\n",
" \"system\",\n",
@ -313,12 +228,12 @@
},
{
"cell_type": "markdown",
"id": "eb7e01fb-a433-48b1-a4c2-e6009523a896",
"id": "cb9b5834",
"metadata": {},
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all ChatMistralAI features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/langchain_mistralai.chat_models.ChatMistralAI.html"
"Head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/langchain_mistralai.chat_models.ChatMistralAI.html) for detailed documentation of all attributes and methods."
]
}
],
@ -338,7 +253,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"version": "3.11.9"
}
},
"nbformat": 4,