Compare commits


16 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Mason Daugherty | 91e825b92c | Merge branch 'wip-v0.4' into cc/0.4/docs | 2025-08-11 15:11:33 -04:00 |
| Mason Daugherty | 3c5cc349b6 | docs: v0.4 top level refinements (#32474) | 2025-08-11 09:15:59 -04:00 |
| Mason Daugherty | 5cfb7ce57b | docs(ollama): update Ollama integration documentation for new chat model (#32475) | 2025-08-11 09:13:54 -04:00 |
| Mason Daugherty | 978119ef3c | Merge branch 'wip-v0.4' into cc/0.4/docs | 2025-08-08 17:28:54 -04:00 |
| Chester Curme | dd68b762d9 | headers -> details | 2025-08-08 14:52:25 -04:00 |
| Chester Curme | c784f63701 | details -> columns | 2025-08-08 14:50:51 -04:00 |
| Chester Curme | aed20287af | x | 2025-08-08 14:29:20 -04:00 |
| Chester Curme | 5ada33b3e6 | x | 2025-08-08 14:06:44 -04:00 |
| Chester Curme | a1c79711b3 | update | 2025-08-08 12:50:36 -04:00 |
| Chester Curme | 1dc22c602e | update | 2025-08-08 11:00:03 -04:00 |
| Chester Curme | 18732e5b8b | fix sidebar | 2025-08-08 10:57:39 -04:00 |
| Chester Curme | 8f19ca30b0 | update migration guides | 2025-08-08 10:15:16 -04:00 |
| Chester Curme | a369b3aed5 | update sidebar label | 2025-08-06 16:43:18 -04:00 |
| Chester Curme | 5eec2207c0 | update docusaurus config | 2025-08-06 16:27:25 -04:00 |
| Chester Curme | 9b468a10a5 | update vercel.json | 2025-08-06 16:11:17 -04:00 |
| Chester Curme | b7494d6566 | x | 2025-08-06 15:53:06 -04:00 |
12 changed files with 815 additions and 174 deletions

View File

@@ -45,8 +45,8 @@
"A few frameworks for this have emerged to support inference of open-source LLMs on various devices:\n",
"\n",
"1. [`llama.cpp`](https://github.com/ggerganov/llama.cpp): C++ implementation of llama inference code with [weight optimization / quantization](https://finbarr.ca/how-is-llama-cpp-possible/)\n",
"2. [`gpt4all`](https://docs.gpt4all.io/index.html): Optimized C backend for inference\n",
"3. [`Ollama`](https://ollama.ai/): Bundles model weights and environment into an app that runs on device and serves the LLM\n",
"2. [`gpt4all`](https://github.com/nomic-ai/gpt4all): Optimized C backend for inference\n",
"3. [`ollama`](https://github.com/ollama/ollama): Bundles model weights and environment into an app that runs on device and serves the LLM\n",
"4. [`llamafile`](https://github.com/Mozilla-Ocho/llamafile): Bundles model weights and everything needed to run the model in a single file, allowing you to run the LLM locally from this file without any additional installation steps\n",
"\n",
"In general, these frameworks will do a few things:\n",
@@ -74,12 +74,12 @@
"\n",
"## Quickstart\n",
"\n",
"[`Ollama`](https://ollama.ai/) is one way to easily run inference on macOS.\n",
"[Ollama](https://ollama.ai/) is one way to easily run inference on macOS.\n",
" \n",
"The instructions [here](https://github.com/jmorganca/ollama?tab=readme-ov-file#ollama) provide details, which we summarize:\n",
"The instructions [here](https://github.com/ollama/ollama?tab=readme-ov-file#ollama) provide details, which we summarize:\n",
" \n",
"* [Download and run](https://ollama.ai/download) the app\n",
"* From command line, fetch a model from this [list of options](https://github.com/jmorganca/ollama): e.g., `ollama pull llama3.1:8b`\n",
"* From command line, fetch a model from this [list of options](https://ollama.com/search): e.g., `ollama pull llama3.1:8b`\n",
"* When the app is running, all models are automatically served on `localhost:11434`\n"
]
},
@@ -111,11 +111,11 @@
}
],
"source": [
"from langchain_ollama import OllamaLLM\n",
"from langchain_ollama import ChatOllama\n",
"\n",
"llm = OllamaLLM(model=\"llama3.1:8b\")\n",
"llm = ChatOllama(model=\"gpt-oss:20b\")\n",
"\n",
"llm.invoke(\"The first man on the moon was ...\")"
"llm.invoke(\"The first man on the moon was ...\").content"
]
},
{
@@ -149,40 +149,7 @@
],
"source": [
"for chunk in llm.stream(\"The first man on the moon was ...\"):\n",
" print(chunk, end=\"|\", flush=True)"
]
},
{
"cell_type": "markdown",
"id": "e5731060",
"metadata": {},
"source": [
"Ollama also includes a chat model wrapper that handles formatting conversation turns:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "f14a778a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='The answer is a historic one!\\n\\nThe first man to walk on the Moon was Neil Armstrong, an American astronaut and commander of the Apollo 11 mission. On July 20, 1969, Armstrong stepped out of the lunar module Eagle onto the surface of the Moon, famously declaring:\\n\\n\"That\\'s one small step for man, one giant leap for mankind.\"\\n\\nArmstrong was followed by fellow astronaut Edwin \"Buzz\" Aldrin, who also walked on the Moon during the mission. Michael Collins remained in orbit around the Moon in the command module Columbia.\\n\\nNeil Armstrong passed away on August 25, 2012, but his legacy as a pioneering astronaut and engineer continues to inspire people around the world!', response_metadata={'model': 'llama3.1:8b', 'created_at': '2024-08-01T00:38:29.176717Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 10681861417, 'load_duration': 34270292, 'prompt_eval_count': 19, 'prompt_eval_duration': 6209448000, 'eval_count': 141, 'eval_duration': 4432022000}, id='run-7bed57c5-7f54-4092-912c-ae49073dcd48-0', usage_metadata={'input_tokens': 19, 'output_tokens': 141, 'total_tokens': 160})"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_ollama import ChatOllama\n",
"\n",
"chat_model = ChatOllama(model=\"llama3.1:8b\")\n",
"\n",
"chat_model.invoke(\"Who was the first man on the moon?\")"
" print(chunk.text(), end=\"|\", flush=True)"
]
},
{
@@ -200,7 +167,7 @@
"\n",
"### Running Apple silicon GPU\n",
"\n",
"`Ollama` and [`llamafile`](https://github.com/Mozilla-Ocho/llamafile?tab=readme-ov-file#gpu-support) will automatically utilize the GPU on Apple devices.\n",
"`ollama` and [`llamafile`](https://github.com/Mozilla-Ocho/llamafile?tab=readme-ov-file#gpu-support) will automatically utilize the GPU on Apple devices.\n",
" \n",
"Other frameworks require the user to set up the environment to utilize the Apple GPU.\n",
"\n",
@@ -212,15 +179,15 @@
"\n",
"In particular, ensure that conda is using the correct virtual environment that you created (`miniforge3`).\n",
"\n",
"E.g., for me:\n",
"e.g.,\n",
"\n",
"```\n",
"```shell\n",
"conda activate /Users/rlm/miniforge3/envs/llama\n",
"```\n",
"\n",
"With the above confirmed, then:\n",
"\n",
"```\n",
"```shell\n",
"CMAKE_ARGS=\"-DLLAMA_METAL=on\" FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir\n",
"```"
]
@@ -234,17 +201,13 @@
"\n",
"There are various ways to gain access to quantized model weights.\n",
"\n",
"1. [`HuggingFace`](https://huggingface.co/TheBloke) - Many quantized model are available for download and can be run with framework such as [`llama.cpp`](https://github.com/ggerganov/llama.cpp). You can also download models in [`llamafile` format](https://huggingface.co/models?other=llamafile) from HuggingFace.\n",
"2. [`gpt4all`](https://gpt4all.io/index.html) - The model explorer offers a leaderboard of metrics and associated quantized models available for download \n",
"3. [`Ollama`](https://github.com/jmorganca/ollama) - Several models can be accessed directly via `pull`\n",
"1. [HuggingFace](https://huggingface.co/TheBloke) - Many quantized model are available for download and can be run with framework such as [`llama.cpp`](https://github.com/ggerganov/llama.cpp). You can also download models in [`llamafile` format](https://huggingface.co/models?other=llamafile) from HuggingFace.\n",
"2. [gpt4all](https://gpt4all.io/index.html) - The model explorer offers a leaderboard of metrics and associated quantized models available for download \n",
"3. [ollama](https://github.com/ollama/ollama) - Several models can be accessed directly via `pull`\n",
"\n",
"### Ollama\n",
"\n",
"With [Ollama](https://github.com/jmorganca/ollama), fetch a model via `ollama pull <model family>:<tag>`:\n",
"\n",
"* E.g., for Llama 2 7b: `ollama pull llama2` will download the most basic version of the model (e.g., smallest # parameters and 4 bit quantization)\n",
"* We can also specify a particular version from the [model list](https://github.com/jmorganca/ollama?tab=readme-ov-file#model-library), e.g., `ollama pull llama2:13b`\n",
"* See the full set of parameters on the [API reference page](https://python.langchain.com/api_reference/community/llms/langchain_community.llms.ollama.Ollama.html)"
"With [Ollama](https://github.com/ollama/ollama), fetch a model via `ollama pull <model family>:<tag>`:"
]
},
{
@@ -265,7 +228,7 @@
}
],
"source": [
"llm = OllamaLLM(model=\"llama2:13b\")\n",
"llm = ChatOllama(model=\"gpt-oss:20b\")\n",
"llm.invoke(\"The first man on the moon was ... think step by step\")"
]
},
@@ -684,12 +647,6 @@
"\n",
"In addition, [here](https://blog.langchain.dev/using-langsmith-to-support-fine-tuning-of-open-source-llms/) is an overview on fine-tuning, which can utilize open-source LLMs."
]
},
{
"cell_type": "markdown",
"id": "14c2c170",
"metadata": {},
"source": []
}
],
"metadata": {

View File

@@ -17,25 +17,29 @@
"source": [
"# ChatOllama\n",
"\n",
"[Ollama](https://ollama.ai/) allows you to run open-source large language models, such as Llama 2, locally.\n",
"[Ollama](https://ollama.com/) allows you to run open-source large language models, such as `gpt-oss`, locally.\n",
"\n",
"Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.\n",
"`ollama` bundles model weights, configuration, and data into a single package, defined by a Modelfile.\n",
"\n",
"It optimizes setup and configuration details, including GPU usage.\n",
"\n",
"For a complete list of supported models and model variants, see the [Ollama model library](https://github.com/jmorganca/ollama#model-library).\n",
"For a complete list of supported models and model variants, see the [Ollama model library](https://ollama.com/search).\n",
"\n",
":::warning\n",
"This page is for the new v1 `ChatOllama` class with standard content block output. If you are looking for the legacy v0 `Ollama` class, see the [v0.3 documentation](https://python.langchain.com/v0.3/docs/integrations/chat/ollama/).\n",
":::\n",
"\n",
"## Overview\n",
"### Integration details\n",
"\n",
"| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/v0.2/docs/integrations/chat/ollama) | Package downloads | Package latest |\n",
"| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/docs/integrations/chat/ollama/) | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
"| [ChatOllama](https://python.langchain.com/v0.2/api_reference/ollama/chat_models/langchain_ollama.chat_models.ChatOllama.html) | [langchain-ollama](https://python.langchain.com/v0.2/api_reference/ollama/index.html) | ✅ | ❌ | ✅ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-ollama?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-ollama?style=flat-square&label=%20) |\n",
"| [ChatOllama](https://python.langchain.com/api_reference/ollama/chat_models/langchain_ollama.chat_models.ChatOllama.html#chatollama) | [langchain-ollama](https://python.langchain.com/api_reference/ollama/index.html) | ✅ | ❌ | ✅ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-ollama?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-ollama?style=flat-square&label=%20) |\n",
"\n",
"### Model features\n",
"| [Tool calling](/docs/how_to/tool_calling/) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",
"| :---: |:----------------------------------------------------:| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n",
"| ✅ | ✅ | ✅ | | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ |\n",
"| ✅ | ✅ | ✅ | | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ |\n",
"\n",
"## Setup\n",
"\n",
@@ -45,17 +49,17 @@
" * macOS users can install via Homebrew with `brew install ollama` and start with `brew services start ollama`\n",
"* Fetch available LLM model via `ollama pull <name-of-model>`\n",
" * View a list of available models via the [model library](https://ollama.ai/library)\n",
" * e.g., `ollama pull llama3`\n",
" * e.g., `ollama pull gpt-oss:20b`\n",
"* This will download the default tagged version of the model. Typically, the default points to the latest, smallest sized-parameter model.\n",
"\n",
"> On Mac, the models will be download to `~/.ollama/models`\n",
">\n",
"> On Linux (or WSL), the models will be stored at `/usr/share/ollama/.ollama/models`\n",
"\n",
"* Specify the exact version of the model of interest as such `ollama pull vicuna:13b-v1.5-16k-q4_0` (View the [various tags for the `Vicuna`](https://ollama.ai/library/vicuna/tags) model in this instance)\n",
"* Specify the exact version of the model of interest as such `ollama pull gpt-oss:20b`\n",
"* To view all pulled models, use `ollama list`\n",
"* To chat directly with a model from the command line, use `ollama run <name-of-model>`\n",
"* View the [Ollama documentation](https://github.com/ollama/ollama/tree/main/docs) for more commands. You can run `ollama help` in the terminal to see available commands.\n"
"* View the [Ollama documentation](https://github.com/ollama/ollama/blob/main/docs/README.md) for more commands. You can run `ollama help` in the terminal to see available commands.\n"
]
},
{
@@ -102,7 +106,11 @@
"id": "b18bd692076f7cf7",
"metadata": {},
"source": [
"Make sure you're using the latest Ollama version for structured outputs. Update by running:"
":::warning\n",
"Make sure you're using the latest Ollama client version!\n",
":::\n",
"\n",
"Update by running:"
]
},
{
@@ -127,15 +135,16 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 2,
"id": "cb09c344-1836-4e0c-acf8-11d13ac1dbae",
"metadata": {},
"outputs": [],
"source": [
"from langchain_ollama import ChatOllama\n",
"from langchain_ollama.v1 import ChatOllama\n",
"\n",
"llm = ChatOllama(\n",
" model=\"llama3.1\",\n",
" model=\"gpt-oss:20b\",\n",
" validate_model_on_init=True,\n",
" temperature=0,\n",
" # other params...\n",
")"
@@ -158,46 +167,56 @@
},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='The translation of \"I love programming\" in French is:\\n\\n\"J\\'adore le programmation.\"', additional_kwargs={}, response_metadata={'model': 'llama3.1', 'created_at': '2025-06-25T18:43:00.483666Z', 'done': True, 'done_reason': 'stop', 'total_duration': 619971208, 'load_duration': 27793125, 'prompt_eval_count': 35, 'prompt_eval_duration': 36354583, 'eval_count': 22, 'eval_duration': 555182667, 'model_name': 'llama3.1'}, id='run--348bb5ef-9dd9-4271-bc7e-a9ddb54c28c1-0', usage_metadata={'input_tokens': 35, 'output_tokens': 22, 'total_tokens': 57})"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
"name": "stdout",
"output_type": "stream",
"text": [
"AIMessage(type='ai', name=None, id='lc_run--5521db11-a5eb-4e46-956c-1455151cdaa3-0', lc_version='v1', content=[{'type': 'text', 'text': 'The translation of \"I love programming\" to French is:\\n\\n\"Je aime le programmation\"\\n\\nHowever, a more common and idiomatic way to express this in French would be:\\n\\n\"J\\'aime programmer\"\\n\\nThis phrase uses the verb \"aimer\" (to love) in the present tense, which is more suitable for expressing a general feeling or preference.'}], usage_metadata={'input_tokens': 34, 'output_tokens': 73, 'total_tokens': 107}, response_metadata={'model_name': 'llama3.2', 'created_at': '2025-08-08T23:07:44.439483Z', 'done': True, 'done_reason': 'stop', 'total_duration': 1410566833, 'load_duration': 28419542, 'prompt_eval_count': 34, 'prompt_eval_duration': 141642125, 'eval_count': 73, 'eval_duration': 1240075000}, parsed=None)\n",
"\n",
"Content:\n",
"The translation of \"I love programming\" to French is:\n",
"\n",
"\"Je aime le programmation\"\n",
"\n",
"However, a more common and idiomatic way to express this in French would be:\n",
"\n",
"\"J'aime programmer\"\n",
"\n",
"This phrase uses the verb \"aimer\" (to love) in the present tense, which is more suitable for expressing a general feeling or preference.\n"
]
}
],
"source": [
"messages = [\n",
" (\n",
" \"system\",\n",
" \"You are a helpful assistant that translates English to French. Translate the user sentence.\",\n",
" ),\n",
" (\"human\", \"I love programming.\"),\n",
"]\n",
"ai_msg = llm.invoke(messages)\n",
"ai_msg"
"ai_msg = llm.invoke(\"Translate 'I love programming' to French.\")\n",
"print(f\"{ai_msg}\\n\")\n",
"print(f\"Content:\\n{ai_msg.text}\")"
]
},
{
"cell_type": "markdown",
"id": "ede35e47",
"metadata": {},
"source": [
"## Streaming"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "d86145b3-bfef-46e8-b227-4dda5c9c2705",
"execution_count": 10,
"id": "77474829",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The translation of \"I love programming\" in French is:\n",
"\n",
"\"J'adore le programmation.\"\n"
"Hi| there|!| I|'m| just| a| chat|bot|,| so| I| don|'t| have| feelings|,| but| I|'m| here| and| ready| to| help| you| with| anything| you| need|!| How| can| I| assist| you| today|?| 😊|"
]
}
],
"source": [
"print(ai_msg.content)"
"for chunk in llm.stream(\"How are you doing?\"):\n",
" if chunk.text:\n",
" print(chunk.text, end=\"|\", flush=True)"
]
},
{
@@ -219,10 +238,10 @@
{
"data": {
"text/plain": [
"AIMessage(content='\"Programmieren ist meine Leidenschaft.\"\\n\\n(I translated \"programming\" to the German word \"Programmieren\", and added \"ist meine Leidenschaft\" which means \"is my passion\")', additional_kwargs={}, response_metadata={'model': 'llama3.1', 'created_at': '2025-06-25T18:43:29.350032Z', 'done': True, 'done_reason': 'stop', 'total_duration': 1194744459, 'load_duration': 26982500, 'prompt_eval_count': 30, 'prompt_eval_duration': 117043458, 'eval_count': 41, 'eval_duration': 1049892167, 'model_name': 'llama3.1'}, id='run--efc6436e-2346-43d9-8118-3c20b3cdf0d0-0', usage_metadata={'input_tokens': 30, 'output_tokens': 41, 'total_tokens': 71})"
"'Ich liebe Programmierung.'"
]
},
"execution_count": 7,
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
@@ -241,13 +260,15 @@
")\n",
"\n",
"chain = prompt | llm\n",
"chain.invoke(\n",
"result = chain.invoke(\n",
" {\n",
" \"input_language\": \"English\",\n",
" \"output_language\": \"German\",\n",
" \"input\": \"I love programming.\",\n",
" }\n",
")"
")\n",
"\n",
"result.text"
]
},
{
@@ -257,10 +278,10 @@
"source": [
"## Tool calling\n",
"\n",
"We can use [tool calling](/docs/concepts/tool_calling/) with an LLM [that has been fine-tuned for tool use](https://ollama.com/search?&c=tools) such as `llama3.1`:\n",
"We can use [tool calling](/docs/concepts/tool_calling/) with an LLM [that has been fine-tuned for tool use](https://ollama.com/search?&c=tools) such as `gpt-oss`:\n",
"\n",
"```\n",
"ollama pull llama3.1\n",
"ollama pull gpt-oss:20b\n",
"```\n",
"\n",
"Details on creating custom tools are available in [this guide](/docs/how_to/custom_tools/). Below, we demonstrate how to create a tool using the `@tool` decorator on a normal python function."
@@ -276,16 +297,16 @@
"name": "stdout",
"output_type": "stream",
"text": [
"[{'name': 'validate_user', 'args': {'addresses': ['123 Fake St, Boston, MA', '234 Pretend Boulevard, Houston, TX'], 'user_id': '123'}, 'id': 'aef33a32-a34b-4b37-b054-e0d85584772f', 'type': 'tool_call'}]\n"
"[{'type': 'tool_call', 'id': 'f365489e-1dc4-4d60-aaff-e56290ae4f99', 'name': 'validate_user', 'args': {'addresses': ['123 Fake St in Boston MA', '234 Pretend Boulevard in Houston TX'], 'user_id': 123}}]\n"
]
}
],
"source": [
"from typing import List\n",
"\n",
"from langchain_core.messages import AIMessage\n",
"from langchain_core.v1.messages import AIMessage\n",
"from langchain_core.tools import tool\n",
"from langchain_ollama import ChatOllama\n",
"from langchain_ollama.v1 import ChatOllama\n",
"\n",
"\n",
"@tool\n",
@@ -300,7 +321,8 @@
"\n",
"\n",
"llm = ChatOllama(\n",
" model=\"llama3.1\",\n",
" model=\"gpt-oss:20b\",\n",
" validate_model_on_init=True,\n",
" temperature=0,\n",
").bind_tools([validate_user])\n",
"\n",
@@ -314,6 +336,50 @@
" print(result.tool_calls)"
]
},
{
"cell_type": "markdown",
"id": "4321b6a8",
"metadata": {},
"source": [
"## Structured output"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "20f8ae70",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Name: Alice, Age: 28, Job: Software Engineer\n"
]
}
],
"source": [
"from langchain_ollama.v1 import ChatOllama\n",
"from pydantic import BaseModel, Field\n",
"\n",
"llm = ChatOllama(model=\"llama3.2\", validate_model_on_init=True, temperature=0)\n",
"\n",
"\n",
"class Person(BaseModel):\n",
" \"\"\"Information about a person.\"\"\"\n",
"\n",
" name: str = Field(description=\"The person's full name\")\n",
" age: int = Field(description=\"The person's age in years\")\n",
" occupation: str = Field(description=\"The person's job or profession\")\n",
"\n",
"\n",
"structured_llm = llm.with_structured_output(Person)\n",
"response: Person = structured_llm.invoke(\n",
" \"Tell me about a fictional software engineer named Alice who is 28 years old.\"\n",
")\n",
"print(f\"Name: {response.name}, Age: {response.age}, Job: {response.occupation}\")"
]
},
{
"cell_type": "markdown",
"id": "4c5e0197",
@@ -321,11 +387,9 @@
"source": [
"## Multi-modal\n",
"\n",
"Ollama has support for multi-modal LLMs, such as [bakllava](https://ollama.com/library/bakllava) and [llava](https://ollama.com/library/llava).\n",
"Ollama has limited support for multi-modal LLMs, such as [gemma3](https://ollama.com/library/gemma3).\n",
"\n",
" ollama pull bakllava\n",
"\n",
"Be sure to update Ollama so that you have the most recent version to support multi-modal."
"### Image input"
]
},
{
@@ -408,15 +472,15 @@
"name": "stdout",
"output_type": "stream",
"text": [
"90%\n"
"Based on the image, the dollar-based gross retention rate is **90%**.\n"
]
}
],
"source": [
"from langchain_core.messages import HumanMessage\n",
"from langchain_ollama import ChatOllama\n",
"from langchain_core.v1.messages import HumanMessage\n",
"from langchain_ollama.v1 import ChatOllama\n",
"\n",
"llm = ChatOllama(model=\"bakllava\", temperature=0)\n",
"llm = ChatOllama(model=\"gemma3:4b\", validate_model_on_init=True, temperature=0)\n",
"\n",
"\n",
"def prompt_func(data):\n",
@@ -424,8 +488,9 @@
" image = data[\"image\"]\n",
"\n",
" image_part = {\n",
" \"type\": \"image_url\",\n",
" \"image_url\": f\"data:image/jpeg;base64,{image}\",\n",
" \"type\": \"image\",\n",
" \"base64\": f\"data:image/jpeg;base64,{image}\",\n",
" \"mime_type\": \"image/jpeg\",\n",
" }\n",
"\n",
" content_parts = []\n",
@@ -435,7 +500,7 @@
" content_parts.append(image_part)\n",
" content_parts.append(text_part)\n",
"\n",
" return [HumanMessage(content=content_parts)]\n",
" return [HumanMessage(content_parts)]\n",
"\n",
"\n",
"from langchain_core.output_parsers import StrOutputParser\n",
@@ -454,11 +519,9 @@
"id": "fb6a331f-1507-411f-89e5-c4d598154f3c",
"metadata": {},
"source": [
"## Reasoning models and custom message roles\n",
"## Reasoning models\n",
"\n",
"Some models, such as IBM's [Granite 3.2](https://ollama.com/library/granite3.2), support custom message roles to enable thinking processes.\n",
"\n",
"To access Granite 3.2's thinking features, pass a message with a `\"control\"` role with content set to `\"thinking\"`. Because `\"control\"` is a non-standard message role, we can use a [ChatMessage](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.chat.ChatMessage.html) object to implement it:"
"Many models support outputting their reasoning process in addition to the final answer. This is useful for debugging and understanding how the model arrived at its conclusion. This train of thought reasoning is available in models such as `gpt-oss`, `qwen3:8b`, and `deepseek-r1`. To enable reasoning output, set the `reasoning` parameter to `True` either when instantiating the model or during invocation."
]
},
{
@@ -471,30 +534,25 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Here is my thought process:\n",
"The user is asking for the value of 3 raised to the power of 3, which is a basic exponentiation operation.\n",
"\n",
"Here is my response:\n",
"\n",
"3^3 (read as \"3 to the power of 3\") equals 27. \n",
"\n",
"This calculation is performed by multiplying 3 by itself three times: 3*3*3 = 27.\n"
"Response including reasoning: [{'type': 'reasoning', 'reasoning': \"Okay, so I need to figure out what 3^3 is. Let me start by recalling what exponents mean. From what I remember, when you have a number raised to a power, like a^b, it means you multiply the number by itself b times. So, for example, 2^3 would be 2 multiplied by itself three times: 2 × 2 × 2. Let me check if that's right. Yeah, I think that's correct. So applying that to 3^3, it should be 3 multiplied by itself three times.\\n\\nWait, let me make sure I'm not confusing the base and the exponent. The base is the number being multiplied, and the exponent is how many times it's multiplied. So in 3^3, the base is 3 and the exponent is 3. That means I need to multiply 3 by itself three times. Let me write that out step by step.\\n\\nFirst, multiply the first two 3s: 3 × 3. What's 3 times 3? That's 9. Okay, so the first multiplication gives me 9. Now, I need to multiply that result by the third 3. So 9 × 3. Let me calculate that. 9 times 3 is... 27. So putting it all together, 3 × 3 × 3 equals 27. \\n\\nWait, let me verify that again. Maybe I should do it in a different way to make sure I didn't make a mistake. Let's break it down. 3^3 is the same as 3 × 3 × 3. Let me compute 3 × 3 first, which is 9, and then multiply that by 3. 9 × 3 is indeed 27. Hmm, that seems right. \\n\\nAlternatively, I can think of exponents as repeated multiplication. So 3^1 is 3, 3^2 is 3 × 3 = 9, and 3^3 is 3 × 3 × 3 = 27. Yeah, that progression makes sense. Each time the exponent increases by 1, you multiply by the base again. So starting from 3^1 = 3, then 3^2 is 3 × 3 = 9, then 3^3 is 9 × 3 = 27. \\n\\nIs there another way to check this? Maybe using exponent rules. For example, if I know that 3^2 is 9, then multiplying by another 3 would give me 3^3. Since 9 × 3 is 27, that confirms it again. \\n\\nAlternatively, maybe I can use logarithms or something else, but that might be overcomplicating. Since exponents are straightforward multiplication, I think my initial calculation is correct. \\n\\nWait, just to be thorough, maybe I can use a calculator to verify. Let me imagine pressing 3, then the exponent key, then 3. If I do that, it should give me 27. Yeah, that's what I remember. So all methods point to 27. \\n\\nI think I've checked it multiple ways: breaking down the multiplication step by step, using the exponent progression, and even considering a calculator verification. All of them lead to the same answer. Therefore, I'm confident that 3^3 equals 27.\\n\"}, {'type': 'text', 'text': 'To determine the value of $3^3$, we start by understanding what an exponent represents. The expression $a^b$ means multiplying the base $a$ by itself $b$ times. \\n\\n### Step-by-Step Calculation:\\n1. **Identify the base and exponent**: \\n In $3^3$, the base is **3**, and the exponent is **3**. This means we multiply 3 by itself three times.\\n\\n2. **Perform the multiplication**: \\n - First, multiply the first two 3s: \\n $3 \\\\times 3 = 9$ \\n - Next, multiply the result by the third 3: \\n $9 \\\\times 3 = 27$\\n\\n3. **Verify the result**: \\n - $3^1 = 3$ \\n - $3^2 = 3 \\\\times 3 = 9$ \\n - $3^3 = 3 \\\\times 3 \\\\times 3 = 27$ \\n This progression confirms the calculation.\\n\\n### Final Answer:\\n$$\\n3^3 = \\\\boxed{27}\\n$$'}]\n",
"Response without reasoning: [{'type': 'text', 'text': \"Sure! Let's break down what **3³** means and how to calculate it step by step.\\n\\n---\\n\\n### Step 1: Understand the notation\\nThe expression **3³** means **3 multiplied by itself three times**. The small number (3) is called the **exponent**, and it tells us how many times the base number (3) is used as a factor.\\n\\nSo:\\n$$\\n3^3 = 3 \\\\times 3 \\\\times 3\\n$$\\n\\n---\\n\\n### Step 2: Perform the multiplication step by step\\n\\n1. Multiply the first two 3s:\\n $$\\n 3 \\\\times 3 = 9\\n $$\\n\\n2. Now multiply the result by the third 3:\\n $$\\n 9 \\\\times 3 = 27\\n $$\\n\\n---\\n\\n### Step 3: Final Answer\\n\\n$$\\n3^3 = 27\\n$$\\n\\n---\\n\\n### Summary\\n- **3³** means **3 × 3 × 3**\\n- **3 × 3 = 9**\\n- **9 × 3 = 27**\\n- So, **3³ = 27**\\n\\nLet me know if you'd like to explore exponents further!\"}]\n"
]
}
],
"source": [
"from langchain_core.messages import ChatMessage, HumanMessage\n",
"from langchain_ollama import ChatOllama\n",
"from langchain_ollama.v1 import ChatOllama\n",
"\n",
"llm = ChatOllama(model=\"granite3.2:8b\")\n",
"# All outputs from `llm` will include reasoning unless overridden during invocation\n",
"llm = ChatOllama(model=\"qwen3:8b\", validate_model_on_init=True, reasoning=True)\n",
"\n",
"messages = [\n",
" ChatMessage(role=\"control\", content=\"thinking\"),\n",
" HumanMessage(\"What is 3^3?\"),\n",
"]\n",
"response_a = llm.invoke(\"What is 3^3? Explain your reasoning step by step.\")\n",
"print(f\"Response including reasoning: {response_a.content}\")\n",
"\n",
"response = llm.invoke(messages)\n",
"print(response.content)"
"# Test override; note no ReasoningContentBlock in the response\n",
"response_b = llm.invoke(\n",
" \"What is 3^3? Explain your reasoning step by step.\", reasoning=False\n",
")\n",
"print(f\"Response without reasoning: {response_b.content}\")"
]
},
{
@@ -502,7 +560,7 @@
"id": "6271d032-da40-44d4-9b52-58370e164be3",
"metadata": {},
"source": [
"Note that the model exposes its thought process in addition to its final response."
"Note that the model exposes its thought process as a `ReasoningContentBlock` addition to its final response."
]
},
{

View File

@@ -1,14 +1,16 @@
# Ollama
>[Ollama](https://ollama.com/) allows you to run open-source large language models,
> such as [Llama3.1](https://ai.meta.com/blog/meta-llama-3-1/), locally.
> such as [gpt-oss](https://ollama.com/library/gpt-oss), locally.
>
>`Ollama` bundles model weights, configuration, and data into a single package, defined by a Modelfile.
>It optimizes setup and configuration details, including GPU usage.
>For a complete list of supported models and model variants, see the [Ollama model library](https://ollama.ai/library).
>The `ollama` [package](https://pypi.org/project/ollama/0.5.3/) bundles model weights,
> configuration, and data into a single package, defined by a Modelfile. It optimizes
> setup and configuration details, including GPU usage.
>For a complete list of supported models and model variants, see the
> [Ollama model library](https://ollama.com/search).
See [this guide](/docs/how_to/local_llms) for more details
on how to use `Ollama` with LangChain.
See [this guide](/docs/how_to/local_llms/#ollama) for more details
on how to use `ollama` with LangChain.
## Installation and Setup
### Ollama installation
@@ -23,10 +25,10 @@ Ollama will start as a background service automatically, if this is disabled, ru
ollama serve
```
After starting ollama, run `ollama pull <name-of-model>` to download a model from the [Ollama model library](https://ollama.ai/library):
After starting ollama, run `ollama pull <name-of-model>` to download a model from the [Ollama model library](https://ollama.com/library):
```bash
ollama pull llama3.1
ollama pull gpt-oss:20b
```
- This will download the default tagged version of the model. Typically, the default points to the latest, smallest-sized parameter model.
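Once a model is pulled and the Ollama service is running, it can be called from LangChain. A minimal sketch using the `langchain-ollama` package, assuming the default local server on port 11434:

```python
from langchain_ollama import ChatOllama

# Assumes `ollama serve` is running and `gpt-oss:20b` has been pulled locally.
llm = ChatOllama(model="gpt-oss:20b", temperature=0)

response = llm.invoke("The first man on the moon was ...")
print(response.content)
```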

View File

@@ -591,7 +591,7 @@
},
{
"cell_type": "code",
"execution_count": 36,
"execution_count": null,
"metadata": {
"azdata_cell_guid": "d9127900-0942-48f1-bd4d-081c7fa3fcae",
"language": "python"
@@ -606,7 +606,7 @@
}
],
"source": [
"from langchain.document_loaders import AzureBlobStorageFileLoader\n",
"from langchain_community.document_loaders import AzureBlobStorageFileLoader\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_core.documents import Document\n",
"\n",

View File

@@ -0,0 +1,107 @@
---
sidebar_position: 3
---
# How to update your code
*Last updated: 08.08.25*
If you maintain custom callbacks or output parsers, type checkers may raise errors if
they do not accept the new message types as inputs. This guide describes how to
address those issues.
If you do not maintain custom callbacks or output parsers, there are no breaking
changes. See our guide on the [new message types](/docs/versions/v0_4/messages) to learn
about new features introduced in v0.4.
## Custom callbacks
[BaseCallbackHandler](https://python.langchain.com/api_reference/core/callbacks/langchain_core.callbacks.base.BaseCallbackHandler.html)
now includes an attribute `accepts_new_messages` that defaults to False. When this
attribute is False, the callback system in langchain-core will automatically convert
new message types to old, so there should be no runtime errors. You can update callback
signatures as below to fix type-checking errors:
```python
from typing import Any, Optional, Union
from uuid import UUID

from langchain_core.messages import BaseMessage
from langchain_core.outputs import ChatGenerationChunk, GenerationChunk, LLMResult
from langchain_core.v1.messages import AIMessage, AIMessageChunk, MessageV1


def on_chat_model_start(
    self,
    serialized: dict[str, Any],
    # highlight-next-line
    messages: Union[list[list[BaseMessage]], list[MessageV1]],
    *,
    run_id: UUID,
    parent_run_id: Optional[UUID] = None,
    tags: Optional[list[str]] = None,
    metadata: Optional[dict[str, Any]] = None,
    **kwargs: Any,
) -> Any:
    ...


def on_llm_new_token(
    self,
    token: str,
    *,
    chunk: Optional[
        # highlight-next-line
        Union[GenerationChunk, ChatGenerationChunk, AIMessageChunk]
    ] = None,
    run_id: UUID,
    parent_run_id: Optional[UUID] = None,
    **kwargs: Any,
) -> Any:
    ...


def on_llm_end(
    self,
    # highlight-next-line
    response: Union[LLMResult, AIMessage],
    *,
    run_id: UUID,
    parent_run_id: Optional[UUID] = None,
    **kwargs: Any,
) -> Any:
    ...
```
You can also safely type-ignore mypy `override` errors here unless you switch
`accepts_new_messages` to True.
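If you want your handler to receive the new message objects directly, it can opt in. A minimal sketch, assuming `accepts_new_messages` can simply be overridden on the subclass:

```python
from typing import Any, Optional, Union
from uuid import UUID

from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.outputs import LLMResult
from langchain_core.v1.messages import AIMessage


class PrintingHandler(BaseCallbackHandler):
    # Opt in: receive v1 messages instead of having them converted to the old types.
    accepts_new_messages = True

    def on_llm_end(
        self,
        response: Union[LLMResult, AIMessage],
        *,
        run_id: UUID,
        parent_run_id: Optional[UUID] = None,
        **kwargs: Any,
    ) -> Any:
        if isinstance(response, AIMessage):
            print(response.text)
```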
## Custom output parsers
All output parsers in `langchain-core` have been updated to accept the new message
types.
If you maintain a custom output parser, `langchain-core` exposes a
`convert_from_v1_message` function so that your parser can easily operate on the new
message types:
```python
from langchain_core.messages.utils import convert_from_v1_message
from langchain_core.v1.messages import AIMessage
def parse_result(
    self,
    # highlight-next-line
    result: Union[list[Generation], AIMessage],
    *,
    partial: bool = False,
) -> Union[list[AgentAction], AgentFinish]:
    # highlight-start
    if isinstance(result, AIMessage):
        result = [ChatGeneration(message=convert_from_v1_message(result))]
    # highlight-end
    ...


def _transform(
    # highlight-next-line
    self, input: Iterator[Union[str, BaseMessage, AIMessage]]
) -> Iterator[AddableDict]:
    for chunk in input:
        # highlight-start
        if isinstance(chunk, AIMessage):
            chunk = convert_from_v1_message(chunk)
        # highlight-end
        ...
```
This will allow your parser to work as before. You can also update the parser to
natively handle the new message types to save this conversion step. See our guide on
the [new message types](/docs/versions/v0_4/messages) for details.
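For example, a parser that produces plain text could handle the v1 message directly and skip the conversion step. A minimal sketch, assuming the surrounding parser returns a string:

```python
from typing import Union

from langchain_core.outputs import Generation
from langchain_core.v1.messages import AIMessage


def parse_result(
    self,
    result: Union[list[Generation], AIMessage],
    *,
    partial: bool = False,
) -> str:
    # Handle the new message type natively instead of converting back to v0.
    if isinstance(result, AIMessage):
        return result.text
    return result[0].text
```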

View File

@@ -0,0 +1,81 @@
---
sidebar_position: 1
---
# LangChain v0.4
*Last updated: 08.08.25*
## What's changed
LangChain v0.4 allows developers to opt in to new message types that will become the default
in LangChain v1.0. LangChain v1.0 will be released this fall. These messages provide
fully typed, provider-agnostic content, introducing standard content blocks for
reasoning, citations, server-side tool calls, and other LLM features. They also offer
performance benefits over existing message classes.
New message types have been added to a `v1` namespace in `langchain-core`. Select
integration packages now also expose a `v1` namespace containing chat models that
work with the new message types.
Input types for callbacks and output parsers have been widened to accept the new message
types. If you maintain custom callbacks or output parsers, type checkers may raise
errors if they do not accept the new message types as inputs. Refer to
[this guide](/docs/versions/v0_4/how_to_update) for how to address those issues. This
is the only breaking change.
## What's new
You can access the new chat models through [init_chat_model](/docs/how_to/chat_models_universal_init/) by setting `message_version="v1"`:
```python
from langchain.chat_models import init_chat_model
llm = init_chat_model("openai:gpt-5", message_version="v1")
input_message = {"role": "user", "content": "Hello, world!"}
llm.invoke([input_message])
```
You can also access the `v1` namespaces directly:
```python
from langchain_core.v1.messages import HumanMessage
from langchain_openai.v1 import ChatOpenAI

llm = ChatOpenAI(model="gpt-5")

input_message = HumanMessage("Hello, world!")
llm.invoke([input_message])
```
:::info New message details
See our guide on the [new message types](/docs/versions/v0_4/messages) for details.
:::
## How to update your code
If you maintain custom callbacks or output parsers, type checkers may raise errors if
they do not accept the new message types as inputs. Refer to
[this guide](/docs/versions/v0_4/how_to_update) for how to address those issues.
If you do not maintain custom callbacks or output parsers, there are no breaking
changes. See our guide on the [new message types](/docs/versions/v0_4/messages) to learn
about new features introduced in v0.4.
### Base packages
| Package | Latest | Recommended constraint |
|--------------------------|--------|------------------------|
| langchain | 0.4.0 | >=0.4,&lt;1.0 |
| langchain-community | 0.4.0 | >=0.4,&lt;1.0 |
| langchain-text-splitters | 0.4.0 | >=0.4,&lt;1.0 |
| langchain-core | 0.4.0 | >=0.4,&lt;1.0 |
| langchain-experimental | 0.4.0 | >=0.4,&lt;1.0 |
### Integration packages
...

View File

@@ -0,0 +1,429 @@
---
sidebar_position: 2
---
# LangChain v1.0 message types
*Last updated: 08.08.25*
LangChain v0.4 allows developers to opt in to new message types that will become the default
in LangChain v1.0. LangChain v1.0 will be released this fall.
These messages should be considered a beta feature and are subject to change in
LangChain v1.0, although we do not anticipate any significant changes.
## Benefits
The new message types offer improvements in performance, type-safety, and consistency
across OpenAI, Anthropic, Gemini, and other providers.
### Performance
Importantly, the new messages are Python dataclasses, saving some runtime from
instantiating (layers of) Pydantic BaseModels.
LangChain v0.4 introduces a new `BaseChatModel` class in `langchain_core.v1.chat_models`
that is faster and leaner than the existing `BaseChatModel` class, offering significant
reductions in overhead on top of provider SDKs.
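As a rough illustration of the dataclass vs. Pydantic difference, a hypothetical micro-benchmark sketch (exact timings will vary by machine and version):

```python
from timeit import timeit

from langchain_core.messages import HumanMessage as HumanMessageV0
from langchain_core.v1.messages import HumanMessage as HumanMessageV1

# Compare raw construction cost of the old Pydantic message vs. the new dataclass message.
v0_time = timeit(lambda: HumanMessageV0("Hello, world!"), number=100_000)
v1_time = timeit(lambda: HumanMessageV1("Hello, world!"), number=100_000)
print(f"v0 (Pydantic): {v0_time:.3f}s, v1 (dataclass): {v1_time:.3f}s")
```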
### Type-safety
Message content is typed as
```python
import langchain_core.messages.content_blocks as types
content: list[types.ContentBlock]
```
where we have introduced standard types for text, reasoning, citations, and server-side
tool executions (e.g., web search and code interpreters). These include
[tool calls](https://python.langchain.com/docs/concepts/tool_calling/) and the
[multi-modal types](/docs/how_to/multimodal_inputs/) introduced in earlier versions
of LangChain. There are no breaking changes associated with the existing content types.
**This is the most significant change from the existing message classes**, which permit
strings, lists of strings, or lists of untyped dicts as content. We have added a
`.text` getter so that developers can easily recover string content. Consequently, we
have deprecated `.text()` (as a method) in favor of the new property.
`.tool_calls` is likewise now a getter with an associated setter rather than a plain attribute,
so usage is largely the same. See the [usage comparison](#usage-comparison), below,
for details.
### Consistency
Many chat models can generate a variety of content in a single conversational turn,
including reasoning, tool calls and responses, images, text with citations, and other
structured objects. We have standardized these types, resulting in improved
inter-operability of messages across models.
## Usage comparison
| Task | Previous | New |
|-------------------------|----------------------------------------|------------------------------------------------------------------|
| Get text content (str) | `message.content` or `message.text()` | `message.text` |
| Get content blocks | `message.content` | `message.content` |
| Get `additional_kwargs` | `message.additional_kwargs` | `[block for block in message.content if block["type"] == "..."]` |
Getting `response_metadata` and `tool_calls` has not changed.
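A short sketch of the access patterns from the table above (the model name is illustrative):

```python
from langchain.chat_models import init_chat_model

llm = init_chat_model("openai:gpt-5-mini", message_version="v1")
message = llm.invoke("Hello, world!")

# Text content is now a property (previously message.content or message.text())
print(message.text)

# Content is a list of typed blocks
text_blocks = [block for block in message.content if block["type"] == "text"]

# response_metadata and tool_calls work as before
print(message.response_metadata)
print(message.tool_calls)
```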
### Changes in content blocks
For providers that generate `list[dict]` content, the dict elements have changed to
conform to the new content block types. Refer to the
[API reference](https://python.langchain.com/api_reference/core/messages.html) for
details. Below we show some examples.
Importantly:
- Where provider-specific fields map to fields on standard types, LangChain manages
the translation.
- Where provider-specific fields do not map to fields on standard types, LangChain
stores them in an `"extras"` key (see below for examples).
<details>
<summary>Citations and web search</summary>
<div className="row">
<div className="col col--6" style={{minWidth: 0}}>
**Old content**
```python
from langchain.chat_models import init_chat_model
llm = init_chat_model("openai:gpt-5-mini", output_version="responses/v1")
llm_with_tools = llm.bind_tools([{"type": "web_search_preview"}])
response = llm_with_tools.invoke("What was a positive news story from today?")
response.content
```
```
[
{
"type": "reasoning",
"id": "rs_abc123",
"summary": []
},
{
"type": "web_search_call",
"id": "ws_abc123",
"action": {
"query": "positive news today August 8 2025 'good news' 'Aug 8 2025' 'today' ",
"type": "search"
},
"status": "completed"
},
{
"type": "text",
"text": "Here are two positive news items from today...",
"annotations": [
{
"type": "url_citation",
"end_index": 455,
"start_index": 196,
"title": "Document title",
"url": "<document url>"
},
{
"type": "url_citation",
"end_index": 1022,
"start_index": 707,
"title": "Another Document",
"url": "<another document url>"
},
],
"id": "msg_abc123"
}
]
```
</div>
<div className="col col--6" style={{minWidth: 0}}>
**New content**
```python
from langchain.chat_models import init_chat_model
llm = init_chat_model("openai:gpt-5-mini", message_version="v1")
llm_with_tools = llm.bind_tools([{"type": "web_search_preview"}])
response = llm_with_tools.invoke("What was a positive news story from today?")
response.content
```
```
[
{
"type": "reasoning",
"id": "rs_abc123"
},
{
"type": "web_search_call",
"id": "ws_abc123",
"query": "positive news August 8 2025 'good news' 'today' ",
"extras": {
"action": {"type": "search"},
"status": "completed",
}
},
{
"type": "web_search_result",
"id": "ws_abc123"
},
{
"type": "text",
"text": "Here are two positive news items from today...",
"annotations": [
{
"type": "citation",
"end_index": 455,
"start_index": 196,
"title": "Document title",
"url": "<document url>"
},
{
"type": "citation",
"end_index": 1022,
"start_index": 707,
"title": "Another Document",
"url": "<another document url>"
}
],
"id": "msg_abc123"
}
]
```
</div>
</div>
</details>
<details>
<summary>Reasoning</summary>
<div className="row">
<div className="col col--6" style={{minWidth: 0}}>
**Old content**
```python
from langchain.chat_models import init_chat_model
llm = init_chat_model(
"openai:gpt-5",
reasoning={"effort": "medium", "summary": "auto"},
output_version="responses/v1",
)
response = llm.invoke(
"What was the third tallest building in the world in the year 2000?"
)
response.content
```
```
[
{
"type": "reasoning",
"id": "rs_abc123",
"summary": [
{
"text": "The user is asking about...",
"type": "summary_text"
},
{
"text": "We should consider...",
"type": "summary_text"
}
]
},
{
"type": "text",
"text": "In the year 2000 the third-tallest building in the world was...",
"id": "msg_abc123"
}
]
```
</div>
<div className="col col--6" style={{minWidth: 0}}>
**New content**
```python
from langchain.chat_models import init_chat_model
llm = init_chat_model(
"openai:gpt-5",
reasoning={"effort": "medium", "summary": "auto"},
message_version="v1",
)
response = llm.invoke(
"What was the third tallest building in the world in the year 2000?"
)
response.content
```
```
[
{
"type": "reasoning",
"reasoning": "The user is asking about...",
"id": "rs_abc123"
},
{
"type": "reasoning",
"reasoning": "We should consider...",
"id": "rs_abc123"
},
{
"type": "text",
"text": "In the year 2000 the third-tallest building in the world was...",
"id": "msg_abc123"
}
]
```
</div>
</div>
</details>
<details>
<summary>Non-standard blocks</summary>
Where content blocks from specific providers do not map to a standard type, they are
structured into a `"non_standard"` block:
```python
{
"type": "non_standard",
"value": original_block,
}
```
<div className="row">
<div className="col col--6" style={{minWidth: 0}}>
**Old content**
```python
from langchain.chat_models import init_chat_model
llm = init_chat_model("openai:gpt-5-mini", output_version="responses/v1")
llm_with_tools = llm.bind_tools(
[
{
"type": "file_search",
"vector_store_ids": ["vs_67d0baa0544c8191be194a85e19cbf92"],
}
]
)
response = llm_with_tools.invoke("What is deep research by OpenAI?")
response.content
```
```
[
{
"type": "reasoning",
"id": "rs_abc123",
"summary": []
},
{
"type": "file_search_call",
"id": "fs_abc123",
"queries": [
"What is deep research by OpenAI?",
"deep research OpenAI definition"
],
"status": "completed"
},
{
"type": "reasoning",
"id": "rs_def456",
"summary": []
},
{
"type": "text",
"text": "Deep research is...",
"annotations": [
{
"type": "file_citation",
"file_id": "file-abc123",
"filename": "sample_file.pdf",
"index": 305
},
{
"type": "file_citation",
"file_id": "file-abc123",
"filename": "sample_file.pdf",
"index": 675
},
],
"id": "msg_abc123"
}
]
```
</div>
<div className="col col--6" style={{minWidth: 0}}>
**New content**
```python
from langchain.chat_models import init_chat_model
llm = init_chat_model("openai:gpt-5-mini", message_version="v1")
llm_with_tools = llm.bind_tools(
[
{
"type": "file_search",
"vector_store_ids": ["vs_67d0baa0544c8191be194a85e19cbf92"],
}
]
)
response = llm_with_tools.invoke("What is deep research by OpenAI?")
response.content
```
```
[
{
"type": "reasoning",
"id": "rs_abc123",
"summary": []
},
{
"type": "non_standard",
"value": {
"type": "file_search_call",
"id": "fs_abc123",
"queries": [
"What is deep research by OpenAI?",
"deep research OpenAI definition"
],
"status": "completed"
}
},
{
"type": "reasoning",
"id": "rs_def456",
"summary": []
},
{
"type": "text",
"text": "Deep research is...",
"annotations": [
{
"type": "citation",
"title": "sample_file.pdf",
"extras": {
"file_id": "file-abc123",
"index": 305
}
},
{
"type": "citation",
"title": "sample_file.pdf",
"extras": {
"file_id": "file-abc123",
"index": 675
}
},
],
"id": "msg_abc123"
}
]
```
</div>
</div>
</details>
## Feature gaps
The new message types do not yet support LangChain's caching layer. Support will be
added in the coming weeks.

View File

@@ -224,13 +224,17 @@ const config = {
},
{
type: "dropdown",
label: "v0.3",
label: "v0.4",
position: "right",
items: [
{
label: "v0.3",
label: "v0.4",
href: "/docs/introduction",
},
{
label: "v0.3",
href: "https://python.langchain.com/v0.3/docs/introduction/",
},
{
label: "v0.2",
href: "https://python.langchain.com/v0.2/docs/introduction",

View File

@@ -82,6 +82,14 @@ module.exports = {
collapsed: false,
collapsible: false,
items: [
{
type: "category",
label: "v0.4",
items: [{
type: 'autogenerated',
dirName: 'versions/v0_4',
}],
},
{
type: 'doc',
id: 'versions/v0_3/index',
@@ -418,7 +426,7 @@ module.exports = {
},
],
},
],
link: {
type: "generated-index",

View File

@@ -23,11 +23,19 @@
{
"source": "/v0.2/:path(.*/?)*",
"destination": "https://langchain-v02.vercel.app/v0.2/:path*"
},
{
"source": "/v0.3",
"destination": "https://langchain-v03.vercel.app/v0.3"
},
{
"source": "/v0.3/:path(.*/?)*",
"destination": "https://langchain-v03.vercel.app/v0.3/:path*"
}
],
"redirects": [
{
"source": "/v0.3/docs/:path(.*/?)*",
"source": "/v0.4/docs/:path(.*/?)*",
"destination": "/docs/:path*"
},
{

View File

@@ -736,7 +736,7 @@ class SystemMessage:
custom_role: Optional[str] = None
"""If provided, a custom role for the system message.
Example: ``'developer'``.
Example: ``'developer'``, ``'control'``.
Integration packages may use this field to assign the system message role if it
contains a recognized value.

View File

@@ -191,6 +191,7 @@ class ChatOllama(BaseChatModel):
llm = ChatOllama(
model = "llama3",
validate_model_on_init = True,
temperature = 0.8,
num_predict = 256,
# other params ...
@@ -199,13 +200,7 @@ class ChatOllama(BaseChatModel):
Invoke:
.. code-block:: python
from langchain_core.v1.messages import HumanMessage
from langchain_core.messages.content_blocks import TextContentBlock
messages = [
HumanMessage("Hello!")
]
llm.invoke(messages)
llm.invoke("Hello!")
.. code-block:: python
@@ -214,13 +209,7 @@ class ChatOllama(BaseChatModel):
Stream:
.. code-block:: python
from langchain_core.v1.messages import HumanMessage
from langchain_core.messages.content_blocks import TextContentBlock
messages = [
HumanMessage(Return the words Hello World!")
]
for chunk in llm.stream(messages):
for chunk in llm.stream("Return the words Hello World!"):
print(chunk.content, end="")
.. code-block:: python
@@ -232,7 +221,7 @@ class ChatOllama(BaseChatModel):
Multi-modal input:
.. code-block:: python
from langchain_core.messages.content_blocks import ImageContentBlock
from langchain_core.messages.content_blocks import TextContentBlock, ImageContentBlock
response = llm.invoke([
HumanMessage(content=[
@@ -254,9 +243,7 @@ class ChatOllama(BaseChatModel):
b: int = Field(..., description="Second integer")
llm_with_tools = llm.bind_tools([Multiply])
ans = llm_with_tools.invoke([
HumanMessage("What is 45*67")
])
ans = llm_with_tools.invoke("What is 45*67")
ans.tool_calls
.. code-block:: python
@@ -278,7 +265,7 @@ class ChatOllama(BaseChatModel):
streaming: bool = False
"""Whether to use streaming for invocation.
If True, invoke will use streaming internally.
If True, ``invoke`` will use streaming internally.
"""