Compare commits


16 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Mason Daugherty | 91e825b92c | Merge branch 'wip-v0.4' into cc/0.4/docs | 2025-08-11 15:11:33 -04:00 |
| Mason Daugherty | 3c5cc349b6 | docs: v0.4 top level refinements (#32474) | 2025-08-11 09:15:59 -04:00 |
| Mason Daugherty | 5cfb7ce57b | docs(ollama): update Ollama integration documentation for new chat model (#32475) | 2025-08-11 09:13:54 -04:00 |
| Mason Daugherty | 978119ef3c | Merge branch 'wip-v0.4' into cc/0.4/docs | 2025-08-08 17:28:54 -04:00 |
| Chester Curme | dd68b762d9 | headers -> details | 2025-08-08 14:52:25 -04:00 |
| Chester Curme | c784f63701 | details -> columns | 2025-08-08 14:50:51 -04:00 |
| Chester Curme | aed20287af | x | 2025-08-08 14:29:20 -04:00 |
| Chester Curme | 5ada33b3e6 | x | 2025-08-08 14:06:44 -04:00 |
| Chester Curme | a1c79711b3 | update | 2025-08-08 12:50:36 -04:00 |
| Chester Curme | 1dc22c602e | update | 2025-08-08 11:00:03 -04:00 |
| Chester Curme | 18732e5b8b | fix sidebar | 2025-08-08 10:57:39 -04:00 |
| Chester Curme | 8f19ca30b0 | update migration guides | 2025-08-08 10:15:16 -04:00 |
| Chester Curme | a369b3aed5 | update sidebar label | 2025-08-06 16:43:18 -04:00 |
| Chester Curme | 5eec2207c0 | update docusaurus config | 2025-08-06 16:27:25 -04:00 |
| Chester Curme | 9b468a10a5 | update vercel.json | 2025-08-06 16:11:17 -04:00 |
| Chester Curme | b7494d6566 | x | 2025-08-06 15:53:06 -04:00 |
12 changed files with 815 additions and 174 deletions

View File

@@ -45,8 +45,8 @@
"A few frameworks for this have emerged to support inference of open-source LLMs on various devices:\n",
"\n",
"1. [`llama.cpp`](https://github.com/ggerganov/llama.cpp): C++ implementation of llama inference code with [weight optimization / quantization](https://finbarr.ca/how-is-llama-cpp-possible/)\n",
"2. [`gpt4all`](https://docs.gpt4all.io/index.html): Optimized C backend for inference\n",
"3. [`Ollama`](https://ollama.ai/): Bundles model weights and environment into an app that runs on device and serves the LLM\n",
"2. [`gpt4all`](https://github.com/nomic-ai/gpt4all): Optimized C backend for inference\n",
"3. [`ollama`](https://github.com/ollama/ollama): Bundles model weights and environment into an app that runs on device and serves the LLM\n",
"4. [`llamafile`](https://github.com/Mozilla-Ocho/llamafile): Bundles model weights and everything needed to run the model in a single file, allowing you to run the LLM locally from this file without any additional installation steps\n",
"\n",
"In general, these frameworks will do a few things:\n",
@@ -74,12 +74,12 @@
"\n",
"## Quickstart\n",
"\n",
"[`Ollama`](https://ollama.ai/) is one way to easily run inference on macOS.\n",
"[Ollama](https://ollama.ai/) is one way to easily run inference on macOS.\n",
" \n",
"The instructions [here](https://github.com/jmorganca/ollama?tab=readme-ov-file#ollama) provide details, which we summarize:\n",
"The instructions [here](https://github.com/ollama/ollama?tab=readme-ov-file#ollama) provide details, which we summarize:\n",
" \n",
"* [Download and run](https://ollama.ai/download) the app\n",
"* From command line, fetch a model from this [list of options](https://github.com/jmorganca/ollama): e.g., `ollama pull llama3.1:8b`\n",
"* From command line, fetch a model from this [list of options](https://ollama.com/search): e.g., `ollama pull llama3.1:8b`\n",
"* When the app is running, all models are automatically served on `localhost:11434`\n"
]
},
@@ -111,11 +111,11 @@
}
],
"source": [
"from langchain_ollama import OllamaLLM\n",
"from langchain_ollama import ChatOllama\n",
"\n",
"llm = OllamaLLM(model=\"llama3.1:8b\")\n",
"llm = ChatOllama(model=\"gpt-oss:20b\")\n",
"\n",
"llm.invoke(\"The first man on the moon was ...\")"
"llm.invoke(\"The first man on the moon was ...\").content"
]
},
{
@@ -149,40 +149,7 @@
],
"source": [
"for chunk in llm.stream(\"The first man on the moon was ...\"):\n",
" print(chunk, end=\"|\", flush=True)"
]
},
{
"cell_type": "markdown",
"id": "e5731060",
"metadata": {},
"source": [
"Ollama also includes a chat model wrapper that handles formatting conversation turns:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "f14a778a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='The answer is a historic one!\\n\\nThe first man to walk on the Moon was Neil Armstrong, an American astronaut and commander of the Apollo 11 mission. On July 20, 1969, Armstrong stepped out of the lunar module Eagle onto the surface of the Moon, famously declaring:\\n\\n\"That\\'s one small step for man, one giant leap for mankind.\"\\n\\nArmstrong was followed by fellow astronaut Edwin \"Buzz\" Aldrin, who also walked on the Moon during the mission. Michael Collins remained in orbit around the Moon in the command module Columbia.\\n\\nNeil Armstrong passed away on August 25, 2012, but his legacy as a pioneering astronaut and engineer continues to inspire people around the world!', response_metadata={'model': 'llama3.1:8b', 'created_at': '2024-08-01T00:38:29.176717Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 10681861417, 'load_duration': 34270292, 'prompt_eval_count': 19, 'prompt_eval_duration': 6209448000, 'eval_count': 141, 'eval_duration': 4432022000}, id='run-7bed57c5-7f54-4092-912c-ae49073dcd48-0', usage_metadata={'input_tokens': 19, 'output_tokens': 141, 'total_tokens': 160})"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_ollama import ChatOllama\n",
"\n",
"chat_model = ChatOllama(model=\"llama3.1:8b\")\n",
"\n",
"chat_model.invoke(\"Who was the first man on the moon?\")"
" print(chunk.text(), end=\"|\", flush=True)"
]
},
{
@@ -200,7 +167,7 @@
"\n",
"### Running Apple silicon GPU\n",
"\n",
"`Ollama` and [`llamafile`](https://github.com/Mozilla-Ocho/llamafile?tab=readme-ov-file#gpu-support) will automatically utilize the GPU on Apple devices.\n",
"`ollama` and [`llamafile`](https://github.com/Mozilla-Ocho/llamafile?tab=readme-ov-file#gpu-support) will automatically utilize the GPU on Apple devices.\n",
" \n",
"Other frameworks require the user to set up the environment to utilize the Apple GPU.\n",
"\n",
@@ -212,15 +179,15 @@
"\n",
"In particular, ensure that conda is using the correct virtual environment that you created (`miniforge3`).\n",
"\n",
"E.g., for me:\n",
"e.g.,\n",
"\n",
"```\n",
"```shell\n",
"conda activate /Users/rlm/miniforge3/envs/llama\n",
"```\n",
"\n",
"With the above confirmed, then:\n",
"\n",
"```\n",
"```shell\n",
"CMAKE_ARGS=\"-DLLAMA_METAL=on\" FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir\n",
"```"
]
@@ -234,17 +201,13 @@
"\n",
"There are various ways to gain access to quantized model weights.\n",
"\n",
"1. [`HuggingFace`](https://huggingface.co/TheBloke) - Many quantized model are available for download and can be run with framework such as [`llama.cpp`](https://github.com/ggerganov/llama.cpp). You can also download models in [`llamafile` format](https://huggingface.co/models?other=llamafile) from HuggingFace.\n",
"2. [`gpt4all`](https://gpt4all.io/index.html) - The model explorer offers a leaderboard of metrics and associated quantized models available for download \n",
"3. [`Ollama`](https://github.com/jmorganca/ollama) - Several models can be accessed directly via `pull`\n",
"1. [HuggingFace](https://huggingface.co/TheBloke) - Many quantized model are available for download and can be run with framework such as [`llama.cpp`](https://github.com/ggerganov/llama.cpp). You can also download models in [`llamafile` format](https://huggingface.co/models?other=llamafile) from HuggingFace.\n",
"2. [gpt4all](https://gpt4all.io/index.html) - The model explorer offers a leaderboard of metrics and associated quantized models available for download \n",
"3. [ollama](https://github.com/ollama/ollama) - Several models can be accessed directly via `pull`\n",
"\n",
"### Ollama\n",
"\n",
"With [Ollama](https://github.com/jmorganca/ollama), fetch a model via `ollama pull <model family>:<tag>`:\n",
"\n",
"* E.g., for Llama 2 7b: `ollama pull llama2` will download the most basic version of the model (e.g., smallest # parameters and 4 bit quantization)\n",
"* We can also specify a particular version from the [model list](https://github.com/jmorganca/ollama?tab=readme-ov-file#model-library), e.g., `ollama pull llama2:13b`\n",
"* See the full set of parameters on the [API reference page](https://python.langchain.com/api_reference/community/llms/langchain_community.llms.ollama.Ollama.html)"
"With [Ollama](https://github.com/ollama/ollama), fetch a model via `ollama pull <model family>:<tag>`:"
]
},
{
@@ -265,7 +228,7 @@
}
],
"source": [
"llm = OllamaLLM(model=\"llama2:13b\")\n",
"llm = ChatOllama(model=\"gpt-oss:20b\")\n",
"llm.invoke(\"The first man on the moon was ... think step by step\")"
]
},
@@ -684,12 +647,6 @@
"\n",
"In addition, [here](https://blog.langchain.dev/using-langsmith-to-support-fine-tuning-of-open-source-llms/) is an overview on fine-tuning, which can utilize open-source LLMs."
]
},
{
"cell_type": "markdown",
"id": "14c2c170",
"metadata": {},
"source": []
}
],
"metadata": {

View File

@@ -17,25 +17,29 @@
"source": [
"# ChatOllama\n",
"\n",
"[Ollama](https://ollama.ai/) allows you to run open-source large language models, such as Llama 2, locally.\n",
"[Ollama](https://ollama.com/) allows you to run open-source large language models, such as `gpt-oss`, locally.\n",
"\n",
"Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.\n",
"`ollama` bundles model weights, configuration, and data into a single package, defined by a Modelfile.\n",
"\n",
"It optimizes setup and configuration details, including GPU usage.\n",
"\n",
"For a complete list of supported models and model variants, see the [Ollama model library](https://github.com/jmorganca/ollama#model-library).\n",
"For a complete list of supported models and model variants, see the [Ollama model library](https://ollama.com/search).\n",
"\n",
":::warning\n",
"This page is for the new v1 `ChatOllama` class with standard content block output. If you are looking for the legacy v0 `Ollama` class, see the [v0.3 documentation](https://python.langchain.com/v0.3/docs/integrations/chat/ollama/).\n",
":::\n",
"\n",
"## Overview\n",
"### Integration details\n",
"\n",
"| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/v0.2/docs/integrations/chat/ollama) | Package downloads | Package latest |\n",
"| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/docs/integrations/chat/ollama/) | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
"| [ChatOllama](https://python.langchain.com/v0.2/api_reference/ollama/chat_models/langchain_ollama.chat_models.ChatOllama.html) | [langchain-ollama](https://python.langchain.com/v0.2/api_reference/ollama/index.html) | ✅ | ❌ | ✅ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-ollama?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-ollama?style=flat-square&label=%20) |\n",
"| [ChatOllama](https://python.langchain.com/api_reference/ollama/chat_models/langchain_ollama.chat_models.ChatOllama.html#chatollama) | [langchain-ollama](https://python.langchain.com/api_reference/ollama/index.html) | ✅ | ❌ | ✅ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-ollama?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-ollama?style=flat-square&label=%20) |\n",
"\n",
"### Model features\n",
"| [Tool calling](/docs/how_to/tool_calling/) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",
"| :---: |:----------------------------------------------------:| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n",
"| ✅ | ✅ | ✅ | | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ |\n",
"| ✅ | ✅ | ✅ | | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ |\n",
"\n",
"## Setup\n",
"\n",
@@ -45,17 +49,17 @@
" * macOS users can install via Homebrew with `brew install ollama` and start with `brew services start ollama`\n",
"* Fetch available LLM model via `ollama pull <name-of-model>`\n",
" * View a list of available models via the [model library](https://ollama.ai/library)\n",
" * e.g., `ollama pull llama3`\n",
" * e.g., `ollama pull gpt-oss:20b`\n",
"* This will download the default tagged version of the model. Typically, the default points to the latest, smallest sized-parameter model.\n",
"\n",
"> On Mac, the models will be download to `~/.ollama/models`\n",
">\n",
"> On Linux (or WSL), the models will be stored at `/usr/share/ollama/.ollama/models`\n",
"\n",
"* Specify the exact version of the model of interest as such `ollama pull vicuna:13b-v1.5-16k-q4_0` (View the [various tags for the `Vicuna`](https://ollama.ai/library/vicuna/tags) model in this instance)\n",
"* Specify the exact version of the model of interest as such `ollama pull gpt-oss:20b`\n",
"* To view all pulled models, use `ollama list`\n",
"* To chat directly with a model from the command line, use `ollama run <name-of-model>`\n",
"* View the [Ollama documentation](https://github.com/ollama/ollama/tree/main/docs) for more commands. You can run `ollama help` in the terminal to see available commands.\n"
"* View the [Ollama documentation](https://github.com/ollama/ollama/blob/main/docs/README.md) for more commands. You can run `ollama help` in the terminal to see available commands.\n"
]
},
{
@@ -102,7 +106,11 @@
"id": "b18bd692076f7cf7",
"metadata": {},
"source": [
"Make sure you're using the latest Ollama version for structured outputs. Update by running:"
":::warning\n",
"Make sure you're using the latest Ollama client version!\n",
":::\n",
"\n",
"Update by running:"
]
},
{
@@ -127,15 +135,16 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 2,
"id": "cb09c344-1836-4e0c-acf8-11d13ac1dbae",
"metadata": {},
"outputs": [],
"source": [
"from langchain_ollama import ChatOllama\n",
"from langchain_ollama.v1 import ChatOllama\n",
"\n",
"llm = ChatOllama(\n",
" model=\"llama3.1\",\n",
" model=\"gpt-oss:20b\",\n",
" validate_model_on_init=True,\n",
" temperature=0,\n",
" # other params...\n",
")"
@@ -158,46 +167,56 @@
},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='The translation of \"I love programming\" in French is:\\n\\n\"J\\'adore le programmation.\"', additional_kwargs={}, response_metadata={'model': 'llama3.1', 'created_at': '2025-06-25T18:43:00.483666Z', 'done': True, 'done_reason': 'stop', 'total_duration': 619971208, 'load_duration': 27793125, 'prompt_eval_count': 35, 'prompt_eval_duration': 36354583, 'eval_count': 22, 'eval_duration': 555182667, 'model_name': 'llama3.1'}, id='run--348bb5ef-9dd9-4271-bc7e-a9ddb54c28c1-0', usage_metadata={'input_tokens': 35, 'output_tokens': 22, 'total_tokens': 57})"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
"name": "stdout",
"output_type": "stream",
"text": [
"AIMessage(type='ai', name=None, id='lc_run--5521db11-a5eb-4e46-956c-1455151cdaa3-0', lc_version='v1', content=[{'type': 'text', 'text': 'The translation of \"I love programming\" to French is:\\n\\n\"Je aime le programmation\"\\n\\nHowever, a more common and idiomatic way to express this in French would be:\\n\\n\"J\\'aime programmer\"\\n\\nThis phrase uses the verb \"aimer\" (to love) in the present tense, which is more suitable for expressing a general feeling or preference.'}], usage_metadata={'input_tokens': 34, 'output_tokens': 73, 'total_tokens': 107}, response_metadata={'model_name': 'llama3.2', 'created_at': '2025-08-08T23:07:44.439483Z', 'done': True, 'done_reason': 'stop', 'total_duration': 1410566833, 'load_duration': 28419542, 'prompt_eval_count': 34, 'prompt_eval_duration': 141642125, 'eval_count': 73, 'eval_duration': 1240075000}, parsed=None)\n",
"\n",
"Content:\n",
"The translation of \"I love programming\" to French is:\n",
"\n",
"\"Je aime le programmation\"\n",
"\n",
"However, a more common and idiomatic way to express this in French would be:\n",
"\n",
"\"J'aime programmer\"\n",
"\n",
"This phrase uses the verb \"aimer\" (to love) in the present tense, which is more suitable for expressing a general feeling or preference.\n"
]
}
],
"source": [
"messages = [\n",
" (\n",
" \"system\",\n",
" \"You are a helpful assistant that translates English to French. Translate the user sentence.\",\n",
" ),\n",
" (\"human\", \"I love programming.\"),\n",
"]\n",
"ai_msg = llm.invoke(messages)\n",
"ai_msg"
"ai_msg = llm.invoke(\"Translate 'I love programming' to French.\")\n",
"print(f\"{ai_msg}\\n\")\n",
"print(f\"Content:\\n{ai_msg.text}\")"
]
},
{
"cell_type": "markdown",
"id": "ede35e47",
"metadata": {},
"source": [
"## Streaming"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "d86145b3-bfef-46e8-b227-4dda5c9c2705",
"execution_count": 10,
"id": "77474829",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The translation of \"I love programming\" in French is:\n",
"\n",
"\"J'adore le programmation.\"\n"
"Hi| there|!| I|'m| just| a| chat|bot|,| so| I| don|'t| have| feelings|,| but| I|'m| here| and| ready| to| help| you| with| anything| you| need|!| How| can| I| assist| you| today|?| 😊|"
]
}
],
"source": [
"print(ai_msg.content)"
"for chunk in llm.stream(\"How are you doing?\"):\n",
" if chunk.text:\n",
" print(chunk.text, end=\"|\", flush=True)"
]
},
{
@@ -219,10 +238,10 @@
{
"data": {
"text/plain": [
"AIMessage(content='\"Programmieren ist meine Leidenschaft.\"\\n\\n(I translated \"programming\" to the German word \"Programmieren\", and added \"ist meine Leidenschaft\" which means \"is my passion\")', additional_kwargs={}, response_metadata={'model': 'llama3.1', 'created_at': '2025-06-25T18:43:29.350032Z', 'done': True, 'done_reason': 'stop', 'total_duration': 1194744459, 'load_duration': 26982500, 'prompt_eval_count': 30, 'prompt_eval_duration': 117043458, 'eval_count': 41, 'eval_duration': 1049892167, 'model_name': 'llama3.1'}, id='run--efc6436e-2346-43d9-8118-3c20b3cdf0d0-0', usage_metadata={'input_tokens': 30, 'output_tokens': 41, 'total_tokens': 71})"
"'Ich liebe Programmierung.'"
]
},
"execution_count": 7,
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
@@ -241,13 +260,15 @@
")\n",
"\n",
"chain = prompt | llm\n",
"chain.invoke(\n",
"result = chain.invoke(\n",
" {\n",
" \"input_language\": \"English\",\n",
" \"output_language\": \"German\",\n",
" \"input\": \"I love programming.\",\n",
" }\n",
")"
")\n",
"\n",
"result.text"
]
},
{
@@ -257,10 +278,10 @@
"source": [
"## Tool calling\n",
"\n",
"We can use [tool calling](/docs/concepts/tool_calling/) with an LLM [that has been fine-tuned for tool use](https://ollama.com/search?&c=tools) such as `llama3.1`:\n",
"We can use [tool calling](/docs/concepts/tool_calling/) with an LLM [that has been fine-tuned for tool use](https://ollama.com/search?&c=tools) such as `gpt-oss`:\n",
"\n",
"```\n",
"ollama pull llama3.1\n",
"ollama pull gpt-oss:20b\n",
"```\n",
"\n",
"Details on creating custom tools are available in [this guide](/docs/how_to/custom_tools/). Below, we demonstrate how to create a tool using the `@tool` decorator on a normal python function."
@@ -276,16 +297,16 @@
"name": "stdout",
"output_type": "stream",
"text": [
"[{'name': 'validate_user', 'args': {'addresses': ['123 Fake St, Boston, MA', '234 Pretend Boulevard, Houston, TX'], 'user_id': '123'}, 'id': 'aef33a32-a34b-4b37-b054-e0d85584772f', 'type': 'tool_call'}]\n"
"[{'type': 'tool_call', 'id': 'f365489e-1dc4-4d60-aaff-e56290ae4f99', 'name': 'validate_user', 'args': {'addresses': ['123 Fake St in Boston MA', '234 Pretend Boulevard in Houston TX'], 'user_id': 123}}]\n"
]
}
],
"source": [
"from typing import List\n",
"\n",
"from langchain_core.messages import AIMessage\n",
"from langchain_core.v1.messages import AIMessage\n",
"from langchain_core.tools import tool\n",
"from langchain_ollama import ChatOllama\n",
"from langchain_ollama.v1 import ChatOllama\n",
"\n",
"\n",
"@tool\n",
@@ -300,7 +321,8 @@
"\n",
"\n",
"llm = ChatOllama(\n",
" model=\"llama3.1\",\n",
" model=\"gpt-oss:20b\",\n",
" validate_model_on_init=True,\n",
" temperature=0,\n",
").bind_tools([validate_user])\n",
"\n",
@@ -314,6 +336,50 @@
" print(result.tool_calls)"
]
},
{
"cell_type": "markdown",
"id": "4321b6a8",
"metadata": {},
"source": [
"## Structured output"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "20f8ae70",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Name: Alice, Age: 28, Job: Software Engineer\n"
]
}
],
"source": [
"from langchain_ollama.v1 import ChatOllama\n",
"from pydantic import BaseModel, Field\n",
"\n",
"llm = ChatOllama(model=\"llama3.2\", validate_model_on_init=True, temperature=0)\n",
"\n",
"\n",
"class Person(BaseModel):\n",
" \"\"\"Information about a person.\"\"\"\n",
"\n",
" name: str = Field(description=\"The person's full name\")\n",
" age: int = Field(description=\"The person's age in years\")\n",
" occupation: str = Field(description=\"The person's job or profession\")\n",
"\n",
"\n",
"structured_llm = llm.with_structured_output(Person)\n",
"response: Person = structured_llm.invoke(\n",
" \"Tell me about a fictional software engineer named Alice who is 28 years old.\"\n",
")\n",
"print(f\"Name: {response.name}, Age: {response.age}, Job: {response.occupation}\")"
]
},
{
"cell_type": "markdown",
"id": "4c5e0197",
@@ -321,11 +387,9 @@
"source": [
"## Multi-modal\n",
"\n",
"Ollama has support for multi-modal LLMs, such as [bakllava](https://ollama.com/library/bakllava) and [llava](https://ollama.com/library/llava).\n",
"Ollama has limited support for multi-modal LLMs, such as [gemma3](https://ollama.com/library/gemma3).\n",
"\n",
" ollama pull bakllava\n",
"\n",
"Be sure to update Ollama so that you have the most recent version to support multi-modal."
"### Image input"
]
},
{
@@ -408,15 +472,15 @@
"name": "stdout",
"output_type": "stream",
"text": [
"90%\n"
"Based on the image, the dollar-based gross retention rate is **90%**.\n"
]
}
],
"source": [
"from langchain_core.messages import HumanMessage\n",
"from langchain_ollama import ChatOllama\n",
"from langchain_core.v1.messages import HumanMessage\n",
"from langchain_ollama.v1 import ChatOllama\n",
"\n",
"llm = ChatOllama(model=\"bakllava\", temperature=0)\n",
"llm = ChatOllama(model=\"gemma3:4b\", validate_model_on_init=True, temperature=0)\n",
"\n",
"\n",
"def prompt_func(data):\n",
@@ -424,8 +488,9 @@
" image = data[\"image\"]\n",
"\n",
" image_part = {\n",
" \"type\": \"image_url\",\n",
" \"image_url\": f\"data:image/jpeg;base64,{image}\",\n",
" \"type\": \"image\",\n",
" \"base64\": f\"data:image/jpeg;base64,{image}\",\n",
" \"mime_type\": \"image/jpeg\",\n",
" }\n",
"\n",
" content_parts = []\n",
@@ -435,7 +500,7 @@
" content_parts.append(image_part)\n",
" content_parts.append(text_part)\n",
"\n",
" return [HumanMessage(content=content_parts)]\n",
" return [HumanMessage(content_parts)]\n",
"\n",
"\n",
"from langchain_core.output_parsers import StrOutputParser\n",
@@ -454,11 +519,9 @@
"id": "fb6a331f-1507-411f-89e5-c4d598154f3c",
"metadata": {},
"source": [
"## Reasoning models and custom message roles\n",
"## Reasoning models\n",
"\n",
"Some models, such as IBM's [Granite 3.2](https://ollama.com/library/granite3.2), support custom message roles to enable thinking processes.\n",
"\n",
"To access Granite 3.2's thinking features, pass a message with a `\"control\"` role with content set to `\"thinking\"`. Because `\"control\"` is a non-standard message role, we can use a [ChatMessage](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.chat.ChatMessage.html) object to implement it:"
"Many models support outputting their reasoning process in addition to the final answer. This is useful for debugging and understanding how the model arrived at its conclusion. This train of thought reasoning is available in models such as `gpt-oss`, `qwen3:8b`, and `deepseek-r1`. To enable reasoning output, set the `reasoning` parameter to `True` either when instantiating the model or during invocation."
]
},
{
@@ -471,30 +534,25 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Here is my thought process:\n",
"The user is asking for the value of 3 raised to the power of 3, which is a basic exponentiation operation.\n",
"\n",
"Here is my response:\n",
"\n",
"3^3 (read as \"3 to the power of 3\") equals 27. \n",
"\n",
"This calculation is performed by multiplying 3 by itself three times: 3*3*3 = 27.\n"
"Response including reasoning: [{'type': 'reasoning', 'reasoning': \"Okay, so I need to figure out what 3^3 is. Let me start by recalling what exponents mean. From what I remember, when you have a number raised to a power, like a^b, it means you multiply the number by itself b times. So, for example, 2^3 would be 2 multiplied by itself three times: 2 × 2 × 2. Let me check if that's right. Yeah, I think that's correct. So applying that to 3^3, it should be 3 multiplied by itself three times.\\n\\nWait, let me make sure I'm not confusing the base and the exponent. The base is the number being multiplied, and the exponent is how many times it's multiplied. So in 3^3, the base is 3 and the exponent is 3. That means I need to multiply 3 by itself three times. Let me write that out step by step.\\n\\nFirst, multiply the first two 3s: 3 × 3. What's 3 times 3? That's 9. Okay, so the first multiplication gives me 9. Now, I need to multiply that result by the third 3. So 9 × 3. Let me calculate that. 9 times 3 is... 27. So putting it all together, 3 × 3 × 3 equals 27. \\n\\nWait, let me verify that again. Maybe I should do it in a different way to make sure I didn't make a mistake. Let's break it down. 3^3 is the same as 3 × 3 × 3. Let me compute 3 × 3 first, which is 9, and then multiply that by 3. 9 × 3 is indeed 27. Hmm, that seems right. \\n\\nAlternatively, I can think of exponents as repeated multiplication. So 3^1 is 3, 3^2 is 3 × 3 = 9, and 3^3 is 3 × 3 × 3 = 27. Yeah, that progression makes sense. Each time the exponent increases by 1, you multiply by the base again. So starting from 3^1 = 3, then 3^2 is 3 × 3 = 9, then 3^3 is 9 × 3 = 27. \\n\\nIs there another way to check this? Maybe using exponent rules. For example, if I know that 3^2 is 9, then multiplying by another 3 would give me 3^3. Since 9 × 3 is 27, that confirms it again. \\n\\nAlternatively, maybe I can use logarithms or something else, but that might be overcomplicating. Since exponents are straightforward multiplication, I think my initial calculation is correct. \\n\\nWait, just to be thorough, maybe I can use a calculator to verify. Let me imagine pressing 3, then the exponent key, then 3. If I do that, it should give me 27. Yeah, that's what I remember. So all methods point to 27. \\n\\nI think I've checked it multiple ways: breaking down the multiplication step by step, using the exponent progression, and even considering a calculator verification. All of them lead to the same answer. Therefore, I'm confident that 3^3 equals 27.\\n\"}, {'type': 'text', 'text': 'To determine the value of $3^3$, we start by understanding what an exponent represents. The expression $a^b$ means multiplying the base $a$ by itself $b$ times. \\n\\n### Step-by-Step Calculation:\\n1. **Identify the base and exponent**: \\n In $3^3$, the base is **3**, and the exponent is **3**. This means we multiply 3 by itself three times.\\n\\n2. **Perform the multiplication**: \\n - First, multiply the first two 3s: \\n $3 \\\\times 3 = 9$ \\n - Next, multiply the result by the third 3: \\n $9 \\\\times 3 = 27$\\n\\n3. **Verify the result**: \\n - $3^1 = 3$ \\n - $3^2 = 3 \\\\times 3 = 9$ \\n - $3^3 = 3 \\\\times 3 \\\\times 3 = 27$ \\n This progression confirms the calculation.\\n\\n### Final Answer:\\n$$\\n3^3 = \\\\boxed{27}\\n$$'}]\n",
"Response without reasoning: [{'type': 'text', 'text': \"Sure! Let's break down what **3³** means and how to calculate it step by step.\\n\\n---\\n\\n### Step 1: Understand the notation\\nThe expression **3³** means **3 multiplied by itself three times**. The small number (3) is called the **exponent**, and it tells us how many times the base number (3) is used as a factor.\\n\\nSo:\\n$$\\n3^3 = 3 \\\\times 3 \\\\times 3\\n$$\\n\\n---\\n\\n### Step 2: Perform the multiplication step by step\\n\\n1. Multiply the first two 3s:\\n $$\\n 3 \\\\times 3 = 9\\n $$\\n\\n2. Now multiply the result by the third 3:\\n $$\\n 9 \\\\times 3 = 27\\n $$\\n\\n---\\n\\n### Step 3: Final Answer\\n\\n$$\\n3^3 = 27\\n$$\\n\\n---\\n\\n### Summary\\n- **3³** means **3 × 3 × 3**\\n- **3 × 3 = 9**\\n- **9 × 3 = 27**\\n- So, **3³ = 27**\\n\\nLet me know if you'd like to explore exponents further!\"}]\n"
]
}
],
"source": [
"from langchain_core.messages import ChatMessage, HumanMessage\n",
"from langchain_ollama import ChatOllama\n",
"from langchain_ollama.v1 import ChatOllama\n",
"\n",
"llm = ChatOllama(model=\"granite3.2:8b\")\n",
"# All outputs from `llm` will include reasoning unless overridden during invocation\n",
"llm = ChatOllama(model=\"qwen3:8b\", validate_model_on_init=True, reasoning=True)\n",
"\n",
"messages = [\n",
" ChatMessage(role=\"control\", content=\"thinking\"),\n",
" HumanMessage(\"What is 3^3?\"),\n",
"]\n",
"response_a = llm.invoke(\"What is 3^3? Explain your reasoning step by step.\")\n",
"print(f\"Response including reasoning: {response_a.content}\")\n",
"\n",
"response = llm.invoke(messages)\n",
"print(response.content)"
"# Test override; note no ReasoningContentBlock in the response\n",
"response_b = llm.invoke(\n",
" \"What is 3^3? Explain your reasoning step by step.\", reasoning=False\n",
")\n",
"print(f\"Response without reasoning: {response_b.content}\")"
]
},
{
@@ -502,7 +560,7 @@
"id": "6271d032-da40-44d4-9b52-58370e164be3",
"metadata": {},
"source": [
"Note that the model exposes its thought process in addition to its final response."
"Note that the model exposes its thought process as a `ReasoningContentBlock` addition to its final response."
]
},
{

View File

@@ -1,14 +1,16 @@
# Ollama
>[Ollama](https://ollama.com/) allows you to run open-source large language models,
> such as [Llama3.1](https://ai.meta.com/blog/meta-llama-3-1/), locally.
> such as [gpt-oss](https://ollama.com/library/gpt-oss), locally.
>
>`Ollama` bundles model weights, configuration, and data into a single package, defined by a Modelfile.
>It optimizes setup and configuration details, including GPU usage.
>For a complete list of supported models and model variants, see the [Ollama model library](https://ollama.ai/library).
>The `ollama` [package](https://pypi.org/project/ollama/0.5.3/) bundles model weights,
> configuration, and data into a single package, defined by a Modelfile. It optimizes
> setup and configuration details, including GPU usage.
>For a complete list of supported models and model variants, see the
> [Ollama model library](https://ollama.com/search).
See [this guide](/docs/how_to/local_llms) for more details
on how to use `Ollama` with LangChain.
See [this guide](/docs/how_to/local_llms/#ollama) for more details
on how to use `ollama` with LangChain.
## Installation and Setup
### Ollama installation
@@ -23,10 +25,10 @@ Ollama will start as a background service automatically, if this is disabled, ru
ollama serve
```
After starting ollama, run `ollama pull <name-of-model>` to download a model from the [Ollama model library](https://ollama.ai/library):
After starting ollama, run `ollama pull <name-of-model>` to download a model from the [Ollama model library](https://ollama.com/library):
```bash
ollama pull llama3.1
ollama pull gpt-oss:20b
```
- This will download the default tagged version of the model. Typically, the default points to the latest, smallest-sized parameter model.
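Once a model is pulled and the Ollama service is running, it can be called from LangChain. A minimal sketch using the `langchain-ollama` package, assuming the default local server on port 11434:

```python
from langchain_ollama import ChatOllama

# Assumes `ollama serve` is running and `gpt-oss:20b` has been pulled locally.
llm = ChatOllama(model="gpt-oss:20b", temperature=0)

response = llm.invoke("The first man on the moon was ...")
print(response.content)
```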

View File

@@ -591,7 +591,7 @@
},
{
"cell_type": "code",
"execution_count": 36,
"execution_count": null,
"metadata": {
"azdata_cell_guid": "d9127900-0942-48f1-bd4d-081c7fa3fcae",
"language": "python"
@@ -606,7 +606,7 @@
}
],
"source": [
"from langchain.document_loaders import AzureBlobStorageFileLoader\n",
"from langchain_community.document_loaders import AzureBlobStorageFileLoader\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_core.documents import Document\n",
"\n",

View File

@@ -0,0 +1,107 @@
---
sidebar_position: 3
---
# How to update your code
*Last updated: 08.08.25*
If you maintain custom callbacks or output parsers, type checkers may raise errors if
they do not accept the new message types as inputs. This guide describes how to
address those issues.
If you do not maintain custom callbacks or output parsers, there are no breaking
changes. See our guide on the [new message types](/docs/versions/v0_4/messages) to learn
about new features introduced in v0.4.
## Custom callbacks
[BaseCallbackHandler](https://python.langchain.com/api_reference/core/callbacks/langchain_core.callbacks.base.BaseCallbackHandler.html)
now includes an attribute `accepts_new_messages` that defaults to False. When this
attribute is False, the callback system in langchain-core will automatically convert
new message types to old, so there should be no runtime errors. You can update callback
signatures as below to fix type-checking errors:
```python
from typing import Any, Optional, Union
from uuid import UUID

from langchain_core.messages import BaseMessage
from langchain_core.outputs import ChatGenerationChunk, GenerationChunk, LLMResult
from langchain_core.v1.messages import AIMessage, AIMessageChunk, MessageV1


def on_chat_model_start(
    self,
    serialized: dict[str, Any],
    # highlight-next-line
    messages: Union[list[list[BaseMessage]], list[MessageV1]],
    *,
    run_id: UUID,
    parent_run_id: Optional[UUID] = None,
    tags: Optional[list[str]] = None,
    metadata: Optional[dict[str, Any]] = None,
    **kwargs: Any,
) -> Any:
    ...


def on_llm_new_token(
    self,
    token: str,
    *,
    chunk: Optional[
        # highlight-next-line
        Union[GenerationChunk, ChatGenerationChunk, AIMessageChunk]
    ] = None,
    run_id: UUID,
    parent_run_id: Optional[UUID] = None,
    **kwargs: Any,
) -> Any:
    ...


def on_llm_end(
    self,
    # highlight-next-line
    response: Union[LLMResult, AIMessage],
    *,
    run_id: UUID,
    parent_run_id: Optional[UUID] = None,
    **kwargs: Any,
) -> Any:
    ...
```
You can also safely type-ignore mypy `override` errors here unless you switch
`accepts_new_messages` to True.
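If you want your handler to receive the new message objects directly, it can opt in. A minimal sketch, assuming `accepts_new_messages` can simply be overridden on the subclass:

```python
from typing import Any, Optional, Union
from uuid import UUID

from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.outputs import LLMResult
from langchain_core.v1.messages import AIMessage


class PrintingHandler(BaseCallbackHandler):
    # Opt in: receive v1 messages instead of having them converted to the old types.
    accepts_new_messages = True

    def on_llm_end(
        self,
        response: Union[LLMResult, AIMessage],
        *,
        run_id: UUID,
        parent_run_id: Optional[UUID] = None,
        **kwargs: Any,
    ) -> Any:
        if isinstance(response, AIMessage):
            print(response.text)
```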
## Custom output parsers
All output parsers in `langchain-core` have been updated to accept the new message
types.
If you maintain a custom output parser, `langchain-core` exposes a
`convert_from_v1_message` function so that your parser can easily operate on the new
message types:
```python
from langchain_core.messages.utils import convert_from_v1_message
from langchain_core.v1.messages import AIMessage
def parse_result(
    self,
    # highlight-next-line
    result: Union[list[Generation], AIMessage],
    *,
    partial: bool = False,
) -> Union[list[AgentAction], AgentFinish]:
    # highlight-start
    if isinstance(result, AIMessage):
        result = [ChatGeneration(message=convert_from_v1_message(result))]
    # highlight-end
    ...


def _transform(
    # highlight-next-line
    self, input: Iterator[Union[str, BaseMessage, AIMessage]]
) -> Iterator[AddableDict]:
    for chunk in input:
        # highlight-start
        if isinstance(chunk, AIMessage):
            chunk = convert_from_v1_message(chunk)
        # highlight-end
        ...
```
This will allow your parser to work as before. You can also update the parser to
natively handle the new message types to save this conversion step. See our guide on
the [new message types](/docs/versions/v0_4/messages) for details.
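For example, a parser that produces plain text could handle the v1 message directly and skip the conversion step. A minimal sketch, assuming the surrounding parser returns a string:

```python
from typing import Union

from langchain_core.outputs import Generation
from langchain_core.v1.messages import AIMessage


def parse_result(
    self,
    result: Union[list[Generation], AIMessage],
    *,
    partial: bool = False,
) -> str:
    # Handle the new message type natively instead of converting back to v0.
    if isinstance(result, AIMessage):
        return result.text
    return result[0].text
```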

View File

@@ -0,0 +1,81 @@
---
sidebar_position: 1
---
# LangChain v0.4
*Last updated: 08.08.25*
## What's changed
LangChain v0.4 allows developers to opt in to new message types that will become the default
in LangChain v1.0. LangChain v1.0 will be released this fall. These messages provide
fully typed, provider-agnostic content, introducing standard content blocks for
reasoning, citations, server-side tool calls, and other LLM features. They also offer
performance benefits over existing message classes.
New message types have been added to a `v1` namespace in `langchain-core`. Select
integration packages now also expose a `v1` namespace containing chat models that
work with the new message types.
Input types for callbacks and output parsers have been widened to accept the new message
types. If you maintain custom callbacks or output parsers, type checkers may raise
errors if they do not accept the new message types as inputs. Refer to
[this guide](/docs/versions/v0_4/how_to_update) for how to address those issues. This
is the only breaking change.
## What's new
You can access the new chat models through [init_chat_model](/docs/how_to/chat_models_universal_init/) by setting `message_version="v1"`:
```python
from langchain.chat_models import init_chat_model
llm = init_chat_model("openai:gpt-5", message_version="v1")
input_message = {"role": "user", "content": "Hello, world!"}
llm.invoke([input_message])
```
You can also access the `v1` namespaces directly:
```python
from langchain_core.v1.messages import HumanMessage
from langchain_openai.v1 import ChatOpenAI

llm = ChatOpenAI(model="gpt-5")

input_message = HumanMessage("Hello, world!")
llm.invoke([input_message])
```
:::info New message details
See our guide on the [new message types](/docs/versions/v0_4/messages) for details.
:::
## How to update your code
If you maintain custom callbacks or output parsers, type checkers may raise errors if
they do not accept the new message types as inputs. Refer to
[this guide](/docs/versions/v0_4/how_to_update) for how to address those issues.
If you do not maintain custom callbacks or output parsers, there are no breaking
changes. See our guide on the [new message types](/docs/versions/v0_4/messages) to learn
about new features introduced in v0.4.
### Base packages
| Package | Latest | Recommended constraint |
|--------------------------|--------|------------------------|
| langchain | 0.4.0 | >=0.4,&lt;1.0 |
| langchain-community | 0.4.0 | >=0.4,&lt;1.0 |
| langchain-text-splitters | 0.4.0 | >=0.4,&lt;1.0 |
| langchain-core | 0.4.0 | >=0.4,&lt;1.0 |
| langchain-experimental | 0.4.0 | >=0.4,&lt;1.0 |
### Integration packages
...

View File

@@ -0,0 +1,429 @@
---
sidebar_position: 2
---
# LangChain v1.0 message types
*Last updated: 08.08.25*
LangChain v0.4 allows developers to opt in to new message types that will become the default
in LangChain v1.0. LangChain v1.0 will be released this fall.
These messages should be considered a beta feature and are subject to change in
LangChain v1.0, although we do not anticipate any significant changes.
## Benefits
The new message types offer improvements in performance, type-safety, and consistency
across OpenAI, Anthropic, Gemini, and other providers.
### Performance
Importantly, the new messages are Python dataclasses, saving some runtime from
instantiating (layers of) Pydantic BaseModels.
LangChain v0.4 introduces a new `BaseChatModel` class in `langchain_core.v1.chat_models`
that is faster and leaner than the existing `BaseChatModel` class, offering significant
reductions in overhead on top of provider SDKs.
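As a rough illustration of the dataclass vs. Pydantic difference, a hypothetical micro-benchmark sketch (exact timings will vary by machine and version):

```python
from timeit import timeit

from langchain_core.messages import HumanMessage as HumanMessageV0
from langchain_core.v1.messages import HumanMessage as HumanMessageV1

# Compare raw construction cost of the old Pydantic message vs. the new dataclass message.
v0_time = timeit(lambda: HumanMessageV0("Hello, world!"), number=100_000)
v1_time = timeit(lambda: HumanMessageV1("Hello, world!"), number=100_000)
print(f"v0 (Pydantic): {v0_time:.3f}s, v1 (dataclass): {v1_time:.3f}s")
```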
### Type-safety
Message content is typed as
```python
import langchain_core.messages.content_blocks as types
content: list[types.ContentBlock]
```
where we have introduced standard types for text, reasoning, citations, and server-side
tool executions (e.g., web search and code interpreters). These include
[tool calls](https://python.langchain.com/docs/concepts/tool_calling/) and the
[multi-modal types](/docs/how_to/multimodal_inputs/) introduced in earlier versions
of LangChain. There are no breaking changes associated with the existing content types.
**This is the most significant change from the existing message classes**, which permit
strings, lists of strings, or lists of untyped dicts as content. We have added a
`.text` getter so that developers can easily recover string content. Consequently, we
have deprecated `.text()` (as a method) in favor of the new property.
`.tool_calls` is likewise now a getter with an associated setter rather than a plain attribute,
so usage is largely the same. See the [usage comparison](#usage-comparison), below,
for details.
### Consistency
Many chat models can generate a variety of content in a single conversational turn,
including reasoning, tool calls and responses, images, text with citations, and other
structured objects. We have standardized these types, resulting in improved
inter-operability of messages across models.
## Usage comparison
| Task | Previous | New |
|-------------------------|----------------------------------------|------------------------------------------------------------------|
| Get text content (str) | `message.content` or `message.text()` | `message.text` |
| Get content blocks | `message.content` | `message.content` |
| Get `additional_kwargs` | `message.additional_kwargs` | `[block for block in message.content if block["type"] == "..."]` |
Getting `response_metadata` and `tool_calls` has not changed.
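A short sketch of the access patterns from the table above (the model name is illustrative):

```python
from langchain.chat_models import init_chat_model

llm = init_chat_model("openai:gpt-5-mini", message_version="v1")
message = llm.invoke("Hello, world!")

# Text content is now a property (previously message.content or message.text())
print(message.text)

# Content is a list of typed blocks
text_blocks = [block for block in message.content if block["type"] == "text"]

# response_metadata and tool_calls work as before
print(message.response_metadata)
print(message.tool_calls)
```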
### Changes in content blocks
For providers that generate `list[dict]` content, the dict elements have changed to
conform to the new content block types. Refer to the
[API reference](https://python.langchain.com/api_reference/core/messages.html) for
details. Below we show some examples.
Importantly:
- Where provider-specific fields map to fields on standard types, LangChain manages
the translation.
- Where provider-specific fields do not map to fields on standard types, LangChain
stores them in an `"extras"` key (see below for examples).
<details>
<summary>Citations and web search</summary>
<div className="row">
<div className="col col--6" style={{minWidth: 0}}>
**Old content**
```python
from langchain.chat_models import init_chat_model
llm = init_chat_model("openai:gpt-5-mini", output_version="responses/v1")
llm_with_tools = llm.bind_tools([{"type": "web_search_preview"}])
response = llm_with_tools.invoke("What was a positive news story from today?")
response.content
```
```
[
{
"type": "reasoning",
"id": "rs_abc123",
"summary": []
},
{
"type": "web_search_call",
"id": "ws_abc123",
"action": {
"query": "positive news today August 8 2025 'good news' 'Aug 8 2025' 'today' ",
"type": "search"
},
"status": "completed"
},
{
"type": "text",
"text": "Here are two positive news items from today...",
"annotations": [
{
"type": "url_citation",
"end_index": 455,
"start_index": 196,
"title": "Document title",
"url": "<document url>"
},
{
"type": "url_citation",
"end_index": 1022,
"start_index": 707,
"title": "Another Document",
"url": "<another document url>"
},
],
"id": "msg_abc123"
}
]
```
</div>
<div className="col col--6" style={{minWidth: 0}}>
**New content**
```python
from langchain.chat_models import init_chat_model
llm = init_chat_model("openai:gpt-5-mini", message_version="v1")
llm_with_tools = llm.bind_tools([{"type": "web_search_preview"}])
response = llm_with_tools.invoke("What was a positive news story from today?")
response.content
```
```
[
{
"type": "reasoning",
"id": "rs_abc123"
},
{
"type": "web_search_call",
"id": "ws_abc123",
"query": "positive news August 8 2025 'good news' 'today' ",
"extras": {
"action": {"type": "search"},
"status": "completed",
}
},
{
"type": "web_search_result",
"id": "ws_abc123"
},
{
"type": "text",
"text": "Here are two positive news items from today...",
"annotations": [
{
"type": "citation",
"end_index": 455,
"start_index": 196,
"title": "Document title",
"url": "<document url>"
},
{
"type": "citation",
"end_index": 1022,
"start_index": 707,
"title": "Another Document",
"url": "<another document url>"
}
],
"id": "msg_abc123"
}
]
```
</div>
</div>
</details>
<details>
<summary>Reasoning</summary>
<div className="row">
<div className="col col--6" style={{minWidth: 0}}>
**Old content**
```python
from langchain.chat_models import init_chat_model
llm = init_chat_model(
"openai:gpt-5",
reasoning={"effort": "medium", "summary": "auto"},
output_version="responses/v1",
)
response = llm.invoke(
"What was the third tallest building in the world in the year 2000?"
)
response.content
```
```
[
{
"type": "reasoning",
"id": "rs_abc123",
"summary": [
{
"text": "The user is asking about...",
"type": "summary_text"
},
{
"text": "We should consider...",
"type": "summary_text"
}
]
},
{
"type": "text",
"text": "In the year 2000 the third-tallest building in the world was...",
"id": "msg_abc123"
}
]
```
</div>
<div className="col col--6" style={{minWidth: 0}}>
**New content**
```python
from langchain.chat_models import init_chat_model
llm = init_chat_model(
"openai:gpt-5",
reasoning={"effort": "medium", "summary": "auto"},
message_version="v1",
)
response = llm.invoke(
"What was the third tallest building in the world in the year 2000?"
)
response.content
```
```
[
{
"type": "reasoning",
"reasoning": "The user is asking about...",
"id": "rs_abc123"
},
{
"type": "reasoning",
"reasoning": "We should consider...",
"id": "rs_abc123"
},
{
"type": "text",
"text": "In the year 2000 the third-tallest building in the world was...",
"id": "msg_abc123"
}
]
```
</div>
</div>
</details>
<details>
<summary>Non-standard blocks</summary>
Where content blocks from specific providers do not map to a standard type, they are
structured into a `"non_standard"` block:
```python
{
"type": "non_standard",
"value": original_block,
}
```
<div className="row">
<div className="col col--6" style={{minWidth: 0}}>
**Old content**
```python
from langchain.chat_models import init_chat_model
llm = init_chat_model("openai:gpt-5-mini", output_version="responses/v1")
llm_with_tools = llm.bind_tools(
[
{
"type": "file_search",
"vector_store_ids": ["vs_67d0baa0544c8191be194a85e19cbf92"],
}
]
)
response = llm_with_tools.invoke("What is deep research by OpenAI?")
response.content
```
```
[
{
"type": "reasoning",
"id": "rs_abc123",
"summary": []
},
{
"type": "file_search_call",
"id": "fs_abc123",
"queries": [
"What is deep research by OpenAI?",
"deep research OpenAI definition"
],
"status": "completed"
},
{
"type": "reasoning",
"id": "rs_def456",
"summary": []
},
{
"type": "text",
"text": "Deep research is...",
"annotations": [
{
"type": "file_citation",
"file_id": "file-abc123",
"filename": "sample_file.pdf",
"index": 305
},
{
"type": "file_citation",
"file_id": "file-abc123",
"filename": "sample_file.pdf",
"index": 675
},
],
"id": "msg_abc123"
}
]
```
</div>
<div className="col col--6" style={{minWidth: 0}}>
**New content**
```python
from langchain.chat_models import init_chat_model
llm = init_chat_model("openai:gpt-5-mini", message_version="v1")
llm_with_tools = llm.bind_tools(
[
{
"type": "file_search",
"vector_store_ids": ["vs_67d0baa0544c8191be194a85e19cbf92"],
}
]
)
response = llm_with_tools.invoke("What is deep research by OpenAI?")
response.content
```
```
[
{
"type": "reasoning",
"id": "rs_abc123",
"summary": []
},
{
"type": "non_standard",
"value": {
"type": "file_search_call",
"id": "fs_abc123",
"queries": [
"What is deep research by OpenAI?",
"deep research OpenAI definition"
],
"status": "completed"
}
},
{
"type": "reasoning",
"id": "rs_def456",
"summary": []
},
{
"type": "text",
"text": "Deep research is...",
"annotations": [
{
"type": "citation",
"title": "sample_file.pdf",
"extras": {
"file_id": "file-abc123",
"index": 305
}
},
{
"type": "citation",
"title": "sample_file.pdf",
"extras": {
"file_id": "file-abc123",
"index": 675
}
},
],
"id": "msg_abc123"
}
]
```
</div>
</div>
</details>
## Feature gaps
The new message types do not yet support LangChain's caching layer. Support will be
added in the coming weeks.

View File

@@ -224,13 +224,17 @@ const config = {
},
{
type: "dropdown",
label: "v0.3",
label: "v0.4",
position: "right",
items: [
{
label: "v0.3",
label: "v0.4",
href: "/docs/introduction",
},
{
label: "v0.3",
href: "https://python.langchain.com/v0.3/docs/introduction/",
},
{
label: "v0.2",
href: "https://python.langchain.com/v0.2/docs/introduction",

View File

@@ -82,6 +82,14 @@ module.exports = {
collapsed: false,
collapsible: false,
items: [
{
type: "category",
label: "v0.4",
items: [{
type: 'autogenerated',
dirName: 'versions/v0_4',
}],
},
{
type: 'doc',
id: 'versions/v0_3/index',
@@ -418,7 +426,7 @@ module.exports = {
},
],
},
],
link: {
type: "generated-index",

View File

@@ -23,11 +23,19 @@
{
"source": "/v0.2/:path(.*/?)*",
"destination": "https://langchain-v02.vercel.app/v0.2/:path*"
},
{
"source": "/v0.3",
"destination": "https://langchain-v03.vercel.app/v0.3"
},
{
"source": "/v0.3/:path(.*/?)*",
"destination": "https://langchain-v03.vercel.app/v0.3/:path*"
}
],
"redirects": [
{
"source": "/v0.3/docs/:path(.*/?)*",
"source": "/v0.4/docs/:path(.*/?)*",
"destination": "/docs/:path*"
},
{

View File

@@ -736,7 +736,7 @@ class SystemMessage:
custom_role: Optional[str] = None
"""If provided, a custom role for the system message.
Example: ``'developer'``.
Example: ``'developer'``, ``'control'``.
Integration packages may use this field to assign the system message role if it
contains a recognized value.

View File

@@ -191,6 +191,7 @@ class ChatOllama(BaseChatModel):
llm = ChatOllama(
model = "llama3",
validate_model_on_init = True,
temperature = 0.8,
num_predict = 256,
# other params ...
@@ -199,13 +200,7 @@ class ChatOllama(BaseChatModel):
Invoke:
.. code-block:: python
from langchain_core.v1.messages import HumanMessage
from langchain_core.messages.content_blocks import TextContentBlock
messages = [
HumanMessage("Hello!")
]
llm.invoke(messages)
llm.invoke("Hello!")
.. code-block:: python
@@ -214,13 +209,7 @@ class ChatOllama(BaseChatModel):
Stream:
.. code-block:: python
from langchain_core.v1.messages import HumanMessage
from langchain_core.messages.content_blocks import TextContentBlock
messages = [
HumanMessage(Return the words Hello World!")
]
for chunk in llm.stream(messages):
for chunk in llm.stream("Return the words Hello World!"):
print(chunk.content, end="")
.. code-block:: python
@@ -232,7 +221,7 @@ class ChatOllama(BaseChatModel):
Multi-modal input:
.. code-block:: python
from langchain_core.messages.content_blocks import ImageContentBlock
from langchain_core.messages.content_blocks import TextContentBlock, ImageContentBlock
response = llm.invoke([
HumanMessage(content=[
@@ -254,9 +243,7 @@ class ChatOllama(BaseChatModel):
b: int = Field(..., description="Second integer")
llm_with_tools = llm.bind_tools([Multiply])
ans = llm_with_tools.invoke([
HumanMessage("What is 45*67")
])
ans = llm_with_tools.invoke("What is 45*67")
ans.tool_calls
.. code-block:: python
@@ -278,7 +265,7 @@ class ChatOllama(BaseChatModel):
streaming: bool = False
"""Whether to use streaming for invocation.
If True, invoke will use streaming internally.
If True, ``invoke`` will use streaming internally.
"""