From dbf9986d4465c24c8f3fe45c86d6a06cf0785655 Mon Sep 17 00:00:00 2001
From: rylativity <41017744+rylativity@users.noreply.github.com>
Date: Fri, 18 Apr 2025 10:07:07 -0400
Subject: [PATCH] langchain-ollama (partners) / langchain-core: allow passing
 ChatMessages to Ollama (including arbitrary roles) (#30411)

Replacement for PR #30191 (@ccurme)

**Description**: Currently, ChatOllama [will raise a ValueError if a ChatMessage is passed to it](https://github.com/langchain-ai/langchain/blob/master/libs/partners/ollama/langchain_ollama/chat_models.py#L514), as described in https://github.com/langchain-ai/langchain/pull/30147#issuecomment-2708932481. Furthermore, ollama-python is removing the limitations on valid roles that can be passed through chat messages to a model in Ollama: https://github.com/ollama/ollama-python/pull/462#event-16917810634. This PR removes the role limitations imposed by langchain and enables passing langchain ChatMessages with arbitrary 'role' values through the langchain ChatOllama class to the underlying ollama-python Client.

As this PR relies on [merged but unreleased functionality in ollama-python](https://github.com/ollama/ollama-python/pull/462#event-16917810634), I have temporarily pointed the ollama package source to the main branch of the ollama-python GitHub repo.

Format, lint, and tests of the new functionality are passing. The issue with the recently added ChatOllama tests has since been resolved.

**Issue**: resolves #30122 (related to ollama issue https://github.com/ollama/ollama/issues/8955)

**Dependencies**: no new dependencies

- [x] PR title
- [x] PR message
- [x] Lint and test: format, lint, and test all running successfully and passing

---------

Co-authored-by: Ryan Stewart
Co-authored-by: Chester Curme
---
 docs/docs/integrations/chat/ollama.ipynb | 1003 +++++++++--------
 .../ollama/langchain_ollama/chat_models.py | 5 +-
 libs/partners/ollama/pyproject.toml | 5 +-
 .../tests/unit_tests/test_chat_models.py | 41 +
 libs/partners/ollama/uv.lock | 12 +-
 5 files changed, 588 insertions(+), 478 deletions(-)

diff --git a/docs/docs/integrations/chat/ollama.ipynb b/docs/docs/integrations/chat/ollama.ipynb
index 4c1e8f37de5..39c6a3607a2 100644
--- a/docs/docs/integrations/chat/ollama.ipynb
+++ b/docs/docs/integrations/chat/ollama.ipynb
@@ -1,473 +1,536 @@
 {
- "cells": [
-  {
-   "cell_type": "raw",
-   "id": "afaf8039",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Ollama\n",
-    "---"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "e49f1e0d",
-   "metadata": {},
-   "source": [
-    "# ChatOllama\n",
-    "\n",
-    "[Ollama](https://ollama.ai/) allows you to run open-source large language models, such as Llama 2, locally.\n",
-    "\n",
-    "Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.\n",
-    "\n",
-    "It optimizes setup and configuration details, including GPU usage.\n",
-    "\n",
-    "For a complete list of supported models and model variants, see the [Ollama model library](https://github.com/jmorganca/ollama#model-library).\n",
-    "\n",
-    "## Overview\n",
-    "### Integration details\n",
-    "\n",
-    "| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/v0.2/docs/integrations/chat/ollama) | Package downloads | Package latest |\n",
-    "| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
-    "| [ChatOllama](https://python.langchain.com/v0.2/api_reference/ollama/chat_models/langchain_ollama.chat_models.ChatOllama.html) | [langchain-ollama](https://python.langchain.com/v0.2/api_reference/ollama/index.html) | ✅ | ❌ 
| ✅ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-ollama?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-ollama?style=flat-square&label=%20) |\n", - "\n", - "### Model features\n", - "| [Tool calling](/docs/how_to/tool_calling/) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n", - "| :---: |:----------------------------------------------------:| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n", - "| ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ |\n", - "\n", - "## Setup\n", - "\n", - "First, follow [these instructions](https://github.com/jmorganca/ollama) to set up and run a local Ollama instance:\n", - "\n", - "* [Download](https://ollama.ai/download) and install Ollama onto the available supported platforms (including Windows Subsystem for Linux)\n", - "* Fetch available LLM model via `ollama pull `\n", - " * View a list of available models via the [model library](https://ollama.ai/library)\n", - " * e.g., `ollama pull llama3`\n", - "* This will download the default tagged version of the model. Typically, the default points to the latest, smallest sized-parameter model.\n", - "\n", - "> On Mac, the models will be download to `~/.ollama/models`\n", - ">\n", - "> On Linux (or WSL), the models will be stored at `/usr/share/ollama/.ollama/models`\n", - "\n", - "* Specify the exact version of the model of interest as such `ollama pull vicuna:13b-v1.5-16k-q4_0` (View the [various tags for the `Vicuna`](https://ollama.ai/library/vicuna/tags) model in this instance)\n", - "* To view all pulled models, use `ollama list`\n", - "* To chat directly with a model from the command line, use `ollama run `\n", - "* View the [Ollama documentation](https://github.com/jmorganca/ollama) for more commands. Run `ollama help` in the terminal to see available commands too.\n" - ] - }, - { - "cell_type": "markdown", - "id": "72ee0c4b-9764-423a-9dbf-95129e185210", - "metadata": {}, - "source": "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a15d341e-3e26-4ca3-830b-5aab30ed66de", - "metadata": {}, - "outputs": [], - "source": [ - "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")\n", - "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"" - ] - }, - { - "cell_type": "markdown", - "id": "0730d6a1-c893-4840-9817-5e5251676d5d", - "metadata": {}, - "source": [ - "### Installation\n", - "\n", - "The LangChain Ollama integration lives in the `langchain-ollama` package:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "652d6238-1f87-422a-b135-f5abbb8652fc", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -qU langchain-ollama" - ] - }, - { - "cell_type": "markdown", - "id": "b18bd692076f7cf7", - "metadata": {}, - "source": "Make sure you're using the latest Ollama version for structured outputs. 
Update by running:" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b7a05cba95644c2e", - "metadata": {}, - "outputs": [], - "source": "%pip install -U ollama" - }, - { - "cell_type": "markdown", - "id": "a38cde65-254d-4219-a441-068766c0d4b5", - "metadata": {}, - "source": [ - "## Instantiation\n", - "\n", - "Now we can instantiate our model object and generate chat completions:\n" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "id": "cb09c344-1836-4e0c-acf8-11d13ac1dbae", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain_ollama import ChatOllama\n", - "\n", - "llm = ChatOllama(\n", - " model=\"llama3.1\",\n", - " temperature=0,\n", - " # other params...\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "2b4f3e15", - "metadata": {}, - "source": [ - "## Invocation" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "id": "62e0dbc3", - "metadata": { - "tags": [] - }, - "outputs": [ - { - "data": { - "text/plain": [ - "AIMessage(content='The translation of \"I love programming\" from English to French is:\\n\\n\"J\\'adore programmer.\"', response_metadata={'model': 'llama3.1', 'created_at': '2024-08-19T16:05:32.81965Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 2167842917, 'load_duration': 54222584, 'prompt_eval_count': 35, 'prompt_eval_duration': 893007000, 'eval_count': 22, 'eval_duration': 1218962000}, id='run-0863daa2-43bf-4a43-86cc-611b23eae466-0', usage_metadata={'input_tokens': 35, 'output_tokens': 22, 'total_tokens': 57})" - ] - }, - "execution_count": 10, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from langchain_core.messages import AIMessage\n", - "\n", - "messages = [\n", - " (\n", - " \"system\",\n", - " \"You are a helpful assistant that translates English to French. Translate the user sentence.\",\n", - " ),\n", - " (\"human\", \"I love programming.\"),\n", - "]\n", - "ai_msg = llm.invoke(messages)\n", - "ai_msg" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "id": "d86145b3-bfef-46e8-b227-4dda5c9c2705", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "The translation of \"I love programming\" from English to French is:\n", - "\n", - "\"J'adore programmer.\"\n" - ] - } - ], - "source": [ - "print(ai_msg.content)" - ] - }, - { - "cell_type": "markdown", - "id": "18e2bfc0-7e78-4528-a73f-499ac150dca8", - "metadata": {}, - "source": [ - "## Chaining\n", - "\n", - "We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "id": "e197d1d7-a070-4c96-9f8a-a0e86d046e0b", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "AIMessage(content='Das Programmieren ist mir ein Leidenschaft! (That\\'s \"Programming is my passion!\" in German.) 
Would you like me to translate anything else?', response_metadata={'model': 'llama3.1', 'created_at': '2024-08-19T16:05:34.893548Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 2045997333, 'load_duration': 22584792, 'prompt_eval_count': 30, 'prompt_eval_duration': 213210000, 'eval_count': 32, 'eval_duration': 1808541000}, id='run-d18e1c6b-50e0-4b1d-b23a-973fa058edad-0', usage_metadata={'input_tokens': 30, 'output_tokens': 32, 'total_tokens': 62})" - ] - }, - "execution_count": 12, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from langchain_core.prompts import ChatPromptTemplate\n", - "\n", - "prompt = ChatPromptTemplate.from_messages(\n", - " [\n", - " (\n", - " \"system\",\n", - " \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n", - " ),\n", - " (\"human\", \"{input}\"),\n", - " ]\n", - ")\n", - "\n", - "chain = prompt | llm\n", - "chain.invoke(\n", - " {\n", - " \"input_language\": \"English\",\n", - " \"output_language\": \"German\",\n", - " \"input\": \"I love programming.\",\n", - " }\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "0f51345d-0a9d-43f1-8fca-d0662cb8e21b", - "metadata": {}, - "source": [ - "## Tool calling\n", - "\n", - "We can use [tool calling](https://blog.langchain.dev/improving-core-tool-interfaces-and-docs-in-langchain/) with an LLM [that has been fine-tuned for tool use](https://ollama.com/library/llama3.1):\n", - "\n", - "```\n", - "ollama pull llama3.1\n", - "```\n", - "\n", - "Details on creating custom tools are available in [this guide](/docs/how_to/custom_tools/). Below, we demonstrate how to create a tool using the `@tool` decorator on a normal python function." - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "id": "f767015f", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[{'name': 'validate_user',\n", - " 'args': {'addresses': '[\"123 Fake St, Boston, MA\", \"234 Pretend Boulevard, Houston, TX\"]',\n", - " 'user_id': '123'},\n", - " 'id': '40fe3de0-500c-4b91-9616-5932a929e640',\n", - " 'type': 'tool_call'}]" - ] - }, - "execution_count": 13, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "from typing import List\n", - "\n", - "from langchain_core.tools import tool\n", - "from langchain_ollama import ChatOllama\n", - "\n", - "\n", - "@tool\n", - "def validate_user(user_id: int, addresses: List[str]) -> bool:\n", - " \"\"\"Validate user using historical addresses.\n", - "\n", - " Args:\n", - " user_id (int): the user ID.\n", - " addresses (List[str]): Previous addresses as a list of strings.\n", - " \"\"\"\n", - " return True\n", - "\n", - "\n", - "llm = ChatOllama(\n", - " model=\"llama3.1\",\n", - " temperature=0,\n", - ").bind_tools([validate_user])\n", - "\n", - "result = llm.invoke(\n", - " \"Could you validate user 123? They previously lived at \"\n", - " \"123 Fake St in Boston MA and 234 Pretend Boulevard in \"\n", - " \"Houston TX.\"\n", - ")\n", - "result.tool_calls" - ] - }, - { - "cell_type": "markdown", - "id": "4c5e0197", - "metadata": {}, - "source": [ - "## Multi-modal\n", - "\n", - "Ollama has support for multi-modal LLMs, such as [bakllava](https://ollama.com/library/bakllava) and [llava](https://ollama.com/library/llava).\n", - "\n", - " ollama pull bakllava\n", - "\n", - "Be sure to update Ollama so that you have the most recent version to support multi-modal." 
- ] - }, - { - "cell_type": "code", - "execution_count": 15, - "id": "36c9b1c2", - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "import base64\n", - "from io import BytesIO\n", - "\n", - "from IPython.display import HTML, display\n", - "from PIL import Image\n", - "\n", - "\n", - "def convert_to_base64(pil_image):\n", - " \"\"\"\n", - " Convert PIL images to Base64 encoded strings\n", - "\n", - " :param pil_image: PIL image\n", - " :return: Re-sized Base64 string\n", - " \"\"\"\n", - "\n", - " buffered = BytesIO()\n", - " pil_image.save(buffered, format=\"JPEG\") # You can change the format if needed\n", - " img_str = base64.b64encode(buffered.getvalue()).decode(\"utf-8\")\n", - " return img_str\n", - "\n", - "\n", - "def plt_img_base64(img_base64):\n", - " \"\"\"\n", - " Disply base64 encoded string as image\n", - "\n", - " :param img_base64: Base64 string\n", - " \"\"\"\n", - " # Create an HTML img tag with the base64 string as the source\n", - " image_html = f''\n", - " # Display the image by rendering the HTML\n", - " display(HTML(image_html))\n", - "\n", - "\n", - "file_path = \"../../../static/img/ollama_example_img.jpg\"\n", - "pil_image = Image.open(file_path)\n", - "\n", - "image_b64 = convert_to_base64(pil_image)\n", - "plt_img_base64(image_b64)" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "id": "32b3ba7b", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "90%\n" - ] - } - ], - "source": [ - "from langchain_core.messages import HumanMessage\n", - "from langchain_ollama import ChatOllama\n", - "\n", - "llm = ChatOllama(model=\"bakllava\", temperature=0)\n", - "\n", - "\n", - "def prompt_func(data):\n", - " text = data[\"text\"]\n", - " image = data[\"image\"]\n", - "\n", - " image_part = {\n", - " \"type\": \"image_url\",\n", - " \"image_url\": f\"data:image/jpeg;base64,{image}\",\n", - " }\n", - "\n", - " content_parts = []\n", - "\n", - " text_part = {\"type\": \"text\", \"text\": text}\n", - "\n", - " content_parts.append(image_part)\n", - " content_parts.append(text_part)\n", - "\n", - " return [HumanMessage(content=content_parts)]\n", - "\n", - "\n", - "from langchain_core.output_parsers import StrOutputParser\n", - "\n", - "chain = prompt_func | llm | StrOutputParser()\n", - "\n", - "query_chain = chain.invoke(\n", - " {\"text\": \"What is the Dollar-based gross retention rate?\", \"image\": image_b64}\n", - ")\n", - "\n", - "print(query_chain)" - ] - }, - { - "cell_type": "markdown", - "id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3", - "metadata": {}, - "source": [ - "## API reference\n", - "\n", - "For detailed documentation of all ChatOllama features and configurations head to the API reference: https://python.langchain.com/api_reference/ollama/chat_models/langchain_ollama.chat_models.ChatOllama.html" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.12.4" - } + "cells": [ + { + "cell_type": "raw", + "id": "afaf8039", + "metadata": {}, + "source": [ + "---\n", + "sidebar_label: Ollama\n", + "---" + ] }, - "nbformat": 4, - "nbformat_minor": 5 + { + 
"cell_type": "markdown", + "id": "e49f1e0d", + "metadata": {}, + "source": [ + "# ChatOllama\n", + "\n", + "[Ollama](https://ollama.ai/) allows you to run open-source large language models, such as Llama 2, locally.\n", + "\n", + "Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.\n", + "\n", + "It optimizes setup and configuration details, including GPU usage.\n", + "\n", + "For a complete list of supported models and model variants, see the [Ollama model library](https://github.com/jmorganca/ollama#model-library).\n", + "\n", + "## Overview\n", + "### Integration details\n", + "\n", + "| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/v0.2/docs/integrations/chat/ollama) | Package downloads | Package latest |\n", + "| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n", + "| [ChatOllama](https://python.langchain.com/v0.2/api_reference/ollama/chat_models/langchain_ollama.chat_models.ChatOllama.html) | [langchain-ollama](https://python.langchain.com/v0.2/api_reference/ollama/index.html) | ✅ | ❌ | ✅ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-ollama?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-ollama?style=flat-square&label=%20) |\n", + "\n", + "### Model features\n", + "| [Tool calling](/docs/how_to/tool_calling/) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n", + "| :---: |:----------------------------------------------------:| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n", + "| ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ |\n", + "\n", + "## Setup\n", + "\n", + "First, follow [these instructions](https://github.com/jmorganca/ollama) to set up and run a local Ollama instance:\n", + "\n", + "* [Download](https://ollama.ai/download) and install Ollama onto the available supported platforms (including Windows Subsystem for Linux)\n", + "* Fetch available LLM model via `ollama pull `\n", + " * View a list of available models via the [model library](https://ollama.ai/library)\n", + " * e.g., `ollama pull llama3`\n", + "* This will download the default tagged version of the model. Typically, the default points to the latest, smallest sized-parameter model.\n", + "\n", + "> On Mac, the models will be download to `~/.ollama/models`\n", + ">\n", + "> On Linux (or WSL), the models will be stored at `/usr/share/ollama/.ollama/models`\n", + "\n", + "* Specify the exact version of the model of interest as such `ollama pull vicuna:13b-v1.5-16k-q4_0` (View the [various tags for the `Vicuna`](https://ollama.ai/library/vicuna/tags) model in this instance)\n", + "* To view all pulled models, use `ollama list`\n", + "* To chat directly with a model from the command line, use `ollama run `\n", + "* View the [Ollama documentation](https://github.com/jmorganca/ollama) for more commands. 
Run `ollama help` in the terminal to see available commands too.\n" + ] + }, + { + "cell_type": "markdown", + "id": "72ee0c4b-9764-423a-9dbf-95129e185210", + "metadata": {}, + "source": [ + "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a15d341e-3e26-4ca3-830b-5aab30ed66de", + "metadata": {}, + "outputs": [], + "source": [ + "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")\n", + "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"" + ] + }, + { + "cell_type": "markdown", + "id": "0730d6a1-c893-4840-9817-5e5251676d5d", + "metadata": {}, + "source": [ + "### Installation\n", + "\n", + "The LangChain Ollama integration lives in the `langchain-ollama` package:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "652d6238-1f87-422a-b135-f5abbb8652fc", + "metadata": {}, + "outputs": [], + "source": [ + "%pip install -qU langchain-ollama" + ] + }, + { + "cell_type": "markdown", + "id": "b18bd692076f7cf7", + "metadata": {}, + "source": [ + "Make sure you're using the latest Ollama version for structured outputs. Update by running:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b7a05cba95644c2e", + "metadata": {}, + "outputs": [], + "source": [ + "%pip install -U ollama" + ] + }, + { + "cell_type": "markdown", + "id": "a38cde65-254d-4219-a441-068766c0d4b5", + "metadata": {}, + "source": [ + "## Instantiation\n", + "\n", + "Now we can instantiate our model object and generate chat completions:\n" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "cb09c344-1836-4e0c-acf8-11d13ac1dbae", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_ollama import ChatOllama\n", + "\n", + "llm = ChatOllama(\n", + " model=\"llama3.1\",\n", + " temperature=0,\n", + " # other params...\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "2b4f3e15", + "metadata": {}, + "source": [ + "## Invocation" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "62e0dbc3", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "AIMessage(content='The translation of \"I love programming\" from English to French is:\\n\\n\"J\\'adore programmer.\"', response_metadata={'model': 'llama3.1', 'created_at': '2024-08-19T16:05:32.81965Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 2167842917, 'load_duration': 54222584, 'prompt_eval_count': 35, 'prompt_eval_duration': 893007000, 'eval_count': 22, 'eval_duration': 1218962000}, id='run-0863daa2-43bf-4a43-86cc-611b23eae466-0', usage_metadata={'input_tokens': 35, 'output_tokens': 22, 'total_tokens': 57})" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from langchain_core.messages import AIMessage\n", + "\n", + "messages = [\n", + " (\n", + " \"system\",\n", + " \"You are a helpful assistant that translates English to French. 
Translate the user sentence.\",\n", + " ),\n", + " (\"human\", \"I love programming.\"),\n", + "]\n", + "ai_msg = llm.invoke(messages)\n", + "ai_msg" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "d86145b3-bfef-46e8-b227-4dda5c9c2705", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The translation of \"I love programming\" from English to French is:\n", + "\n", + "\"J'adore programmer.\"\n" + ] + } + ], + "source": [ + "print(ai_msg.content)" + ] + }, + { + "cell_type": "markdown", + "id": "18e2bfc0-7e78-4528-a73f-499ac150dca8", + "metadata": {}, + "source": [ + "## Chaining\n", + "\n", + "We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "e197d1d7-a070-4c96-9f8a-a0e86d046e0b", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "AIMessage(content='Das Programmieren ist mir ein Leidenschaft! (That\\'s \"Programming is my passion!\" in German.) Would you like me to translate anything else?', response_metadata={'model': 'llama3.1', 'created_at': '2024-08-19T16:05:34.893548Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 2045997333, 'load_duration': 22584792, 'prompt_eval_count': 30, 'prompt_eval_duration': 213210000, 'eval_count': 32, 'eval_duration': 1808541000}, id='run-d18e1c6b-50e0-4b1d-b23a-973fa058edad-0', usage_metadata={'input_tokens': 30, 'output_tokens': 32, 'total_tokens': 62})" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from langchain_core.prompts import ChatPromptTemplate\n", + "\n", + "prompt = ChatPromptTemplate.from_messages(\n", + " [\n", + " (\n", + " \"system\",\n", + " \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n", + " ),\n", + " (\"human\", \"{input}\"),\n", + " ]\n", + ")\n", + "\n", + "chain = prompt | llm\n", + "chain.invoke(\n", + " {\n", + " \"input_language\": \"English\",\n", + " \"output_language\": \"German\",\n", + " \"input\": \"I love programming.\",\n", + " }\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "0f51345d-0a9d-43f1-8fca-d0662cb8e21b", + "metadata": {}, + "source": [ + "## Tool calling\n", + "\n", + "We can use [tool calling](https://blog.langchain.dev/improving-core-tool-interfaces-and-docs-in-langchain/) with an LLM [that has been fine-tuned for tool use](https://ollama.com/library/llama3.1):\n", + "\n", + "```\n", + "ollama pull llama3.1\n", + "```\n", + "\n", + "Details on creating custom tools are available in [this guide](/docs/how_to/custom_tools/). Below, we demonstrate how to create a tool using the `@tool` decorator on a normal python function." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "f767015f", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[{'name': 'validate_user',\n", + " 'args': {'addresses': '[\"123 Fake St, Boston, MA\", \"234 Pretend Boulevard, Houston, TX\"]',\n", + " 'user_id': '123'},\n", + " 'id': '40fe3de0-500c-4b91-9616-5932a929e640',\n", + " 'type': 'tool_call'}]" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from typing import List\n", + "\n", + "from langchain_core.tools import tool\n", + "from langchain_ollama import ChatOllama\n", + "\n", + "\n", + "@tool\n", + "def validate_user(user_id: int, addresses: List[str]) -> bool:\n", + " \"\"\"Validate user using historical addresses.\n", + "\n", + " Args:\n", + " user_id (int): the user ID.\n", + " addresses (List[str]): Previous addresses as a list of strings.\n", + " \"\"\"\n", + " return True\n", + "\n", + "\n", + "llm = ChatOllama(\n", + " model=\"llama3.1\",\n", + " temperature=0,\n", + ").bind_tools([validate_user])\n", + "\n", + "result = llm.invoke(\n", + " \"Could you validate user 123? They previously lived at \"\n", + " \"123 Fake St in Boston MA and 234 Pretend Boulevard in \"\n", + " \"Houston TX.\"\n", + ")\n", + "result.tool_calls" + ] + }, + { + "cell_type": "markdown", + "id": "4c5e0197", + "metadata": {}, + "source": [ + "## Multi-modal\n", + "\n", + "Ollama has support for multi-modal LLMs, such as [bakllava](https://ollama.com/library/bakllava) and [llava](https://ollama.com/library/llava).\n", + "\n", + " ollama pull bakllava\n", + "\n", + "Be sure to update Ollama so that you have the most recent version to support multi-modal." + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "id": "36c9b1c2", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "import base64\n", + "from io import BytesIO\n", + "\n", + "from IPython.display import HTML, display\n", + "from PIL import Image\n", + "\n", + "\n", + "def convert_to_base64(pil_image):\n", + " \"\"\"\n", + " Convert PIL images to Base64 encoded strings\n", + "\n", + " :param pil_image: PIL image\n", + " :return: Re-sized Base64 string\n", + " \"\"\"\n", + "\n", + " buffered = BytesIO()\n", + " pil_image.save(buffered, format=\"JPEG\") # You can change the format if needed\n", + " img_str = base64.b64encode(buffered.getvalue()).decode(\"utf-8\")\n", + " return img_str\n", + "\n", + "\n", + "def plt_img_base64(img_base64):\n", + " \"\"\"\n", + " Disply base64 encoded string as image\n", + "\n", + " :param img_base64: Base64 string\n", + " \"\"\"\n", + " # Create an HTML img tag with the base64 string as the source\n", + " image_html = f''\n", + " # Display the image by rendering the HTML\n", + " display(HTML(image_html))\n", + "\n", + "\n", + "file_path = \"../../../static/img/ollama_example_img.jpg\"\n", + "pil_image = Image.open(file_path)\n", + "\n", + "image_b64 = convert_to_base64(pil_image)\n", + "plt_img_base64(image_b64)" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "id": "32b3ba7b", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "90%\n" + ] + } + ], + "source": [ + "from langchain_core.messages import HumanMessage\n", + "from langchain_ollama import ChatOllama\n", + "\n", + "llm = ChatOllama(model=\"bakllava\", temperature=0)\n", + "\n", + "\n", + "def 
prompt_func(data):\n", + " text = data[\"text\"]\n", + " image = data[\"image\"]\n", + "\n", + " image_part = {\n", + " \"type\": \"image_url\",\n", + " \"image_url\": f\"data:image/jpeg;base64,{image}\",\n", + " }\n", + "\n", + " content_parts = []\n", + "\n", + " text_part = {\"type\": \"text\", \"text\": text}\n", + "\n", + " content_parts.append(image_part)\n", + " content_parts.append(text_part)\n", + "\n", + " return [HumanMessage(content=content_parts)]\n", + "\n", + "\n", + "from langchain_core.output_parsers import StrOutputParser\n", + "\n", + "chain = prompt_func | llm | StrOutputParser()\n", + "\n", + "query_chain = chain.invoke(\n", + " {\"text\": \"What is the Dollar-based gross retention rate?\", \"image\": image_b64}\n", + ")\n", + "\n", + "print(query_chain)" + ] + }, + { + "cell_type": "markdown", + "id": "fb6a331f-1507-411f-89e5-c4d598154f3c", + "metadata": {}, + "source": [ + "## Reasoning models and custom message roles\n", + "\n", + "Some models, such as IBM's [Granite 3.2](https://ollama.com/library/granite3.2), support custom message roles to enable thinking processes.\n", + "\n", + "To access Granite 3.2's thinking features, pass a message with a `\"control\"` role with content set to `\"thinking\"`. Because `\"control\"` is a non-standard message role, we can use a [ChatMessage](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.chat.ChatMessage.html) object to implement it:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "d7309fa7-990e-4c20-b1f0-b155624ecf37", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Here is my thought process:\n", + "This question is asking for the result of 3 raised to the power of 3, which is a basic mathematical operation. \n", + "\n", + "Here is my response:\n", + "The expression 3^3 means 3 raised to the power of 3. To calculate this, you multiply the base number (3) by itself as many times as its exponent (3):\n", + "\n", + "3 * 3 * 3 = 27\n", + "\n", + "So, 3^3 equals 27.\n" + ] + } + ], + "source": [ + "from langchain_core.messages import ChatMessage, HumanMessage\n", + "from langchain_ollama import ChatOllama\n", + "\n", + "llm = ChatOllama(model=\"granite3.2:8b\")\n", + "\n", + "messages = [\n", + " ChatMessage(role=\"control\", content=\"thinking\"),\n", + " HumanMessage(\"What is 3^3?\"),\n", + "]\n", + "\n", + "response = llm.invoke(messages)\n", + "print(response.content)" + ] + }, + { + "cell_type": "markdown", + "id": "6271d032-da40-44d4-9b52-58370e164be3", + "metadata": {}, + "source": [ + "Note that the model exposes its thought process in addition to its final response." 
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",
+   "metadata": {},
+   "source": [
+    "## API reference\n",
+    "\n",
+    "For detailed documentation of all ChatOllama features and configurations head to the API reference: https://python.langchain.com/api_reference/ollama/chat_models/langchain_ollama.chat_models.ChatOllama.html"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
 }
diff --git a/libs/partners/ollama/langchain_ollama/chat_models.py b/libs/partners/ollama/langchain_ollama/chat_models.py
index a29ac403588..a7046ef79ca 100644
--- a/libs/partners/ollama/langchain_ollama/chat_models.py
+++ b/libs/partners/ollama/langchain_ollama/chat_models.py
@@ -26,6 +26,7 @@ from langchain_core.messages import (
     AIMessageChunk,
     BaseMessage,
     BaseMessageChunk,
+    ChatMessage,
     HumanMessage,
     SystemMessage,
     ToolCall,
@@ -511,7 +512,7 @@ class ChatOllama(BaseChatModel):
     ) -> Sequence[Message]:
         ollama_messages: list = []
         for message in messages:
-            role: Literal["user", "assistant", "system", "tool"]
+            role: str
             tool_call_id: Optional[str] = None
             tool_calls: Optional[list[dict[str, Any]]] = None
             if isinstance(message, HumanMessage):
@@ -528,6 +529,8 @@ class ChatOllama(BaseChatModel):
                 )
             elif isinstance(message, SystemMessage):
                 role = "system"
+            elif isinstance(message, ChatMessage):
+                role = message.role
             elif isinstance(message, ToolMessage):
                 role = "tool"
                 tool_call_id = message.tool_call_id
diff --git a/libs/partners/ollama/pyproject.toml b/libs/partners/ollama/pyproject.toml
index 99d3a4a1f1e..130bc130214 100644
--- a/libs/partners/ollama/pyproject.toml
+++ b/libs/partners/ollama/pyproject.toml
@@ -6,7 +6,10 @@ build-backend = "pdm.backend"
 authors = []
 license = { text = "MIT" }
 requires-python = "<4.0,>=3.9"
-dependencies = ["ollama<1,>=0.4.4", "langchain-core<1.0.0,>=0.3.52"]
+dependencies = [
+    "ollama>=0.4.8,<1.0.0",
+    "langchain-core<1.0.0,>=0.3.52",
+]
 name = "langchain-ollama"
 version = "0.3.2"
 description = "An integration package connecting Ollama and LangChain"
diff --git a/libs/partners/ollama/tests/unit_tests/test_chat_models.py b/libs/partners/ollama/tests/unit_tests/test_chat_models.py
index 95d32b180c9..ff552fef5d0 100644
--- a/libs/partners/ollama/tests/unit_tests/test_chat_models.py
+++ b/libs/partners/ollama/tests/unit_tests/test_chat_models.py
@@ -1,7 +1,13 @@
 """Test chat model integration."""
 
 import json
+from collections.abc import Generator
+from contextlib import contextmanager
+from typing import Any
 
+import pytest
+from httpx import Client, Request, Response
+from langchain_core.messages import ChatMessage
 from langchain_tests.unit_tests import ChatModelUnitTests
 
 from langchain_ollama.chat_models import ChatOllama, _parse_arguments_from_tool_call
@@ -23,3 +29,38 @@ def test__parse_arguments_from_tool_call() -> None:
     response = _parse_arguments_from_tool_call(raw_tool_calls[0])
     assert response is not None
     assert isinstance(response["arg_1"], str)
+
+
+@contextmanager
+def _mock_httpx_client_stream(
+    *args: Any, **kwargs: Any
+) -> Generator[Response, Any, Any]:
+    yield Response(
+        status_code=200,
+        content='{"message": {"role": "assistant", "content": "The meaning 
..."}}', + request=Request(method="POST", url="http://whocares:11434"), + ) + + +def test_arbitrary_roles_accepted_in_chatmessages( + monkeypatch: pytest.MonkeyPatch, +) -> None: + monkeypatch.setattr(Client, "stream", _mock_httpx_client_stream) + + llm = ChatOllama( + base_url="http://whocares:11434", + model="granite3.2", + verbose=True, + format=None, + ) + + messages = [ + ChatMessage( + role="somerandomrole", + content="I'm ok with you adding any role message now!", + ), + ChatMessage(role="control", content="thinking"), + ChatMessage(role="user", content="What is the meaning of life?"), + ] + + llm.invoke(messages) diff --git a/libs/partners/ollama/uv.lock b/libs/partners/ollama/uv.lock index 895083899a4..43fa3584938 100644 --- a/libs/partners/ollama/uv.lock +++ b/libs/partners/ollama/uv.lock @@ -288,7 +288,7 @@ wheels = [ [[package]] name = "langchain-core" -version = "0.3.52" +version = "0.3.54" source = { editable = "../../core" } dependencies = [ { name = "jsonpatch" }, @@ -381,7 +381,7 @@ typing = [ [package.metadata] requires-dist = [ { name = "langchain-core", editable = "../../core" }, - { name = "ollama", specifier = ">=0.4.4,<1" }, + { name = "ollama", specifier = ">=0.4.8,<1.0.0" }, ] [package.metadata.requires-dev] @@ -405,7 +405,7 @@ typing = [ [[package]] name = "langchain-tests" -version = "0.3.18" +version = "0.3.19" source = { editable = "../../standard-tests" } dependencies = [ { name = "httpx" }, @@ -625,15 +625,15 @@ wheels = [ [[package]] name = "ollama" -version = "0.4.7" +version = "0.4.8" source = { registry = "https://pypi.org/simple" } dependencies = [ { name = "httpx" }, { name = "pydantic" }, ] -sdist = { url = "https://files.pythonhosted.org/packages/b0/6d/dc77539c735bbed5d0c873fb029fb86aa9f0163df169b34152914331c369/ollama-0.4.7.tar.gz", hash = "sha256:891dcbe54f55397d82d289c459de0ea897e103b86a3f1fad0fdb1895922a75ff", size = 12843 } +sdist = { url = "https://files.pythonhosted.org/packages/e2/64/709dc99030f8f46ec552f0a7da73bbdcc2da58666abfec4742ccdb2e800e/ollama-0.4.8.tar.gz", hash = "sha256:1121439d49b96fa8339842965d0616eba5deb9f8c790786cdf4c0b3df4833802", size = 12972 } wheels = [ - { url = "https://files.pythonhosted.org/packages/31/83/c3ffac86906c10184c88c2e916460806b072a2cfe34cdcaf3a0c0e836d39/ollama-0.4.7-py3-none-any.whl", hash = "sha256:85505663cca67a83707be5fb3aeff0ea72e67846cea5985529d8eca4366564a1", size = 13210 }, + { url = "https://files.pythonhosted.org/packages/33/3f/164de150e983b3a16e8bf3d4355625e51a357e7b3b1deebe9cc1f7cb9af8/ollama-0.4.8-py3-none-any.whl", hash = "sha256:04312af2c5e72449aaebac4a2776f52ef010877c554103419d3f36066fe8af4c", size = 13325 }, ] [[package]]