From 40b4a3de6e03e4e8ebe35bc2aa8cfcb82000a68a Mon Sep 17 00:00:00 2001
From: ccurme
Date: Wed, 31 Jul 2024 11:26:52 -0400
Subject: [PATCH] docs: update chat model integration pages (#24882) to conform with template
---
 docs/docs/integrations/chat/ai21.ipynb        | 196 +++++++++++---
 .../integrations/chat/azure_chat_openai.ipynb |  36 +--
 docs/docs/integrations/chat/huggingface.ipynb | 113 +++++++-
 docs/docs/integrations/chat/mistralai.ipynb   | 245 +++++++++++-------
 .../chat/nvidia_ai_endpoints.ipynb            | 199 ++++++++++----
 docs/docs/integrations/chat/vllm.ipynb        | 183 ++++++++-----
 6 files changed, 702 insertions(+), 270 deletions(-)

diff --git a/docs/docs/integrations/chat/ai21.ipynb b/docs/docs/integrations/chat/ai21.ipynb
index 4a0d0f86e3d..e26c4901f0f 100644
--- a/docs/docs/integrations/chat/ai21.ipynb
+++ b/docs/docs/integrations/chat/ai21.ipynb
@@ -17,26 +17,25 @@
   "source": [
    "# ChatAI21\n",
    "\n",
+   "## Overview\n",
+   "\n",
    "This notebook covers how to get started with AI21 chat models.\n",
-   "Note that different chat models support different parameters. See the ",
-   "[AI21 documentation](https://docs.ai21.com/reference) to learn more about the parameters in your chosen model.\n",
+   "Note that different chat models support different parameters. See the [AI21 documentation](https://docs.ai21.com/reference) to learn more about the parameters in your chosen model.\n",
    "[See all AI21's LangChain components.](https://pypi.org/project/langchain-ai21/) \n",
-   "## Installation"
-  ]
- },
- {
-  "cell_type": "code",
-  "execution_count": null,
-  "id": "4c3bef91",
-  "metadata": {
-   "ExecuteTime": {
-    "end_time": "2024-02-15T06:50:44.929635Z",
-    "start_time": "2024-02-15T06:50:41.209704Z"
-   }
-  },
-  "outputs": [],
-  "source": [
-   "!pip install -qU langchain-ai21"
+   "\n",
+   "### Integration details\n",
+   "\n",
+   "| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/v0.2/docs/integrations/chat/ai21) | Package downloads | Package latest |\n",
+   "| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
+   "| [ChatAI21](https://api.python.langchain.com/en/latest/chat_models/langchain_ai21.chat_models.ChatAI21.html#langchain_ai21.chat_models.ChatAI21) | [langchain-ai21](https://api.python.langchain.com/en/latest/ai21_api_reference.html) | ❌ | beta | ✅ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-ai21?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-ai21?style=flat-square&label=%20) |\n",
+   "\n",
+   "### Model features\n",
+   "| [Tool calling](/docs/how_to/tool_calling) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",
+   "| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n",
+   "| ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | \n",
+   "\n",
+   "\n",
+   "## Setup"
   ]
  },
  {
@@ -44,10 +43,9 @@
   "id": "2b4f3e15",
   "metadata": {},
   "source": [
-   "## Environment Setup\n",
+   "### Credentials\n",
    "\n",
-   "We'll need to get an [AI21 API key](https://docs.ai21.com/) and set the ",
-   "`AI21_API_KEY` environment variable:\n"
+   "We'll need to get an [AI21 API key](https://docs.ai21.com/) and set the `AI21_API_KEY` environment variable:\n"
   ]
  },
  {
@@ -67,48 +65,166 @@
  },
  {
   "cell_type": "markdown",
-  "id": "4828829d3da430ce",
-  "metadata": {
-   
"collapsed": false - }, + "id": "f6844fff-3702-4489-ab74-732f69f3b9d7", + "metadata": {}, "source": [ - "## Usage" + "If you want to get automated tracing of your model calls you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:" ] }, { "cell_type": "code", - "execution_count": 1, - "id": "39353473fce5dd2e", + "execution_count": null, + "id": "7c2e19d3-7c58-4470-9e1a-718b27a32056", + "metadata": {}, + "outputs": [], + "source": [ + "# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n", + "# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "98e22f31-8acc-42d6-916d-415d1263c56e", + "metadata": {}, + "source": [ + "### Installation" + ] + }, + { + "cell_type": "markdown", + "id": "f9699cd9-58f2-450e-aa64-799e66906c0f", + "metadata": {}, + "source": [ + "!pip install -qU langchain-ai21" + ] + }, + { + "cell_type": "markdown", + "id": "4828829d3da430ce", "metadata": { - "collapsed": false + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } }, + "source": [ + "## Instantiation\n", + "\n", + "Now we can instantiate our model object and generate chat completions:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "c40756fb-cbf8-4d44-a293-3989d707237e", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_ai21 import ChatAI21\n", + "\n", + "llm = ChatAI21(model=\"jamba-instruct\", temperature=0)" + ] + }, + { + "cell_type": "markdown", + "id": "2bdc5d68-2a19-495e-8c04-d11adc86d3ae", + "metadata": {}, + "source": [ + "## Invocation" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "46b982dc-5d8a-46da-a711-81c03ccd6adc", + "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "AIMessage(content='Bonjour, comment vas-tu?')" + "AIMessage(content=\"J'adore programmer.\", id='run-2e8d16d6-a06e-45cb-8d0c-1c8208645033-0')" ] }, - "execution_count": 1, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "messages = [\n", + " (\n", + " \"system\",\n", + " \"You are a helpful assistant that translates English to French. Translate the user sentence.\",\n", + " ),\n", + " (\"human\", \"I love programming.\"),\n", + "]\n", + "ai_msg = llm.invoke(messages)\n", + "ai_msg" + ] + }, + { + "cell_type": "markdown", + "id": "10a30f84-b531-4fd5-8b5b-91512fbdc75b", + "metadata": {}, + "source": [ + "## Chaining\n", + "\n", + "We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "39353473fce5dd2e", + "metadata": { + "collapsed": false, + "jupyter": { + "outputs_hidden": false + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "AIMessage(content='Ich liebe das Programmieren.', id='run-e1bd82dc-1a7e-4b2e-bde9-ac995929ac0f-0')" + ] + }, + "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "from langchain_ai21 import ChatAI21\n", "from langchain_core.prompts import ChatPromptTemplate\n", "\n", - "chat = ChatAI21(model=\"jamba-instruct\")\n", - "\n", - "prompt = ChatPromptTemplate.from_messages(\n", + "prompt = ChatPromptTemplate(\n", " [\n", - " (\"system\", \"You are a helpful assistant that translates English to French.\"),\n", - " (\"human\", \"Translate this sentence from English to French. 
{english_text}.\"),\n", + " (\n", + " \"system\",\n", + " \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n", + " ),\n", + " (\"human\", \"{input}\"),\n", " ]\n", ")\n", "\n", - "chain = prompt | chat\n", - "chain.invoke({\"english_text\": \"Hello, how are you?\"})" + "chain = prompt | llm\n", + "chain.invoke(\n", + " {\n", + " \"input_language\": \"English\",\n", + " \"output_language\": \"German\",\n", + " \"input\": \"I love programming.\",\n", + " }\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "e79de691-9dd6-4697-b57e-59a4a3cc073a", + "metadata": {}, + "source": [ + "## API reference\n", + "\n", + "For detailed documentation of all ChatAI21 features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/langchain_ai21.chat_models.ChatAI21.html" ] } ], @@ -128,7 +244,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.11.4" + "version": "3.10.4" } }, "nbformat": 4, diff --git a/docs/docs/integrations/chat/azure_chat_openai.ipynb b/docs/docs/integrations/chat/azure_chat_openai.ipynb index 83c3f7d2fd1..36282ad303b 100644 --- a/docs/docs/integrations/chat/azure_chat_openai.ipynb +++ b/docs/docs/integrations/chat/azure_chat_openai.ipynb @@ -115,7 +115,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 2, "id": "cb09c344-1836-4e0c-acf8-11d13ac1dbae", "metadata": {}, "outputs": [], @@ -123,8 +123,8 @@ "from langchain_openai import AzureChatOpenAI\n", "\n", "llm = AzureChatOpenAI(\n", - " azure_deployment=\"YOUR-DEPLOYMENT\",\n", - " api_version=\"2024-05-01-preview\",\n", + " azure_deployment=\"gpt-35-turbo\", # or your deployment\n", + " api_version=\"2023-06-01-preview\", # or your api version\n", " temperature=0,\n", " max_tokens=None,\n", " timeout=None,\n", @@ -143,7 +143,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 3, "id": "62e0dbc3", "metadata": { "tags": [] @@ -152,10 +152,10 @@ { "data": { "text/plain": [ - "AIMessage(content=\"J'adore la programmation.\", response_metadata={'token_usage': {'completion_tokens': 8, 'prompt_tokens': 31, 'total_tokens': 39}, 'model_name': 'gpt-35-turbo', 'system_fingerprint': None, 'prompt_filter_results': [{'prompt_index': 0, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}], 'finish_reason': 'stop', 'logprobs': None, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}, id='run-a6a732c2-cb02-4e50-9a9c-ab30eab034fc-0', usage_metadata={'input_tokens': 31, 'output_tokens': 8, 'total_tokens': 39})" + "AIMessage(content=\"J'adore la programmation.\", response_metadata={'token_usage': {'completion_tokens': 8, 'prompt_tokens': 31, 'total_tokens': 39}, 'model_name': 'gpt-35-turbo', 'system_fingerprint': None, 'prompt_filter_results': [{'prompt_index': 0, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}], 'finish_reason': 'stop', 'logprobs': None, 'content_filter_results': {'hate': {'filtered': False, 'severity': 
'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}, id='run-bea4b46c-e3e1-4495-9d3a-698370ad963d-0', usage_metadata={'input_tokens': 31, 'output_tokens': 8, 'total_tokens': 39})" ] }, - "execution_count": 4, + "execution_count": 3, "metadata": {}, "output_type": "execute_result" } @@ -174,7 +174,7 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 4, "id": "d86145b3-bfef-46e8-b227-4dda5c9c2705", "metadata": {}, "outputs": [ @@ -202,17 +202,17 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 5, "id": "e197d1d7-a070-4c96-9f8a-a0e86d046e0b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "AIMessage(content='Ich liebe das Programmieren.', response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 26, 'total_tokens': 32}, 'model_name': 'gpt-35-turbo', 'system_fingerprint': None, 'prompt_filter_results': [{'prompt_index': 0, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}], 'finish_reason': 'stop', 'logprobs': None, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}, id='run-084967d7-06f2-441f-b5c1-477e2a9e9d03-0', usage_metadata={'input_tokens': 26, 'output_tokens': 6, 'total_tokens': 32})" + "AIMessage(content='Ich liebe das Programmieren.', response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 26, 'total_tokens': 32}, 'model_name': 'gpt-35-turbo', 'system_fingerprint': None, 'prompt_filter_results': [{'prompt_index': 0, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}], 'finish_reason': 'stop', 'logprobs': None, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}, id='run-cbc44038-09d3-40d4-9da2-c5910ee636ca-0', usage_metadata={'input_tokens': 26, 'output_tokens': 6, 'total_tokens': 32})" ] }, - "execution_count": 12, + "execution_count": 5, "metadata": {}, "output_type": "execute_result" } @@ -264,8 +264,8 @@ }, { "cell_type": "code", - "execution_count": 5, - "id": "84c411b0-1790-4798-8bb7-47d8ece4c2dc", + "execution_count": 6, + "id": "2ca02d23-60d0-43eb-8d04-070f61f8fefd", "metadata": {}, "outputs": [ { @@ -288,22 +288,22 @@ }, { "cell_type": "code", - "execution_count": 6, - "id": "21234693-d92b-4d69-8a7f-55aa062084bf", + "execution_count": 7, + "id": "e1b07ae2-3de7-44bd-bfdc-b76f4ba45a35", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "Total Cost (USD): $0.000078\n" + "Total Cost (USD): $0.000074\n" ] } ], "source": [ "llm_0301 = AzureChatOpenAI(\n", - " azure_deployment=\"YOUR-DEPLOYMENT\",\n", - " api_version=\"2024-05-01-preview\",\n", + " azure_deployment=\"gpt-35-turbo\", # or your deployment\n", + " api_version=\"2023-06-01-preview\", # or your api version\n", " model_version=\"0301\",\n", ")\n", "with 
get_openai_callback() as cb:\n", @@ -338,7 +338,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.11.9" + "version": "3.10.4" } }, "nbformat": 4, diff --git a/docs/docs/integrations/chat/huggingface.ipynb b/docs/docs/integrations/chat/huggingface.ipynb index bda8c511132..5fb4d7df872 100644 --- a/docs/docs/integrations/chat/huggingface.ipynb +++ b/docs/docs/integrations/chat/huggingface.ipynb @@ -4,18 +4,67 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Hugging Face\n", + "---\n", + "sidebar_label: Hugging Face\n", + "---" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# ChatHuggingFace\n", "\n", - "This notebook shows how to get started using `Hugging Face` LLM's as chat models.\n", + "## Overview\n", + "\n", + "This notebook shows how to get started using Hugging Face LLMs as chat models.\n", "\n", "In particular, we will:\n", - "1. Utilize the [HuggingFaceEndpoint](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/llms/huggingface_endpoint.py) integrations to instantiate an `LLM`.\n", + "1. Utilize the [HuggingFaceEndpoint](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/llms/huggingface_endpoint.py) integrations to instantiate an LLM.\n", "2. Utilize the `ChatHuggingFace` class to enable any of these LLMs to interface with LangChain's [Chat Messages](/docs/concepts/#message-types) abstraction.\n", "3. Explore tool calling with the `ChatHuggingFace`.\n", "4. Demonstrate how to use an open-source LLM to power an `ChatAgent` pipeline\n", "\n", + "### Integration details\n", "\n", - "> Note: To get started, you'll need to have a [Hugging Face Access Token](https://huggingface.co/docs/hub/security-tokens) saved as an environment variable: `HUGGINGFACEHUB_API_TOKEN`." + "| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |\n", + "| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n", + "| [ChatHuggingFace](https://api.python.langchain.com/en/latest/chat_models/langchain_huggingface.chat_models.huggingface.ChatHuggingFace.html) | [langchain-huggingface](https://api.python.langchain.com/en/latest/huggingface_api_reference.html) | ✅ | beta | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain_huggingface?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain_huggingface?style=flat-square&label=%20) |\n", + "\n", + "### Model features\n", + "| [Tool calling](/docs/how_to/tool_calling) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n", + "| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n", + "| ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | \n", + "\n", + "## Setup\n", + "\n", + "To access Hugging Face models you'll need to create a Hugging Face account, get an API key, and install the `langchain-huggingface` integration package.\n", + "\n", + "### Credentials\n", + "\n", + "Generate a [Hugging Face Access Token](https://huggingface.co/docs/hub/security-tokens) and store it as an environment variable: `HUGGINGFACEHUB_API_TOKEN`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import getpass\n", + "import os\n", + "\n", + "if not os.getenv(\"HUGGINGFACEHUB_API_TOKEN\"):\n", + " os.environ[\"HUGGINGFACEHUB_API_TOKEN\"] = getpass.getpass(\"Enter your token: \")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Installation\n", + "\n", + "Below we install additional packages as well for demonstration purposes:" ] }, { @@ -31,7 +80,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## 1. Instantiate an LLM" + "## Instantiation" ] }, { @@ -118,7 +167,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## 2. Instantiate the `ChatHuggingFace` to apply chat templates" + "## Invocation" ] }, { @@ -249,7 +298,44 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## 3. Explore the tool calling with `ChatHuggingFace`\n", + "## Chaining\n", + "\n", + "We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_core.prompts import ChatPromptTemplate\n", + "\n", + "prompt = ChatPromptTemplate(\n", + " [\n", + " (\n", + " \"system\",\n", + " \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n", + " ),\n", + " (\"human\", \"{input}\"),\n", + " ]\n", + ")\n", + "\n", + "chain = prompt | llm\n", + "chain.invoke(\n", + " {\n", + " \"input_language\": \"English\",\n", + " \"output_language\": \"German\",\n", + " \"input\": \"I love programming.\",\n", + " }\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Tool calling with `ChatHuggingFace`\n", "\n", "`text-generation-inference` supports tool with open source LLMs starting from v2.0.1" ] @@ -313,7 +399,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## 4. Take it for a spin as an agent!\n", + "## Use with agents\n", "\n", "Here we'll test out `Zephyr-7B-beta` as a zero-shot `ReAct` Agent. \n", "\n", @@ -458,6 +544,15 @@ "\n", "It's exciting to see how far open-source LLM's can go as general purpose reasoning agents. Give it a try yourself!" ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## API reference\n", + "\n", + "For detailed documentation of all ChatHuggingFace features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/langchain_huggingface.chat_models.huggingface.ChatHuggingFace.html" + ] } ], "metadata": { @@ -476,7 +571,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.12" + "version": "3.10.4" } }, "nbformat": 4, diff --git a/docs/docs/integrations/chat/mistralai.ipynb b/docs/docs/integrations/chat/mistralai.ipynb index f2e3b1c6a75..299f49e623f 100644 --- a/docs/docs/integrations/chat/mistralai.ipynb +++ b/docs/docs/integrations/chat/mistralai.ipynb @@ -12,43 +12,87 @@ }, { "cell_type": "markdown", - "id": "bf733a38-db84-4363-89e2-de6735c37230", + "id": "a14c83bf-af26-4f22-8c1a-d632c5795ecf", "metadata": {}, "source": [ "# MistralAI\n", "\n", - "This notebook covers how to get started with MistralAI chat models, via their [API](https://docs.mistral.ai/api/).\n", + "This will help you getting started with Mistral [chat models](/docs/concepts/#chat-models), accessed via their [API](https://docs.mistral.ai/api/). 
For detailed documentation of all ChatMistralAI features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/langchain_mistralai.chat_models.ChatMistralAI.html).\n", "\n", - "A valid [API key](https://console.mistral.ai/users/api-keys/) is needed to communicate with the API.\n", + "## Overview\n", + "### Integration details\n", + "\n", + "| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/v0.2/docs/integrations/chat/mistral) | Package downloads | Package latest |\n", + "| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n", + "| [ChatMistralAI](https://api.python.langchain.com/en/latest/chat_models/langchain_mistralai.chat_models.ChatMistralAI.html) | [langchain_mistralai](https://api.python.langchain.com/en/latest/mistralai_api_reference.html) | ❌ | beta | ✅ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain_mistralai?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain_mistralai?style=flat-square&label=%20) |\n", + "\n", + "### Model features\n", + "| [Tool calling](/docs/how_to/tool_calling) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n", + "| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n", + "| ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | \n", "\n", - "Head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/langchain_mistralai.chat_models.ChatMistralAI.html) for detailed documentation of all attributes and methods." - ] - }, - { - "cell_type": "markdown", - "id": "cc686b8f", - "metadata": {}, - "source": [ "## Setup\n", "\n", - "You will need the `langchain-core` and `langchain-mistralai` package to use the API. You can install these with:\n", + "To access Mistral models you'll need to create a Mistral account, get an API key, and install the `langchain-mistralai` integration package.\n", "\n", - "```bash\n", - "pip install -U langchain-core langchain-mistralai\n", + "### Credentials\n", "\n", - "We'll also need to get a [Mistral API key](https://console.mistral.ai/users/api-keys/)" + "A valid [API key](https://console.mistral.ai/users/api-keys/) is needed to communicate with the API. 
Once you've obtained an API key, store it in the `MISTRAL_API_KEY` environment variable:"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 7,
-   "id": "c3fd4184",
+   "execution_count": null,
+   "id": "9acd8340-09d4-4ece-871a-a35b0732c7d8",
   "metadata": {},
   "outputs": [],
   "source": [
    "import getpass\n",
+    "import os\n",
    "\n",
-    "api_key = getpass.getpass()"
+    "if not os.getenv(\"MISTRAL_API_KEY\"):\n",
+    "    os.environ[\"MISTRAL_API_KEY\"] = getpass.getpass(\n",
+    "        \"Enter your Mistral API key: \"\n",
+    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "42c979b1-df49-4f6c-9fe6-d9dbf3ea8c2a",
   "metadata": {},
   "source": [
    "If you want to get automated tracing of your model calls you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cc4f11ec-5cb3-4caf-b3cd-7a20c41b0cfe",
   "metadata": {},
   "outputs": [],
   "source": [
    "# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
    "# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0fc42221-97b2-466b-95db-10368e17ca56",
   "metadata": {},
   "source": [
    "### Installation\n",
    "\n",
    "The LangChain MistralAI integration lives in the `langchain-mistralai` package:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "85cb1ab8-9f2c-4b93-8415-ad65819dcb38",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install -qU langchain-mistralai"
   ]
  },
  {
@@ -56,57 +100,76 @@
   "cell_type": "markdown",
   "id": "502127fd",
   "metadata": {},
   "source": [
-    "## Usage"
+    "## Instantiation\n",
+    "\n",
+    "Now we can instantiate our model object and generate chat completions:"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
-   "id": "d4a7c55d-b235-4ca4-a579-c90cc9570da9",
-   "metadata": {
-    "tags": []
-   },
+   "execution_count": 1,
+   "id": "2dfa801a-d040-4c09-9634-58604e8eaf16",
+   "metadata": {},
   "outputs": [],
   "source": [
-    "from langchain_core.messages import HumanMessage\n",
-    "from langchain_mistralai.chat_models import ChatMistralAI"
+    "from langchain_mistralai.chat_models import ChatMistralAI\n",
+    "\n",
+    "llm = ChatMistralAI(model=\"mistral-large-latest\")"
   ]
  },
  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "id": "70cf04e8-423a-4ff6-8b09-f11fb711c817",
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [],
+   "cell_type": "markdown",
+   "id": "f668acff-eb14-4b3a-959a-df5bfc02968b",
+   "metadata": {},
   "source": [
-    "# If api_key is not passed, default behavior is to use the `MISTRAL_API_KEY` environment variable.\n",
-    "chat = ChatMistralAI(api_key=api_key)"
+    "## Invocation"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 9,
-   "id": "8199ef8f-eb8b-4253-9ea0-6c24a013ca4c",
-   "metadata": {
-    "tags": []
-   },
+   "execution_count": 2,
+   "id": "86e3f9e6-67ec-4fbf-8ff1-85331200f412",
+   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "AIMessage(content=\"Who's there? I was just about to ask the same thing! 
How can I assist you today?\")" + "AIMessage(content=\"J'adore la programmation.\", response_metadata={'token_usage': {'prompt_tokens': 27, 'total_tokens': 36, 'completion_tokens': 9}, 'model': 'mistral-large-latest', 'finish_reason': 'stop'}, id='run-d6196c33-9410-413b-b454-4ed0bec1f0c7-0', usage_metadata={'input_tokens': 27, 'output_tokens': 9, 'total_tokens': 36})" ] }, - "execution_count": 9, + "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "messages = [HumanMessage(content=\"knock knock\")]\n", - "chat.invoke(messages)" + "messages = [\n", + " (\n", + " \"system\",\n", + " \"You are a helpful assistant that translates English to French. Translate the user sentence.\",\n", + " ),\n", + " (\"human\", \"I love programming.\"),\n", + "]\n", + "ai_msg = llm.invoke(messages)\n", + "ai_msg" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "8f8a24bc-b7f0-4d3a-b310-8a4e0ba125dd", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "J'adore la programmation.\n" + ] + } + ], + "source": [ + "print(ai_msg.content)" ] }, { @@ -119,7 +182,7 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 4, "id": "c5fac0e9-05a4-4fc1-a3b3-e5bbb24b971b", "metadata": { "tags": [] @@ -128,16 +191,16 @@ { "data": { "text/plain": [ - "AIMessage(content='Who\\'s there?\\n\\n(You can then continue the \"knock knock\" joke by saying the name of the person or character who should be responding. For example, if I say \"Banana,\" you could respond with \"Banana who?\" and I would say \"Banana bunch! Get it? Because a group of bananas is called a \\'bunch\\'!\" and then we would both laugh and have a great time. But really, you can put anything you want in the spot where I put \"Banana\" and it will still technically be a \"knock knock\" joke. The possibilities are endless!)')" + "AIMessage(content=\"J'aime programmer.\", response_metadata={'token_usage': {'prompt_tokens': 27, 'total_tokens': 34, 'completion_tokens': 7}, 'model': 'mistral-large-latest', 'finish_reason': 'stop'}, id='run-1873888a-186f-49a8-ab81-24335bd3099b-0', usage_metadata={'input_tokens': 27, 'output_tokens': 7, 'total_tokens': 34})" ] }, - "execution_count": 10, + "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "await chat.ainvoke(messages)" + "await llm.ainvoke(messages)" ] }, { @@ -150,7 +213,7 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 5, "id": "025be980-e50d-4a68-93dc-c9c7b500ce34", "metadata": { "tags": [] @@ -160,32 +223,12 @@ "name": "stdout", "output_type": "stream", "text": [ - "Who's there?\n", - "\n", - "(After this, the conversation can continue as a call and response \"who's there\" joke. Here is an example of how it could go:\n", - "\n", - "You say: Orange.\n", - "I say: Orange who?\n", - "You say: Orange you glad I didn't say banana!?)\n", - "\n", - "But since you asked for a knock knock joke specifically, here's one for you:\n", - "\n", - "Knock knock.\n", - "\n", - "Me: Who's there?\n", - "\n", - "You: Lettuce.\n", - "\n", - "Me: Lettuce who?\n", - "\n", - "You: Lettuce in, it's too cold out here!\n", - "\n", - "I hope this brings a smile to your face! Do you have a favorite knock knock joke you'd like to share? I'd love to hear it." + "J'adore programmer." 
] } ], "source": [ - "for chunk in chat.stream(messages):\n", + "for chunk in llm.stream(messages):\n", " print(chunk.content, end=\"\")" ] }, @@ -199,23 +242,23 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 6, "id": "e63aebcb", "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "[AIMessage(content=\"Who's there? I was just about to ask the same thing! Go ahead and tell me who's there. I love a good knock-knock joke.\")]" + "[AIMessage(content=\"J'adore la programmation.\", response_metadata={'token_usage': {'prompt_tokens': 27, 'total_tokens': 36, 'completion_tokens': 9}, 'model': 'mistral-large-latest', 'finish_reason': 'stop'}, id='run-2aa2a189-c405-4cf5-bd31-e9025e4c8536-0', usage_metadata={'input_tokens': 27, 'output_tokens': 9, 'total_tokens': 36})]" ] }, - "execution_count": 12, + "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "chat.batch([messages])" + "llm.batch([messages])" ] }, { @@ -230,36 +273,52 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 7, "id": "ee43a1ae", "metadata": {}, - "outputs": [], - "source": [ - "from langchain_core.prompts import ChatPromptTemplate\n", - "\n", - "prompt = ChatPromptTemplate.from_template(\"Tell me a joke about {topic}\")\n", - "chain = prompt | chat" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "id": "0dc49212", - "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "AIMessage(content='Why do bears hate shoes so much? They like to run around in their bear feet.')" + "AIMessage(content='Ich liebe Programmieren.', response_metadata={'token_usage': {'prompt_tokens': 21, 'total_tokens': 28, 'completion_tokens': 7}, 'model': 'mistral-large-latest', 'finish_reason': 'stop'}, id='run-409ebc9a-b4a0-4734-ab6f-e11f6b4f808f-0', usage_metadata={'input_tokens': 21, 'output_tokens': 7, 'total_tokens': 28})" ] }, - "execution_count": 14, + "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "chain.invoke({\"topic\": \"bears\"})" + "from langchain_core.prompts import ChatPromptTemplate\n", + "\n", + "prompt = ChatPromptTemplate(\n", + " [\n", + " (\n", + " \"system\",\n", + " \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n", + " ),\n", + " (\"human\", \"{input}\"),\n", + " ]\n", + ")\n", + "\n", + "chain = prompt | llm\n", + "chain.invoke(\n", + " {\n", + " \"input_language\": \"English\",\n", + " \"output_language\": \"German\",\n", + " \"input\": \"I love programming.\",\n", + " }\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "eb7e01fb-a433-48b1-a4c2-e6009523a896", + "metadata": {}, + "source": [ + "## API reference\n", + "\n", + "For detailed documentation of all ChatMistralAI features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/langchain_mistralai.chat_models.ChatMistralAI.html" ] } ], @@ -279,7 +338,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.12" + "version": "3.10.4" } }, "nbformat": 4, diff --git a/docs/docs/integrations/chat/nvidia_ai_endpoints.ipynb b/docs/docs/integrations/chat/nvidia_ai_endpoints.ipynb index 24554ac80e3..adc96d53c32 100644 --- a/docs/docs/integrations/chat/nvidia_ai_endpoints.ipynb +++ b/docs/docs/integrations/chat/nvidia_ai_endpoints.ipynb @@ -2,13 +2,24 @@ "cells": [ { "cell_type": "markdown", - "id": "cc6caafa", - "metadata": { - "id": "cc6caafa" - }, + "id": 
"1f666798-8635-4bc0-a515-04d318588d67", + "metadata": {}, "source": [ - "# NVIDIA NIMs\n", + "---\n", + "sidebar_label: NVIDIA AI Endpoints\n", + "---" + ] + }, + { + "cell_type": "markdown", + "id": "fa8eb20e-4db8-45e3-9e79-c595f4f274da", + "metadata": {}, + "source": [ + "# ChatNVIDIA\n", "\n", + "This will help you getting started with NVIDIA [chat models](/docs/concepts/#chat-models). For detailed documentation of all `ChatNVIDIA` features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/langchain_nvidia_ai_endpoints.chat_models.ChatNVIDIA.html).\n", + "\n", + "## Overview\n", "The `langchain-nvidia-ai-endpoints` package contains LangChain integrations building applications with models on \n", "NVIDIA NIM inference microservice. NIM supports models across domains like chat, embedding, and re-ranking models \n", "from the community as well as NVIDIA. These models are optimized by NVIDIA to deliver the best performance on NVIDIA \n", @@ -24,7 +35,66 @@ "\n", "This example goes over how to use LangChain to interact with NVIDIA supported via the `ChatNVIDIA` class.\n", "\n", - "For more information on accessing the chat models through this api, check out the [ChatNVIDIA](https://python.langchain.com/docs/integrations/chat/nvidia_ai_endpoints/) documentation." + "For more information on accessing the chat models through this api, check out the [ChatNVIDIA](https://python.langchain.com/docs/integrations/chat/nvidia_ai_endpoints/) documentation.\n", + "\n", + "### Integration details\n", + "\n", + "| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |\n", + "| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n", + "| [ChatNVIDIA](https://api.python.langchain.com/en/latest/chat_models/langchain_nvidia_ai_endpoints.chat_models.ChatNVIDIA.html) | [langchain_nvidia_ai_endpoints](https://api.python.langchain.com/en/latest/nvidia_ai_endpoints_api_reference.html) | ✅ | beta | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain_nvidia_ai_endpoints?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain_nvidia_ai_endpoints?style=flat-square&label=%20) |\n", + "\n", + "### Model features\n", + "| [Tool calling](/docs/how_to/tool_calling) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n", + "| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n", + "| ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | \n", + "\n", + "## Setup\n", + "\n", + "**To get started:**\n", + "\n", + "1. Create a free account with [NVIDIA](https://build.nvidia.com/), which hosts NVIDIA AI Foundation models.\n", + "\n", + "2. Click on your model of choice.\n", + "\n", + "3. Under `Input` select the `Python` tab, and click `Get API Key`. Then click `Generate Key`.\n", + "\n", + "4. Copy and save the generated key as `NVIDIA_API_KEY`. 
From there, you should have access to the endpoints.\n", + "\n", + "### Credentials\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "208b72da-1535-4249-bbd3-2500028e25e9", + "metadata": {}, + "outputs": [], + "source": [ + "import getpass\n", + "import os\n", + "\n", + "if not os.getenv(\"NVIDIA_API_KEY\"):\n", + " # Note: the API key should start with \"nvapi-\"\n", + " os.environ[\"NVIDIA_API_KEY\"] = getpass.getpass(\"Enter your NVIDIA API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "52dc8dcb-0a48-4a4e-9947-764116d2ffd4", + "metadata": {}, + "source": [ + "If you want to get automated tracing of your model calls you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2cd9cb12-6ca5-432a-9e42-8a57da073c7e", + "metadata": {}, + "outputs": [], + "source": [ + "# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n", + "# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" ] }, { @@ -32,7 +102,9 @@ "id": "f2be90a9", "metadata": {}, "source": [ - "## Installation" + "### Installation\n", + "\n", + "The LangChain NVIDIA AI Endpoints integration lives in the `langchain_nvidia_ai_endpoints` package:" ] }, { @@ -45,51 +117,14 @@ "%pip install --upgrade --quiet langchain-nvidia-ai-endpoints" ] }, - { - "cell_type": "markdown", - "id": "ccff689e", - "metadata": { - "id": "ccff689e" - }, - "source": [ - "## Setup\n", - "\n", - "**To get started:**\n", - "\n", - "1. Create a free account with [NVIDIA](https://build.nvidia.com/), which hosts NVIDIA AI Foundation models.\n", - "\n", - "2. Click on your model of choice.\n", - "\n", - "3. Under `Input` select the `Python` tab, and click `Get API Key`. Then click `Generate Key`.\n", - "\n", - "4. Copy and save the generated key as `NVIDIA_API_KEY`. From there, you should have access to the endpoints." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "686c4d2f", - "metadata": {}, - "outputs": [], - "source": [ - "import getpass\n", - "import os\n", - "\n", - "# del os.environ['NVIDIA_API_KEY'] ## delete key and reset\n", - "if os.environ.get(\"NVIDIA_API_KEY\", \"\").startswith(\"nvapi-\"):\n", - " print(\"Valid NVIDIA_API_KEY already in environment. Delete to reset\")\n", - "else:\n", - " nvapi_key = getpass.getpass(\"NVAPI Key (starts with nvapi-): \")\n", - " assert nvapi_key.startswith(\"nvapi-\"), f\"{nvapi_key[:5]}... 
is not a valid key\"\n", - " os.environ[\"NVIDIA_API_KEY\"] = nvapi_key" - ] - }, { "cell_type": "markdown", "id": "af0ce26b", "metadata": {}, "source": [ - "## Working with NVIDIA API Catalog" + "## Instantiation\n", + "\n", + "Now we can access models in the NVIDIA API Catalog:" ] }, { @@ -108,7 +143,24 @@ "## Core LC Chat Interface\n", "from langchain_nvidia_ai_endpoints import ChatNVIDIA\n", "\n", - "llm = ChatNVIDIA(model=\"mistralai/mixtral-8x7b-instruct-v0.1\")\n", + "llm = ChatNVIDIA(model=\"mistralai/mixtral-8x7b-instruct-v0.1\")" + ] + }, + { + "cell_type": "markdown", + "id": "469c8c7f-de62-457f-a30f-674763a8b717", + "metadata": {}, + "source": [ + "## Invocation" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9512c81b-1f3a-4eca-9470-f52cedff5c74", + "metadata": {}, + "outputs": [], + "source": [ "result = llm.invoke(\"Write a ballad about LangChain.\")\n", "print(result.content)" ] @@ -630,6 +682,55 @@ "source": [ "See [How to use chat models to call tools](https://python.langchain.com/v0.2/docs/how_to/tool_calling/) for additional examples." ] + }, + { + "cell_type": "markdown", + "id": "a9a3c438-121d-46eb-8fb5-b8d5a13cd4a4", + "metadata": {}, + "source": [ + "## Chaining\n", + "\n", + "We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "af585c6b-fe0a-4833-9860-a4209a71b3c6", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_core.prompts import ChatPromptTemplate\n", + "\n", + "prompt = ChatPromptTemplate(\n", + " [\n", + " (\n", + " \"system\",\n", + " \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n", + " ),\n", + " (\"human\", \"{input}\"),\n", + " ]\n", + ")\n", + "\n", + "chain = prompt | llm\n", + "chain.invoke(\n", + " {\n", + " \"input_language\": \"English\",\n", + " \"output_language\": \"German\",\n", + " \"input\": \"I love programming.\",\n", + " }\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "f2f25dd3-0b4a-465f-a53e-95521cdc253c", + "metadata": {}, + "source": [ + "## API reference\n", + "\n", + "For detailed documentation of all `ChatNVIDIA` features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/langchain_nvidia_ai_endpoints.chat_models.ChatNVIDIA.html" + ] } ], "metadata": { @@ -651,7 +752,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.13" + "version": "3.10.4" } }, "nbformat": 4, diff --git a/docs/docs/integrations/chat/vllm.ipynb b/docs/docs/integrations/chat/vllm.ipynb index 9d70a06a3f1..48fb5b6c7a1 100644 --- a/docs/docs/integrations/chat/vllm.ipynb +++ b/docs/docs/integrations/chat/vllm.ipynb @@ -12,14 +12,83 @@ }, { "cell_type": "markdown", - "id": "eb7e5679-aa06-47e4-a1a3-b6b70e604017", + "id": "8f82e243-f4ee-44e2-b417-099b6401ae3e", "metadata": {}, "source": [ "# vLLM Chat\n", "\n", "vLLM can be deployed as a server that mimics the OpenAI API protocol. This allows vLLM to be used as a drop-in replacement for applications using OpenAI API. This server can be queried in the same format as OpenAI API.\n", "\n", - "This notebook covers how to get started with vLLM chat models using langchain's `ChatOpenAI` **as it is**." + "## Overview\n", + "This will help you getting started with vLLM [chat models](/docs/concepts/#chat-models), which leverage the `langchain-openai` package. 
For detailed documentation of all `ChatOpenAI` features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html).\n", + "\n", + "### Integration details\n", + "\n", + "| Class | Package | Local | Serializable | JS support | Package downloads | Package latest |\n", + "| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n", + "| [ChatOpenAI](https://api.python.langchain.com/en/latest/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html) | [langchain_openai](https://api.python.langchain.com/en/latest/langchain_openai.html) | ✅ | beta | ❌ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain_openai?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain_openai?style=flat-square&label=%20) |\n", + "\n", + "### Model features\n", + "Specific model features-- such as tool calling, support for multi-modal inputs, support for token-level streaming, etc.-- will depend on the hosted model.\n", + "\n", + "## Setup\n", + "\n", + "See the vLLM docs [here](https://docs.vllm.ai/en/latest/).\n", + "\n", + "To access vLLM models through LangChain, you'll need to install the `langchain-openai` integration package.\n", + "\n", + "### Credentials\n", + "\n", + "Authentication will depend on specifics of the inference server." + ] + }, + { + "cell_type": "markdown", + "id": "c3b1707a-cf2c-4367-94e3-436c43402503", + "metadata": {}, + "source": [ + "If you want to get automated tracing of your model calls you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1e40bd5e-cbaa-41ef-aaf9-0858eb207184", + "metadata": {}, + "outputs": [], + "source": [ + "# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n", + "# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "0739b647-609b-46d3-bdd3-e86fe4463288", + "metadata": {}, + "source": [ + "### Installation\n", + "\n", + "The LangChain vLLM integration can be accessed via the `langchain-openai` package:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7afcfbdc-56aa-4529-825a-8acbe7aa5241", + "metadata": {}, + "outputs": [], + "source": [ + "%pip install -qU langchain-openai" + ] + }, + { + "cell_type": "markdown", + "id": "2cf576d6-7b67-4937-bf99-39071e85720c", + "metadata": {}, + "source": [ + "## Instantiation\n", + "\n", + "Now we can instantiate our model object and generate chat completions:" ] }, { @@ -51,7 +120,7 @@ "source": [ "inference_server_url = \"http://localhost:8000/v1\"\n", "\n", - "chat = ChatOpenAI(\n", + "llm = ChatOpenAI(\n", " model=\"mosaicml/mpt-7b\",\n", " openai_api_key=\"EMPTY\",\n", " openai_api_base=inference_server_url,\n", @@ -60,6 +129,14 @@ ")" ] }, + { + "cell_type": "markdown", + "id": "34b18328-5e8b-4ff2-9b89-6fbb76b5c7f0", + "metadata": {}, + "source": [ + "## Invocation" + ] + }, { "cell_type": "code", "execution_count": 15, @@ -88,82 +165,66 @@ " content=\"Translate the following sentence from English to Italian: I love programming.\"\n", " ),\n", "]\n", - "chat(messages)" + "llm.invoke(messages)" ] }, { "cell_type": "markdown", - "id": "55fc7046-a6dc-4720-8c0c-24a6db76a4f4", + "id": "a580a1e4-11a3-4277-bfba-bfb414ac7201", "metadata": {}, "source": [ - "You can make use of templating by using a `MessagePromptTemplate`. 
You can build a `ChatPromptTemplate` from one or more `MessagePromptTemplates`. You can use ChatPromptTemplate's format_prompt -- this returns a `PromptValue`, which you can convert to a string or `Message` object, depending on whether you want to use the formatted value as input to an llm or chat model.\n", + "## Chaining\n", "\n", - "For convenience, there is a `from_template` method exposed on the template. If you were to use this template, this is what it would look like:" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "id": "123980e9-0dee-4ce5-bde6-d964dd90129c", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "template = (\n", - " \"You are a helpful assistant that translates {input_language} to {output_language}.\"\n", - ")\n", - "system_message_prompt = SystemMessagePromptTemplate.from_template(template)\n", - "human_template = \"{text}\"\n", - "human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)" - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "id": "b2fb8c59-8892-4270-85a2-4f8ab276b75d", - "metadata": { - "tags": [] - }, - "outputs": [ - { - "data": { - "text/plain": [ - "AIMessage(content=' I love programming too.', additional_kwargs={}, example=False)" - ] - }, - "execution_count": 17, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "chat_prompt = ChatPromptTemplate.from_messages(\n", - " [system_message_prompt, human_message_prompt]\n", - ")\n", - "\n", - "# get a chat completion from the formatted messages\n", - "chat(\n", - " chat_prompt.format_prompt(\n", - " input_language=\"English\", output_language=\"Italian\", text=\"I love programming.\"\n", - " ).to_messages()\n", - ")" + "We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:" ] }, { "cell_type": "code", "execution_count": null, - "id": "0bbd9861-2b94-4920-8708-b690004f4c4d", + "id": "dd0f4043-48bd-4245-8bdb-e7669666a277", "metadata": {}, "outputs": [], - "source": [] + "source": [ + "from langchain_core.prompts import ChatPromptTemplate\n", + "\n", + "prompt = ChatPromptTemplate(\n", + " [\n", + " (\n", + " \"system\",\n", + " \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n", + " ),\n", + " (\"human\", \"{input}\"),\n", + " ]\n", + ")\n", + "\n", + "chain = prompt | llm\n", + "chain.invoke(\n", + " {\n", + " \"input_language\": \"English\",\n", + " \"output_language\": \"German\",\n", + " \"input\": \"I love programming.\",\n", + " }\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "265f5d51-0a76-4808-8d13-ef598ee6e366", + "metadata": {}, + "source": [ + "## API reference\n", + "\n", + "For detailed documentation of all features and configurations exposed via `langchain-openai`, head to the API reference: https://api.python.langchain.com/en/latest/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html\n", + "\n", + "Refer to the vLLM [documentation](https://docs.vllm.ai/en/latest/) as well." + ] } ], "metadata": { "kernelspec": { - "display_name": "conda_pytorch_p310", + "display_name": "Python 3 (ipykernel)", "language": "python", - "name": "conda_pytorch_p310" + "name": "python3" }, "language_info": { "codemirror_mode": { @@ -175,7 +236,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.12" + "version": "3.10.4" } }, "nbformat": 4,