From 4322b246aa6c2c0f910c5acde4f6385ee7832373 Mon Sep 17 00:00:00 2001 From: Massimiliano Pronesti Date: Mon, 25 Sep 2023 03:23:19 +0200 Subject: [PATCH] docs: add vLLM chat notebook (#10993) This PR aims at showcasing how to use vLLM's OpenAI-compatible chat API. ### Context LangChain already supports vLLM and its OpenAI-compatible `Completion` API. However, the `ChatCompletion` API was not aligned with OpenAI's, so I waited for this [PR](https://github.com/vllm-project/vllm/pull/852) to be merged before adding this notebook to LangChain. --- docs/extras/integrations/chat/vllm.ipynb | 174 +++++++++++++++++++++++ 1 file changed, 174 insertions(+) create mode 100644 docs/extras/integrations/chat/vllm.ipynb diff --git a/docs/extras/integrations/chat/vllm.ipynb b/docs/extras/integrations/chat/vllm.ipynb new file mode 100644 index 00000000000..45c5094304e --- /dev/null +++ b/docs/extras/integrations/chat/vllm.ipynb @@ -0,0 +1,174 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "eb7e5679-aa06-47e4-a1a3-b6b70e604017", + "metadata": {}, + "source": [ + "# vLLM Chat\n", + "\n", + "vLLM can be deployed as a server that mimics the OpenAI API protocol. This allows vLLM to be used as a drop-in replacement for applications using the OpenAI API. The server can be queried in the same format as the OpenAI API.\n", + "\n", + "This notebook covers how to get started with vLLM chat models using LangChain's `ChatOpenAI` **as is**." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "060a2e3d-d42f-4221-bd09-a9a06544dcd3", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "from langchain.chat_models import ChatOpenAI\n", + "from langchain.prompts.chat import (\n", + " ChatPromptTemplate,\n", + " SystemMessagePromptTemplate,\n", + " AIMessagePromptTemplate,\n", + " HumanMessagePromptTemplate,\n", + ")\n", + "from langchain.schema import AIMessage, HumanMessage, SystemMessage" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "id": "bf24d732-68a9-44fd-b05d-4903ce5620c6", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "inference_server_url = \"http://localhost:8000/v1\"\n", + "\n", + "chat = ChatOpenAI(\n", + " model=\"mosaicml/mpt-7b\",\n", + " openai_api_key=\"EMPTY\",\n", + " openai_api_base=inference_server_url,\n", + " max_tokens=5,\n", + " temperature=0,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "id": "aea4e363-5688-4b07-82ed-6aa8153c2377", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "AIMessage(content=' Io amo programmare', additional_kwargs={}, example=False)" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "messages = [\n", + " SystemMessage(\n", + " content=\"You are a helpful assistant that translates English to Italian.\"\n", + " ),\n", + " HumanMessage(\n", + " content=\"Translate the following sentence from English to Italian: I love programming.\"\n", + " ),\n", + "]\n", + "chat(messages)" + ] + }, + { + "cell_type": "markdown", + "id": "55fc7046-a6dc-4720-8c0c-24a6db76a4f4", + "metadata": {}, + "source": [ + "You can make use of templating by using a `MessagePromptTemplate`. You can build a `ChatPromptTemplate` from one or more `MessagePromptTemplates`. 
You can use `ChatPromptTemplate`'s `format_prompt` method -- this returns a `PromptValue`, which you can convert to a string or a `Message` object, depending on whether you want to use the formatted value as input to an LLM or a chat model.\n", + "\n", + "For convenience, there is a `from_template` method exposed on the template. If you were to use this template, this is what it would look like:" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "id": "123980e9-0dee-4ce5-bde6-d964dd90129c", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "template = (\n", + " \"You are a helpful assistant that translates {input_language} to {output_language}.\"\n", + ")\n", + "system_message_prompt = SystemMessagePromptTemplate.from_template(template)\n", + "human_template = \"{text}\"\n", + "human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "id": "b2fb8c59-8892-4270-85a2-4f8ab276b75d", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "AIMessage(content=' I love programming too.', additional_kwargs={}, example=False)" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "chat_prompt = ChatPromptTemplate.from_messages(\n", + " [system_message_prompt, human_message_prompt]\n", + ")\n", + "\n", + "# get a chat completion from the formatted messages\n", + "chat(\n", + " chat_prompt.format_prompt(\n", + " input_language=\"English\", output_language=\"Italian\", text=\"I love programming.\"\n", + " ).to_messages()\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0bbd9861-2b94-4920-8708-b690004f4c4d", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "conda_pytorch_p310", + "language": "python", + "name": "conda_pytorch_p310" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", 
+ "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}
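
Reviewer note (not part of the patch): the notebook's `ChatOpenAI` client works against vLLM precisely because it sends a standard OpenAI-style request body to the server's `/v1/chat/completions` route. A minimal sketch of that body, assuming the OpenAI Chat Completions field names; `build_chat_request` is a hypothetical helper for illustration only, not a LangChain or vLLM API:

```python
import json

def build_chat_request(model, messages, max_tokens=5, temperature=0):
    """Assemble an OpenAI-style ChatCompletion request body (illustrative)."""
    return {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

# Mirrors the notebook's example: system + user messages, mosaicml/mpt-7b.
payload = build_chat_request(
    "mosaicml/mpt-7b",
    [
        {
            "role": "system",
            "content": "You are a helpful assistant that translates English to Italian.",
        },
        {
            "role": "user",
            "content": "Translate the following sentence from English to Italian: I love programming.",
        },
    ],
)
print(json.dumps(payload, indent=2))
```

POSTing this JSON to `http://localhost:8000/v1/chat/completions` (with any placeholder API key, matching `openai_api_key="EMPTY"` in the notebook) is what the `chat(messages)` call does under the hood.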