feat(llms): support vLLM's OpenAI-compatible server (#9179)

This PR adds support for [vLLM's OpenAI-compatible server feature](https://vllm.readthedocs.io/en/latest/getting_started/quickstart.html#openai-compatible-server), i.e. it allows calling LLMs served by vLLM as if they were OpenAI's. I've also updated the related notebook with a usage example. At the moment, vLLM only supports the `Completion` API.
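
For reviewers who want to see what "OpenAI-compatible" means on the wire, here is a minimal sketch of the kind of Completion request such a server accepts. It is not the code added by this PR: `build_completion_request` is a hypothetical helper written for illustration, it uses only the standard library, and it assumes the server exposes the standard OpenAI `/completions` path under the base URL. The base URL, model name, prompt, stop sequence, and placeholder key are taken from the notebook example.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, stop=None, api_key="EMPTY"):
    """Build (but do not send) a POST request in the OpenAI Completion format.

    Hypothetical helper for illustration only; it is not part of LangChain or vLLM.
    """
    payload = {"model": model, "prompt": prompt}
    if stop is not None:
        payload["stop"] = stop  # e.g. ["."] to stop at the first full stop
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Placeholder key, as in the notebook example.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_completion_request(
    "http://localhost:8000/v1", "tiiuae/falcon-7b", "Rome is", stop=["."]
)

# Sending it requires a running vLLM server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["choices"][0]["text"])
```

The `VLLMOpenAI` class in the notebook example takes the same base URL, model name, and stop sequence as constructor arguments, so the sketch is mainly useful for checking a server by hand.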
2025-09-03 20:16:52 +00:00 · 2023-08-14 08:03:05 +02:00
parent 621da3c164
commit d95eeaedbe
3 changed files with 73 additions and 1 deletions
--- a/docs/extras/integrations/llms/vllm.ipynb
+++ b/docs/extras/integrations/llms/vllm.ipynb
@@ -170,6 +170,51 @@
    "\n",
    "llm(\"What is the future of AI?\")"
   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "64e89be0-6ad7-43a8-9dac-1324dcd4e851",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "## OpenAI-Compatible Server\n",
+    "\n",
+    "vLLM can be deployed as a server that mimics the OpenAI API protocol. This allows vLLM to be used as a drop-in replacement for applications using the OpenAI API.\n",
+    "\n",
+    "This server can be queried in the same format as the OpenAI API.\n",
+    "\n",
+    "### OpenAI-Compatible Completion"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "c3cbc428-0bb8-422a-913e-1c6fef8b89d4",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      " a city that is filled with history, ancient buildings, and art around every corner\n"
+     ]
+    }
+   ],
+   "source": [
+    "from langchain.llms import VLLMOpenAI\n",
+    "\n",
+    "\n",
+    "llm = VLLMOpenAI(\n",
+    "    openai_api_key=\"EMPTY\",\n",
+    "    openai_api_base=\"http://localhost:8000/v1\",\n",
+    "    model_name=\"tiiuae/falcon-7b\",\n",
+    "    model_kwargs={\"stop\": [\".\"]}\n",
+    ")\n",
+    "print(llm(\"Rome is\"))"
+   ]
  }
 ],
 "metadata": {