Mirror of https://github.com/hwchase17/langchain.git (synced 2025-09-03 20:16:52 +00:00)
feat(llms): support vLLM's OpenAI-compatible server (#9179)
This PR adds support for [vLLM's OpenAI-compatible server feature](https://vllm.readthedocs.io/en/latest/getting_started/quickstart.html#openai-compatible-server), so that models served by vLLM can be called as if they were OpenAI models. I've also updated the related notebook with a usage example. At the moment, vLLM only supports the `Completion` API.
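For readers unfamiliar with what "OpenAI-compatible" means on the wire, here is a minimal sketch of the kind of HTTP request a Completion call boils down to. It is not the code added by this PR: `build_completion_request` is a hypothetical helper written for illustration, the base URL, model name, and `stop` value are taken from the notebook example in this commit, and the request is only built, not sent.

```python
import json
import urllib.request


def build_completion_request(api_base, model, prompt, api_key="EMPTY", **params):
    """Build (but do not send) a POST request for an OpenAI-style /completions endpoint.

    Hypothetical helper for illustration only; it is not part of LangChain or vLLM.
    """
    # The Completion API takes a JSON body with the model name, the prompt,
    # and optional sampling parameters such as "stop".
    payload = {"model": model, "prompt": prompt, **params}
    return urllib.request.Request(
        url=api_base.rstrip("/") + "/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_completion_request(
    "http://localhost:8000/v1", "tiiuae/falcon-7b", "Rome is", stop=["."]
)
print(request.full_url)
print(request.data.decode("utf-8"))

# Sending the request requires a vLLM server listening at api_base (assumed here):
# with urllib.request.urlopen(request) as response:
#     text = json.load(response)["choices"][0]["text"]
```

Because the request shape is the same as OpenAI's, a client written for the OpenAI Completion API should only need a different base URL and a placeholder API key to talk to a vLLM server.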
Committed by: GitHub
Parent: 621da3c164
Commit: d95eeaedbe
@@ -170,6 +170,51 @@
    "\n",
    "llm(\"What is the future of AI?\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "64e89be0-6ad7-43a8-9dac-1324dcd4e851",
   "metadata": {
    "tags": []
   },
   "source": [
    "## OpenAI-Compatible Server\n",
    "\n",
    "vLLM can be deployed as a server that mimics the OpenAI API protocol. This allows vLLM to be used as a drop-in replacement for applications using OpenAI API.\n",
    "\n",
    "This server can be queried in the same format as OpenAI API.\n",
    "\n",
    "### OpenAI-Compatible Completion"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "c3cbc428-0bb8-422a-913e-1c6fef8b89d4",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " a city that is filled with history, ancient buildings, and art around every corner\n"
     ]
    }
   ],
   "source": [
    "from langchain.llms import VLLMOpenAI\n",
    "\n",
    "\n",
    "llm = VLLMOpenAI(\n",
    "    openai_api_key=\"EMPTY\",\n",
    "    openai_api_base=\"http://localhost:8000/v1\",\n",
    "    model_name=\"tiiuae/falcon-7b\",\n",
    "    model_kwargs={\"stop\": [\".\"]}\n",
    ")\n",
    "print(llm(\"Rome is\"))"
   ]
  }
 ],
 "metadata": {