Enable streaming for OpenAI LLM (#986)

* Support a callback `on_llm_new_token` that users can implement when
`OpenAI.streaming` is set to `True`
Ankush Gola
2023-02-14 15:06:14 -08:00
committed by GitHub
parent f05f025e41
commit caa8e4742e
26 changed files with 1311 additions and 155 deletions

@@ -18,7 +18,9 @@
 "cell_type": "code",
 "execution_count": 1,
 "id": "df924055",
-"metadata": {},
+"metadata": {
+"tags": []
+},
 "outputs": [],
 "source": [
 "from langchain.llms import OpenAI"
@@ -207,14 +209,6 @@
 "source": [
 "llm.get_num_tokens(\"what a joke\")"
 ]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"id": "b004ffdd",
-"metadata": {},
-"outputs": [],
-"source": []
 }
 ],
 "metadata": {

@@ -8,6 +8,7 @@ They are split into two categories:
 1. `Generic Functionality <./generic_how_to.html>`_: Covering generic functionality all LLMs should have.
 2. `Integrations <./integrations.html>`_: Covering integrations with various LLM providers.
 3. `Asynchronous <./async_llm.html>`_: Covering asynchronous functionality.
+4. `Streaming <./streaming_llm.html>`_: Covering streaming functionality.
 .. toctree::
 :maxdepth: 1

@@ -0,0 +1,140 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "6eaf7e66-f49c-42da-8d11-22ea13bef718",
"metadata": {},
"source": [
"# Streaming with LLMs\n",
"\n",
"LangChain provides streaming support for LLMs. Currently, only the `OpenAI` LLM implementation supports streaming; support for other LLM implementations is on the roadmap. To use streaming, pass a [`CallbackHandler`](https://github.com/hwchase17/langchain/blob/master/langchain/callbacks/base.py) that implements `on_llm_new_token`. This example uses `StreamingStdOutCallbackHandler`, which prints each token to stdout as it arrives."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "4ac0ff54-540a-4f2b-8d9a-b590fec7fe07",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"Verse 1\n",
"I'm sippin' on sparkling water,\n",
"It's so refreshing and light,\n",
"It's the perfect way to quench my thirst,\n",
"On a hot summer night.\n",
"\n",
"Chorus\n",
"Sparkling water, sparkling water,\n",
"It's the best way to stay hydrated,\n",
"It's so refreshing and light,\n",
"It's the perfect way to stay alive.\n",
"\n",
"Verse 2\n",
"I'm sippin' on sparkling water,\n",
"It's so bubbly and bright,\n",
"It's the perfect way to cool me down,\n",
"On a hot summer night.\n",
"\n",
"Chorus\n",
"Sparkling water, sparkling water,\n",
"It's the best way to stay hydrated,\n",
"It's so refreshing and light,\n",
"It's the perfect way to stay alive.\n",
"\n",
"Verse 3\n",
"I'm sippin' on sparkling water,\n",
"It's so crisp and clean,\n",
"It's the perfect way to keep me going,\n",
"On a hot summer day.\n",
"\n",
"Chorus\n",
"Sparkling water, sparkling water,\n",
"It's the best way to stay hydrated,\n",
"It's so refreshing and light,\n",
"It's the perfect way to stay alive."
]
}
],
"source": [
"from langchain.llms import OpenAI\n",
"from langchain.callbacks.base import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
"\n",
"\n",
"llm = OpenAI(streaming=True, callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]), verbose=True, temperature=0)\n",
"resp = llm(\"Write me a song about sparkling water.\")"
]
},
{
"cell_type": "markdown",
"id": "61fb6de7-c6c8-48d0-a48e-1204c027a23c",
"metadata": {
"tags": []
},
"source": [
"The final `LLMResult` is still available when calling `generate`. However, `token_usage` is not currently supported for streaming, so it is returned empty."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "a35373f1-9ee6-4753-a343-5aee749b8527",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"Q: What did the fish say when it hit the wall?\n",
"A: Dam!"
]
},
{
"data": {
"text/plain": [
"LLMResult(generations=[[Generation(text='\\n\\nQ: What did the fish say when it hit the wall?\\nA: Dam!', generation_info={'finish_reason': 'stop', 'logprobs': None})]], llm_output={'token_usage': {}})"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm.generate([\"Tell me a joke.\"])"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
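
The notebook above relies on a handler that implements `on_llm_new_token`. As a rough illustration of that callback pattern, here is a minimal sketch that runs offline and does not import LangChain. `TokenCollector` and `fake_stream` are hypothetical names introduced only for this example; the only name taken from the commit is the `on_llm_new_token` method. A real handler would instead build on the `CallbackHandler` interface linked in the notebook and be passed in through `CallbackManager`.

```python
from typing import Any, Iterable, List


class TokenCollector:
    """Hypothetical stand-in for a callback handler; NOT a LangChain class."""

    def __init__(self) -> None:
        self.tokens: List[str] = []

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        # Invoked once per streamed token; this sketch simply buffers it.
        self.tokens.append(token)

    def text(self) -> str:
        # Join the buffered tokens back into the text seen so far.
        return "".join(self.tokens)


def fake_stream(tokens: Iterable[str], handler: TokenCollector) -> str:
    """Simulate a streaming LLM (assumption: tokens arrive one at a time).

    Each token is forwarded to the handler as it "arrives", and the
    complete text is returned at the end, mirroring how the final
    result is still available after streaming.
    """
    pieces: List[str] = []
    for tok in tokens:
        handler.on_llm_new_token(tok)
        pieces.append(tok)
    return "".join(pieces)


collector = TokenCollector()
result = fake_stream(["Spark", "ling", " water"], collector)
```

Buffering tokens instead of printing them is a design choice for this sketch: it makes the handler easy to inspect afterwards, whereas a stdout-style handler would write each token immediately.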