mirror of
https://github.com/hwchase17/langchain.git
synced 2025-04-27 19:46:55 +00:00
docs: document OpenAI flex processing (#31023)
Following https://github.com/langchain-ai/langchain/pull/31005
This commit is contained in:
parent
629b7a5a43
commit
a60fd06784
@@ -1413,6 +1413,23 @@
     "second_output_message = llm.invoke(history)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "90c18d18-b25c-4509-a639-bd652b92f518",
+   "metadata": {},
+   "source": [
+    "## Flex processing\n",
+    "\n",
+    "OpenAI offers a variety of [service tiers](https://platform.openai.com/docs/guides/flex-processing). The \"flex\" tier offers cheaper pricing for requests, with the trade-off that responses may take longer and resources might not always be available. This approach is best suited for non-critical tasks, including model testing, data enhancement, or jobs that can be run asynchronously.\n",
+    "\n",
+    "To use it, initialize the model with `service_tier=\"flex\"`:\n",
+    "```python\n",
+    "llm = ChatOpenAI(model=\"o4-mini\", service_tier=\"flex\")\n",
+    "```\n",
+    "\n",
+    "Note that this is a beta feature that is only available for a subset of models. See OpenAI [docs](https://platform.openai.com/docs/guides/flex-processing) for more detail."
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "a796d728-971b-408b-88d5-440015bbb941",
@@ -1420,7 +1437,7 @@
    "source": [
     "## API reference\n",
     "\n",
-    "For detailed documentation of all ChatOpenAI features and configurations head to the API reference: https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html"
+    "For detailed documentation of all ChatOpenAI features and configurations head to the [API reference](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html)."
    ]
   }
  ],
@@ -2331,6 +2331,27 @@ class ChatOpenAI(BaseChatOpenAI):  # type: ignore[override]
                "logprobs": None,
            }

+    .. dropdown:: Flex processing
+
+        OpenAI offers a variety of
+        `service tiers <https://platform.openai.com/docs/guides/flex-processing>`_.
+        The "flex" tier offers cheaper pricing for requests, with the trade-off that
+        responses may take longer and resources might not always be available.
+        This approach is best suited for non-critical tasks, including model testing,
+        data enhancement, or jobs that can be run asynchronously.
+
+        To use it, initialize the model with ``service_tier="flex"``:
+
+        .. code-block:: python
+
+            from langchain_openai import ChatOpenAI
+
+            llm = ChatOpenAI(model="o4-mini", service_tier="flex")
+
+        Note that this is a beta feature that is only available for a subset of models.
+        See OpenAI `docs <https://platform.openai.com/docs/guides/flex-processing>`_
+        for more detail.
+
    """  # noqa: E501

    max_tokens: Optional[int] = Field(default=None, alias="max_completion_tokens")
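As a rough illustration of what the documented `service_tier="flex"` option amounts to, the sketch below shows how a per-model service tier setting might be merged into an OpenAI-style request payload. This is a hypothetical helper, not the actual `langchain-openai` internals: `build_request_params` and its behavior of omitting the field when unset are assumptions made for illustration only.

```python
from typing import Optional


def build_request_params(
    model: str, service_tier: Optional[str] = None, **extra: object
) -> dict:
    """Hypothetical sketch: merge a configured service tier into request params.

    Mirrors the documented behavior only in shape: when ``service_tier`` is
    left unset, the field is omitted entirely so the API default tier applies.
    """
    params: dict = {"model": model, **extra}
    if service_tier is not None:  # only include the field when explicitly set
        params["service_tier"] = service_tier
    return params


# A "flex" request includes the tier; a default request omits the field.
flex = build_request_params("o4-mini", service_tier="flex")
default = build_request_params("o4-mini")
print(flex)     # {'model': 'o4-mini', 'service_tier': 'flex'}
print(default)  # {'model': 'o4-mini'}
```

The omit-when-unset behavior matters for beta features like this one: sending the field only when the caller asked for it keeps requests valid against models that do not yet support the tier.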