langchain/libs/partners/openai/langchain_openai
Commit 8119a7bc5c by ccurme
openai[patch]: support streaming token counts in AzureChatOpenAI (#30494)
When OpenAI originally released `stream_options` to enable token usage
reporting during streaming, it was not supported in AzureOpenAI. It is
now supported.

Like the [OpenAI
SDK](f66d2e6fdc/src/openai/resources/completions.py (L68)),
ChatOpenAI does not return usage metadata during streaming by default,
since including it adds an extra chunk to the stream. The OpenAI SDK
requires users to opt in with `stream_options={"include_usage": True}`;
ChatOpenAI instead implements a convenience kwarg
`stream_usage: Optional[bool]` and an attribute
`stream_usage: bool = False`.
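For concreteness, a minimal sketch of both styles (the model name and
prompt are placeholder values; credentials are assumed to come from the
environment):

```python
from openai import OpenAI

from langchain_openai import ChatOpenAI

# Raw OpenAI SDK: opt in via stream_options; the final chunk carries
# `usage` and an empty `choices` list.
client = OpenAI()
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hello"}],
    stream=True,
    stream_options={"include_usage": True},
)
for chunk in stream:
    if chunk.usage is not None:
        print(chunk.usage)

# ChatOpenAI: the stream_usage convenience flag does the same opt-in;
# usage lands on the final AIMessageChunk's `usage_metadata`.
llm = ChatOpenAI(model="gpt-4o-mini", stream_usage=True)
for chunk in llm.stream("hello"):
    if chunk.usage_metadata:
        print(chunk.usage_metadata)
```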

Here we extend this to AzureChatOpenAI by moving the `stream_usage`
attribute and `stream_usage` kwarg (on `_(a)stream`) from ChatOpenAI to
BaseChatOpenAI.
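With that move, the same flag works on AzureChatOpenAI. A sketch, where
the endpoint, deployment, and API version are placeholder values and
`AZURE_OPENAI_API_KEY` is assumed to be set in the environment:

```python
from langchain_openai import AzureChatOpenAI

# Placeholder endpoint, deployment, and API version.
llm = AzureChatOpenAI(
    azure_endpoint="https://example.openai.azure.com",
    azure_deployment="my-deployment",
    api_version="2024-10-21",
    stream_usage=True,
)
for chunk in llm.stream("hello"):
    if chunk.usage_metadata:
        print(chunk.usage_metadata)
```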

---

Additional consideration: we must be mindful of the many users who use
BaseChatOpenAI to interact with other APIs that do not support the
`stream_options` parameter.

Suppose OpenAI later updates the default behavior to stream token
usage. Currently, BaseChatOpenAI passes `stream_options` only when
`stream_usage` is True, so there would be no way to opt out of that new
default.

To address this, we could change the `stream_usage` attribute to
`Optional[bool] = None`. That is technically a breaking change, since
values of `False` are currently not passed to the client. IMO: if / when
OpenAI makes that change, we could ship this update alongside it in a
minor version bump.
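A sketch of that proposal (hypothetical helper, not current behavior):
`None` preserves today's omit-the-parameter semantics, while an explicit
`True` or `False` is forwarded to the client.

```python
from typing import Optional


def stream_options_payload(stream_usage: Optional[bool]) -> dict:
    """Hypothetical illustration of the Optional[bool] proposal above."""
    if stream_usage is None:
        # Today's behavior: omit stream_options and defer to the API default.
        return {}
    # Explicit opt-in or opt-out, forwarded to the client.
    return {"stream_options": {"include_usage": stream_usage}}
```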

--- 

Related previous PRs:
- https://github.com/langchain-ai/langchain/pull/22628
- https://github.com/langchain-ai/langchain/pull/22854
- https://github.com/langchain-ai/langchain/pull/23552

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-03-26 15:16:37 -04:00
| Path | Latest commit | Date |
| --- | --- | --- |
| chat_models | openai[patch]: support streaming token counts in AzureChatOpenAI (#30494) | 2025-03-26 15:16:37 -04:00 |
| embeddings | Fix #29759: Use local chunk_size_ for looping in embed_documents (#29761) | 2025-02-13 01:28:26 +00:00 |
| llms | docs: Fix typo in code samples for max_tokens_for_prompt (#30088) | 2025-03-04 09:11:21 -05:00 |
| output_parsers | langchain[patch], core[patch], openai[patch], fireworks[minor]: ChatFireworks.with_structured_output (#18078) | 2024-02-26 12:46:39 -08:00 |
| __init__.py | infra: add more formatter rules to openai (#23189) | 2024-06-19 11:39:58 -07:00 |
| py.typed | openai[minor]: implement langchain-openai package (#15503) | 2024-01-05 15:03:28 -08:00 |