openai[patch]: support streaming token counts in AzureChatOpenAI (#30494)

When OpenAI originally released `stream_options` to enable token usage during streaming, it was not supported in AzureOpenAI. It is now supported.

Like the [OpenAI SDK](f66d2e6fdc/src/openai/resources/completions.py (L68)), ChatOpenAI does not return usage metadata during streaming by default (returning it adds an extra chunk to the stream). The OpenAI SDK requires users to pass `stream_options={"include_usage": True}`. ChatOpenAI implements a convenience argument `stream_usage: Optional[bool]`, and an attribute `stream_usage: bool = False`.

Here we extend this to AzureChatOpenAI by moving the `stream_usage` attribute and `stream_usage` kwarg (on `_(a)stream`) from ChatOpenAI to BaseChatOpenAI.

---

Additional consideration: we must be sensitive to the number of users using BaseChatOpenAI to interact with other APIs that do not support the `stream_options` parameter.

Suppose OpenAI in the future updates the default behavior to stream token usage. Currently, BaseChatOpenAI only passes `stream_options` if `stream_usage` is True, so there would be no way to disable this new default behavior.

To address this, we could update the `stream_usage` attribute to `Optional[bool] = None`, but this is technically a breaking change (as currently values of False are not passed to the client). IMO: if / when this change happens, we could accompany it with this update in a minor bump.

---

Related previous PRs:
- https://github.com/langchain-ai/langchain/pull/22628
- https://github.com/langchain-ai/langchain/pull/22854
- https://github.com/langchain-ai/langchain/pull/23552

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
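
The trade-off described in the commit message can be sketched in isolation. The snippet below is a minimal illustration, not the code in BaseChatOpenAI: the helper names `build_payload_current` and `build_payload_tristate` are hypothetical, and only the `stream_options={"include_usage": ...}` shape and the `stream_usage` flag come from the message itself. It contrasts the current behavior (the parameter is sent only when the flag is True) with the proposed `Optional[bool] = None` variant (an explicit False is also sent, so a future server-side default could be turned off).

```python
from typing import Any, Dict, Optional


def build_payload_current(stream_usage: bool = False, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical helper: mirrors the current behavior described in the message.

    `stream_options` is added only when `stream_usage` is True, so APIs that
    do not understand the parameter never receive it by default.
    """
    payload = dict(kwargs)
    if stream_usage:
        payload["stream_options"] = {"include_usage": True}
    return payload


def build_payload_tristate(
    stream_usage: Optional[bool] = None, **kwargs: Any
) -> Dict[str, Any]:
    """Hypothetical helper: sketches the `Optional[bool] = None` alternative.

    None means "send nothing and accept the server default"; an explicit
    True or False is forwarded, which is what makes opting out possible.
    """
    payload = dict(kwargs)
    if stream_usage is not None:
        payload["stream_options"] = {"include_usage": stream_usage}
    return payload


# Current behavior: False and "unset" are indistinguishable on the wire.
print(build_payload_current(False, model="gpt-4o-mini"))
print(build_payload_current(True, model="gpt-4o-mini"))

# Tri-state variant: an explicit False is now passed through to the client.
print(build_payload_tristate(None, model="gpt-4o-mini"))
print(build_payload_tristate(False, model="gpt-4o-mini"))
```

The last call shows why the message calls the change technically breaking: a value of False, which is dropped today, would start appearing in requests. In this commit's tests the flag is switched on by adding `"stream_usage": True` to the chat model's constructor params.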
2026-06-09 10:17:00 +00:00 · 2025-03-26 15:16:37 -04:00
parent 56629ed87b
commit 8119a7bc5c
2 changed files with 41 additions and 52 deletions
--- a/libs/partners/openai/tests/integration_tests/chat_models/test_azure_standard.py
+++ b/libs/partners/openai/tests/integration_tests/chat_models/test_azure_standard.py
@@ -3,7 +3,6 @@
 import os
 from typing import Type

-import pytest
 from langchain_core.language_models import BaseChatModel
 from langchain_tests.integration_tests import ChatModelIntegrationTests

@@ -25,6 +24,7 @@ class TestAzureOpenAIStandard(ChatModelIntegrationTests):
            "model": "gpt-4o-mini",
            "openai_api_version": OPENAI_API_VERSION,
            "azure_endpoint": OPENAI_API_BASE,
+            "stream_usage": True,
        }

    @property
@@ -35,10 +35,6 @@ class TestAzureOpenAIStandard(ChatModelIntegrationTests):
    def supports_json_mode(self) -> bool:
        return True

-    @pytest.mark.xfail(reason="Not yet supported.")
-    def test_usage_metadata_streaming(self, model: BaseChatModel) -> None:
-        super().test_usage_metadata_streaming(model)
-

 class TestAzureOpenAIStandardLegacy(ChatModelIntegrationTests):
    """Test a legacy model."""
@@ -53,12 +49,9 @@ class TestAzureOpenAIStandardLegacy(ChatModelIntegrationTests):
            "deployment_name": os.environ["AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME"],
            "openai_api_version": OPENAI_API_VERSION,
            "azure_endpoint": OPENAI_API_BASE,
+            "stream_usage": True,
        }

    @property
    def structured_output_kwargs(self) -> dict:
        return {"method": "function_calling"}
-
-    @pytest.mark.xfail(reason="Not yet supported.")
-    def test_usage_metadata_streaming(self, model: BaseChatModel) -> None:
-        super().test_usage_metadata_streaming(model)