Support OpenAI reasoning summary streaming (#30909)

**langchain_openai: Support for reasoning summary streaming**

**Description:**
The OpenAI API now supports streaming reasoning summaries for reasoning
models (o1, o3, o3-mini, o4-mini). More info:
https://platform.openai.com/docs/guides/reasoning#reasoning-summaries

It is supported only in the Responses API (not the Completions API), so
the LangChain ChatOpenAI model must be created as follows to enable
reasoning summary streaming:

```
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="o4-mini",  # o1, o3, and o3-mini also support reasoning streaming
    use_responses_api=True,  # reasoning streaming works only with the Responses API, not the Completions API
    model_kwargs={
        "reasoning": {
            "effort": "high",  # "low" and "medium" are also supported
            "summary": "auto",  # some models support "concise", some "detailed"; "auto" always works
        }
    },
)
```

Now, if you stream events from the model:

```
async for event in llm.astream_events(prompt, version="v2"):
    print(event)
```

or

```
for chunk in llm.stream(prompt):
    print(chunk)
```

the OpenAI API will send new event types:
`response.reasoning_summary_text.added`
`response.reasoning_summary_text.delta`
`response.reasoning_summary_text.done`

These events are new, so they were previously ignored. I have added support
for them in `_convert_responses_chunk_to_generation_chunk`, so that reasoning
summary chunks (and the full reasoning summary) are added to the chunk's
`additional_kwargs`.
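
For illustration, a minimal sketch of the kind of branch this adds inside `_convert_responses_chunk_to_generation_chunk` (a simplification, not the verbatim implementation; the `additional_kwargs` keys follow the printing examples below):

```
# Hypothetical simplification of the new handling, not the merged code:
if chunk.type == "response.reasoning_summary_text.delta":
    # Stream each summary fragment as it arrives.
    additional_kwargs["reasoning_summary_chunk"] = chunk.delta
elif chunk.type == "response.reasoning_summary_text.done":
    # Emit the completed summary text for this reasoning step.
    additional_kwargs["reasoning_summary"] = chunk.text
```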

An example of how this reasoning summary can be printed:

```
async for event in llm.astream_events(prompt, version="v2"):
    if event["event"] == "on_chat_model_stream":
        chunk: AIMessageChunk = event["data"]["chunk"]
        if "reasoning_summary_chunk" in chunk.additional_kwargs:
            print(chunk.additional_kwargs["reasoning_summary_chunk"], end="")
        elif "reasoning_summary" in chunk.additional_kwargs:
            print("\n\nFull reasoning step summary:", chunk.additional_kwargs["reasoning_summary"])
        elif chunk.content and chunk.content[0]["type"] == "text":
            print(chunk.content[0]["text"], end="")
```

or

```
for chunk in llm.stream(prompt):
    if "reasoning_summary_chunk" in chunk.additional_kwargs:
        print(chunk.additional_kwargs["reasoning_summary_chunk"], end="")
    elif "reasoning_summary" in chunk.additional_kwargs:
        print("\n\nFull reasoning step summary:", chunk.additional_kwargs["reasoning_summary"])
    elif chunk.content and chunk.content[0]["type"] == "text":
        print(chunk.content[0]["text"], end="")
```
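
If you accumulate the streamed chunks, the merged message carries the complete summary under `additional_kwargs["reasoning"]`, as exercised by the integration test at the end of this PR. A minimal sketch:

```
full = None
for chunk in llm.stream(prompt):
    full = chunk if full is None else full + chunk

# The accumulated message holds the full summary with "id", "type",
# and "summary" keys; each summary block has "type" and "text".
for block in full.additional_kwargs["reasoning"]["summary"]:
    print(block["text"])
```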

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>

The new integration test added by this commit:

```
@@ -348,3 +348,31 @@ def test_file_search() -> None:
        full = chunk if full is None else full + chunk
    assert isinstance(full, AIMessageChunk)
    _check_response(full)


def test_stream_reasoning_summary() -> None:
    reasoning = {"effort": "medium", "summary": "auto"}
    llm = ChatOpenAI(
        model="o4-mini", use_responses_api=True, model_kwargs={"reasoning": reasoning}
    )
    message_1 = {"role": "user", "content": "What is 3^3?"}
    response_1: Optional[BaseMessageChunk] = None
    for chunk in llm.stream([message_1]):
        assert isinstance(chunk, AIMessageChunk)
        response_1 = chunk if response_1 is None else response_1 + chunk
    assert isinstance(response_1, AIMessageChunk)
    reasoning = response_1.additional_kwargs["reasoning"]
    assert set(reasoning.keys()) == {"id", "type", "summary"}
    summary = reasoning["summary"]
    assert isinstance(summary, list)
    for block in summary:
        assert isinstance(block, dict)
        assert isinstance(block["type"], str)
        assert isinstance(block["text"], str)
        assert block["text"]
    # Check we can pass back summaries
    message_2 = {"role": "user", "content": "Thank you."}
    response_2 = llm.invoke([message_1, response_1, message_2])
    assert isinstance(response_2, AIMessage)
```