mirror of
https://github.com/hwchase17/langchain.git
synced 2026-07-01 22:59:06 +00:00
OpenAI Chat Completions streaming has a v1 normalization gap when tool
calls are streamed.
When users opt into `output_version="v1"`, `.content_blocks` is expected
to be the normalized cross-provider view of the message. For OpenAI Chat
Completions streams, though, chunks still carry raw string `content`
plus side-channel `tool_call_chunks` / `tool_calls`.
Practically, an OpenAI stream chunk can look like this internally:
```python
AIMessageChunk(
    content="",
    tool_call_chunks=[
        {
            "name": "get_weather",
            "args": '{"location": "SF"}',
            "id": "call_123",
            "index": 0,
            "type": "tool_call_chunk",
        }
    ],
    response_metadata={"model_provider": "openai", "output_version": "v1"},
)
```
This differs from already-normalized v1 content, where the tool call
chunk lives inside `content` itself:
```python
AIMessageChunk(
    content=[
        {
            "type": "tool_call_chunk",
            "name": "get_weather",
            "args": '{"location": "SF"}',
            "id": "call_123",
            "index": 0,
        }
    ],
)
```
Because `.content_blocks` currently short-circuits solely on
`output_version="v1"`, it can return the raw string or empty list as-is
instead of running the OpenAI translator that incorporates
`tool_call_chunks` / `tool_calls` into normalized v1 blocks.
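The short-circuit described above can be sketched with plain dictionaries standing in for message chunks. This is only an illustration under stated assumptions: `translate_openai` and `content_blocks_short_circuit` are hypothetical names, not LangChain's actual functions, and the real translator handles many more cases.

```python
from typing import Any


def translate_openai(message: dict[str, Any]) -> list[dict[str, Any]]:
    """Hypothetical stand-in for the provider translator (string content only)."""
    blocks: list[dict[str, Any]] = []
    content = message["content"]
    if isinstance(content, str) and content:
        blocks.append({"type": "text", "text": content})
    # Fold the side-channel tool call chunks into the block list.
    blocks.extend(dict(chunk) for chunk in message.get("tool_call_chunks", []))
    return blocks


def content_blocks_short_circuit(message: dict[str, Any]) -> Any:
    """Assumed shape of the described behaviour: trust output_version alone."""
    if message["response_metadata"].get("output_version") == "v1":
        return message["content"]  # raw "" or [] is returned untouched
    return translate_openai(message)


chunk = {
    "content": "",
    "tool_call_chunks": [
        {
            "type": "tool_call_chunk",
            "name": "get_weather",
            "args": '{"location": "SF"}',
            "id": "call_123",
            "index": 0,
        }
    ],
    "response_metadata": {"model_provider": "openai", "output_version": "v1"},
}

leaked = content_blocks_short_circuit(chunk)  # the raw empty string
translated = translate_openai(chunk)          # one tool_call_chunk block
```

In this toy model the tool call is present on the chunk, yet the v1 view exposes only the empty string because the translator is never reached.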
In practice, a streamed OpenAI tool call can be parsed successfully into
`tool_calls`, but still be missing from the final aggregated
`.content_blocks`. Downstream code that consumes the v1 block interface
then sees no `tool_call` block and must know to inspect OpenAI-specific
chunk fields instead.
User story:
> As a LangChain user streaming OpenAI Chat Completions with bound tools
and `output_version="v1"`, I need the final aggregated message's
`.content_blocks` to include normalized `tool_call` blocks, so that code
written against the v1 content-block interface handles streamed tool
calls consistently across providers.
Expected final aggregated view:
```python
message.content_blocks == [
    {
        "type": "tool_call",
        "name": "get_weather",
        "args": {"location": "SF"},
        "id": "call_123",
    }
]
```
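To make the relationship between streamed `tool_call_chunk` fragments and the final `tool_call` block concrete, here is a minimal sketch. It assumes that fragments belonging to one call share an `index` and that their `args` strings concatenate into valid JSON; `finalize_tool_calls` is a hypothetical helper, not LangChain's merge logic.

```python
import json
from typing import Any


def finalize_tool_calls(chunks: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge streamed fragments by index, then parse the JSON arguments."""
    merged: dict[int, dict[str, Any]] = {}
    for chunk in chunks:
        slot = merged.setdefault(chunk["index"], {"name": "", "args": "", "id": None})
        slot["name"] += chunk.get("name") or ""
        slot["args"] += chunk.get("args") or ""
        slot["id"] = slot["id"] or chunk.get("id")
    return [
        {
            "type": "tool_call",
            "name": slot["name"],
            "args": json.loads(slot["args"] or "{}"),
            "id": slot["id"],
        }
        for _, slot in sorted(merged.items())
    ]


# Three fragments of a single call, as they might arrive over a stream.
streamed = [
    {"name": "get_weather", "args": "", "id": "call_123", "index": 0},
    {"name": None, "args": '{"location":', "id": None, "index": 0},
    {"name": None, "args": ' "SF"}', "id": None, "index": 0},
]

final_blocks = finalize_tool_calls(streamed)
```

With these inputs, `final_blocks` has the same shape as the expected aggregated view: a single `tool_call` block whose `args` is a parsed dictionary instead of a partial JSON string.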
Root causes:
1. The usage-only Chat Completions chunk uses `content=[]` in v1 mode
while normal streaming chunks use `content=""`, creating inconsistent
content types during chunk aggregation.
2. `AIMessage.content_blocks` and `AIMessageChunk.content_blocks` treat
any `output_version="v1"` message as already-normalized, even when
`content` is still raw string content from Chat Completions.
3. Content-bearing OpenAI stream chunks do not carry
`output_version="v1"`, so the final merged chunk may not reliably take
the v1 normalization path.
Changes:
- Keep usage-only Chat Completions chunks as `content=""` instead of
overriding to `[]`, so streaming chunks merge consistently.
- Propagate `output_version="v1"` to content-bearing chunks.
- Only short-circuit v1 `.content_blocks` when `content` is already a
list of blocks; otherwise fall through to the provider translator.
- Add regression tests covering string-content v1 fallback, usage-only
chunk content consistency, and streamed tool calls appearing as
normalized final v1 blocks.
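The short-circuit change can be sketched as a type check placed in front of the early return. This is a simplified model using plain dictionaries; `content_blocks_guarded` and `fake_translate` are hypothetical names for illustration, and the actual property and translator in LangChain are not reproduced here.

```python
from typing import Any, Callable

Blocks = list[dict[str, Any]]


def content_blocks_guarded(
    message: dict[str, Any],
    translate: Callable[[dict[str, Any]], Blocks],
) -> Blocks:
    """Short-circuit only when v1 content is already a list of blocks."""
    is_v1 = message.get("response_metadata", {}).get("output_version") == "v1"
    if is_v1 and isinstance(message["content"], list):
        return message["content"]
    # String content falls through to the provider translator.
    return translate(message)


def fake_translate(message: dict[str, Any]) -> Blocks:
    """Hypothetical translator: lifts tool_call_chunks into blocks."""
    return [dict(chunk) for chunk in message.get("tool_call_chunks", [])]


raw_chunk = {
    "content": "",
    "tool_call_chunks": [
        {
            "type": "tool_call_chunk",
            "name": "get_weather",
            "args": '{"location": "SF"}',
            "id": "call_123",
            "index": 0,
        }
    ],
    "response_metadata": {"output_version": "v1"},
}
normalized = {
    "content": [{"type": "text", "text": "hi"}],
    "response_metadata": {"output_version": "v1"},
}

from_raw = content_blocks_guarded(raw_chunk, fake_translate)
from_normalized = content_blocks_guarded(normalized, fake_translate)
```

Checking the type of `content` keeps the fast path for messages that really are normalized, while string content from Chat Completions still reaches the translator.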