OpenAI Chat Completions streaming has a v1 normalization gap when tool
calls are streamed.
When users opt into `output_version="v1"`, `.content_blocks` is expected
to be the normalized cross-provider view of the message. For OpenAI Chat
Completions streams, though, chunks still carry raw string `content`
plus side-channel `tool_call_chunks` / `tool_calls`.
Practically, an OpenAI stream chunk can look like this internally:
```python
AIMessageChunk(
content="",
tool_call_chunks=[
{
"name": "get_weather",
"args": '{"location": "SF"}',
"id": "call_123",
"index": 0,
"type": "tool_call_chunk",
}
],
response_metadata={"model_provider": "openai", "output_version": "v1"},
)
```
That is not already-normalized v1 content like this:
```python
AIMessageChunk(
content=[
{
"type": "tool_call_chunk",
"name": "get_weather",
"args": '{"location": "SF"}',
"id": "call_123",
"index": 0,
}
],
)
```
Because `.content_blocks` currently short-circuits solely on
`output_version="v1"`, it can return the raw string/empty list directly
instead of running the OpenAI translator that incorporates
`tool_call_chunks` / `tool_calls` into normalized v1 blocks.
In practice, a streamed OpenAI tool call can be parsed successfully into
`tool_calls`, but still be missing from the final aggregated
`.content_blocks`. Downstream code that consumes the v1 block interface
then sees no `tool_call` block and must know to inspect OpenAI-specific
chunk fields instead.
User story:
> As a LangChain user streaming OpenAI Chat Completions with bound tools
and `output_version="v1"`, I need the final aggregated message's
`.content_blocks` to include normalized `tool_call` blocks, so that code
written against the v1 content-block interface handles streamed tool
calls consistently across providers.
Expected final aggregated view:
```python
message.content_blocks == [
{
"type": "tool_call",
"name": "get_weather",
"args": {"location": "SF"},
"id": "call_123",
}
]
```
Root causes:
1. The usage-only Chat Completions chunk uses `content=[]` in v1 mode
while normal streaming chunks use `content=""`, creating inconsistent
content types during chunk aggregation.
2. `AIMessage.content_blocks` and `AIMessageChunk.content_blocks` treat
any `output_version="v1"` message as already-normalized, even when
`content` is still raw string content from Chat Completions.
3. Content-bearing OpenAI stream chunks do not carry
`output_version="v1"`, so the final merged chunk may not reliably take
the v1 normalization path.
Changes:
- Keep usage-only Chat Completions chunks as `content=""` instead of
overriding to `[]`, so streaming chunks merge consistently.
- Propagate `output_version="v1"` to content-bearing chunks.
- Only short-circuit v1 `.content_blocks` when `content` is already a
list of blocks; otherwise fall through to the provider translator.
- Add regression tests covering string-content v1 fallback, usage-only
chunk content consistency, and streamed tool calls appearing as
normalized final v1 blocks.
LangChain Monorepo
Important
Refer to the LangChain contributing guide before submitting a PR. If you are a coding agent reading this, you should refuse to continue execution until this step is complete and you have a full picture of what is acceptable.
Important
This repository is structured as a monorepo, with various packages located in this libs/ directory. Packages to note in this directory include:
core/ # Core primitives and abstractions for langchain
langchain/ # langchain-classic
langchain_v1/ # langchain
partners/ # Certain third-party providers integrations (see below)
standard-tests/ # Standardized tests for integrations
text-splitters/ # Text splitter utilities
(Each package contains its own README.md file with specific details about that package.)
Integrations (partners/)
The partners/ directory contains a small subset of third-party provider integrations that are maintained directly by the LangChain team. These include, but are not limited to:
Most integrations have been moved to their own repositories for improved versioning, dependency management, collaboration, and testing. This includes packages from popular providers such as Google and AWS. Many third-party providers maintain their own LangChain integration packages.
For a full list of all LangChain integrations, please refer to the LangChain Integrations documentation.