Supersedes #34727
Closes #30703
Related:
* langchain-ai/langchain-google#1460
* langchain-ai/langchain-google#1501
This fixes the problem at the `langchain-core` callback layer rather than
normalizing inside individual provider integrations, so structured
streaming content is preserved consistently.
---
Models are increasingly streaming structured content blocks instead of
plain text tokens. For example, Gemini 3 can stream text as
content-block lists, and Anthropic/tool-use flows can also produce
non-text message content. Today those values already reach
`on_llm_new_token`, but the callback API still advertises `token: str`,
which makes custom callbacks, tracers, and streaming helpers assume
every streamed value is text.
User story: as a LangChain user building a streaming callback for chat
models with tool calls, reasoning/thinking blocks, or provider-specific
structured content, I need `on_llm_new_token` to accept the same content
shape that chat model chunks can actually emit, so my callback can
observe the stream without providers flattening or dropping non-text
data.
Fixing this in `langchain-core` makes the existing runtime behavior
explicit at the shared callback boundary. Normalizing content blocks
inside each provider would duplicate logic, produce inconsistent
behavior across integrations, and in some cases lose required provider
metadata such as Gemini thought signatures.
## Changes
- Update the callback contract so streamed tokens can be either plain
text or structured content blocks
- Carry structured streamed content through tracing and event/log
streaming paths without forcing provider data into text too early
- Keep built-in text-oriented streaming callbacks working by converting
structured tokens only at the display/queue boundary
- Drop the now-incorrect `cast("str", ...)` on streamed content in
`BaseChatModel` so the producer side matches the widened callback
signature instead of asserting a string it doesn't always have (no
runtime change — `cast` is erased)
- Align Anthropic and Mistral content typing with the structured content
shapes already used by chat model messages
- Update callback tests to reflect that not every streamed value is text
## Compatibility
No runtime behavior change: no producer emits anything it wasn't already
emitting, and widening a parameter type is safe for existing callers and
handlers that pass or receive `str`. The one caveat is downstream code
that subclasses a callback handler or tracer and overrides
`on_llm_new_token` with a `token: str` annotation — under strict type
checking that override is now narrower than the base and will be flagged
as incompatible with the supertype. Such code still runs unchanged; the
fix is to widen the annotation to match.
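A minimal sketch of converting structured tokens only at the display boundary in a custom handler. The `StreamedToken` alias, the `token_to_text` helper, and the `{"type": "text", "text": ...}` block shape are assumptions for illustration, not `langchain-core` APIs.

```python
from typing import Any, Union

# Hypothetical alias for the widened token type: plain text or a list of
# content blocks (strings or dicts).
StreamedToken = Union[str, list[Union[str, dict[str, Any]]]]


def token_to_text(token: StreamedToken) -> str:
    """Flatten a streamed token to display text, skipping non-text blocks."""
    if isinstance(token, str):
        return token
    parts: list[str] = []
    for block in token:
        if isinstance(block, str):
            parts.append(block)
        elif block.get("type") == "text":
            # Assumed block shape: {"type": "text", "text": "..."}
            parts.append(str(block.get("text", "")))
    return "".join(parts)
```

A handler that overrides `on_llm_new_token` could call a helper like this right before printing, while still passing the original structured value on to anything that needs the non-text blocks.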
`Runnable.__or__`, `Runnable.__ror__`, and their `RunnableSequence` and
`StructuredPrompt` overrides previously erased composition types: the
right-hand operand was typed `Runnable[Any, Other]`, so piping two
runnables together always produced `RunnableSerializable[Input, Any]`.
Type information was lost at every `|`, which is why chains so often
needed a `chain: Runnable = ...` annotation just to recover usable
inference.
This adds `@overload`s so the `Output` of one step flows into the
`Input` of the next and the composed result carries the real `Output`
type through. `Runnable[int, str] | Runnable[str, float]` now infers
`RunnableSerializable[int, float]` instead of `[int, Any]`.
`coerce_to_runnable` gains overloads so a `Mapping` resolves to
`RunnableParallel` while everything else stays a `Runnable`. As a
knock-on effect, dozens of now-unnecessary `: Runnable` annotations were
dropped from the test suite.
Runtime behavior is unchanged — this is a typing-only change.
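To illustrate the idea of tying the right operand's input to the left operand's output, here is a stdlib-only sketch. `Step` is a hypothetical stand-in for a runnable, not a `langchain-core` class, and it uses a single generic `__or__` rather than the `@overload`s this change adds.

```python
from typing import Callable, Generic, TypeVar

In = TypeVar("In")
Out = TypeVar("Out")
Other = TypeVar("Other")


class Step(Generic[In, Out]):
    """Hypothetical stand-in for a runnable; illustration only."""

    def __init__(self, func: Callable[[In], Out]) -> None:
        self.func = func

    def invoke(self, value: In) -> Out:
        return self.func(value)

    def __or__(self, other: "Step[Out, Other]") -> "Step[In, Other]":
        # The right operand's input type is this step's output type, so a
        # checker can carry `Other` through instead of falling back to Any.
        return Step(lambda value: other.invoke(self.invoke(value)))


count: Step[str, int] = Step(len)
scale: Step[int, float] = Step(lambda n: n * 1.5)
chain = count | scale  # a type checker should infer Step[str, float]
```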
## Impact on type-checked code
Most users will simply get better inference. Two changes can require a
small adjustment if you run a type checker (`mypy`, `pyright`):
### Stricter operand matching in `|`
The right-hand side of `|` is now typed `Runnable[Output, Other]` rather
than `Runnable[Any, Other]`, so the right operand's declared **input**
must match the left operand's **output**. This is more accurate, but it
surfaces a common pattern that was previously silent: piping a step that
outputs a plain `dict` into a step whose declared input is a more
specific type (for example a `TypedDict`). It still works at runtime;
the checker now reports an `[operator]` error.
If you hit this, narrow the boundary with a `cast` (or an explicit
annotation):
```python
from typing import Any, cast
from langchain_core.runnables import Runnable
# upstream outputs a dict; downstream declares a narrower input type
chain = cast("Runnable[Any, MyInput]", upstream) | downstream
```
### `list` → `Sequence` on `RunnableEach` / `map()`
`Runnable.map()` and the `invoke` / `ainvoke` methods of `RunnableEach`
now accept `Sequence[Input]` instead of `list[Input]`. Callers are
unaffected — a `list` is a `Sequence`, and tuples or other sequences now
type-check too. The only thing to adjust: if you **subclass**
`RunnableEach` (or `RunnableEachBase`) and override these methods with a
`list[...]` parameter, widen the annotation to `Sequence[...]` so the
override stays compatible with the base signature.
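A small sketch of the override rule, using hypothetical classes in place of `RunnableEach`. The parameter annotation on the subclass must be at least as wide as the one on the base.

```python
from collections.abc import Sequence


class EachBase:
    """Hypothetical base whose method accepts any sequence."""

    def invoke(self, inputs: Sequence[int]) -> list[int]:
        return [value + 1 for value in inputs]


class DoublingEach(EachBase):
    # Annotating `inputs: list[int]` here would narrow the parameter and be
    # flagged by a strict checker; `Sequence[int]` keeps the override
    # compatible with the base signature.
    def invoke(self, inputs: Sequence[int]) -> list[int]:
        return [value * 2 for value in inputs]
```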
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Simplify test for `create_agent` errors.
* Remove duplicate tests
* Test sync and async with common logic
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
In this order:
* used `@override` when overriding a parent method.
* prefixed param with `_` when the param could be renamed.
* used `*_args, **_kwargs` when it was not possible to rename (e.g. a
protocol)
* used `_ = some_variable` when the variable name is inspected (in
tools)
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Automated refresh of model profile data for all in-monorepo partner
integrations via `langchain-profiles refresh`.
🤖 Generated by the `refresh_model_profiles` workflow.
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
`handle_tool_error` callables can now return structured message content
as any valid sequence, not just a mutable `list`. Valid structured
sequences are normalized to the `ToolMessage` content shape at the tool
output boundary, while invalid content still falls back to
stringification.
## Changes
- Widened `ToolExceptionHandlerOutput` from `list[str | dict[str, Any]]`
to `Sequence[MessageContentBlock]` so handlers returning `list[dict[str,
Any]]` or tuple content blocks type-check cleanly.
- Added `_normalize_message_content` to validate structured message
content and convert valid non-string sequences to the `list` shape
expected by `ToolMessage`.
- Preserved existing stringification behavior for invalid structured
content blocks instead of treating failed normalization as `None`.
- Removed the now-unused `_is_message_content_type` helper; output
formatting validates content directly through
`_normalize_message_content`.
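The normalization described above could look roughly like the sketch below. The name `normalize_content` and the validity rule (every item is a `str` or a `dict`) are assumptions for illustration; the real `_normalize_message_content` may validate differently.

```python
from collections.abc import Sequence
from typing import Any, Union


def normalize_content(value: Any) -> Union[str, list[Union[str, dict[str, Any]]]]:
    """Return str as-is, valid block sequences as a list, anything else stringified."""
    if isinstance(value, str):
        return value
    is_block_sequence = (
        isinstance(value, Sequence)
        and not isinstance(value, (bytes, bytearray))
        # Assumed validity rule: each item is text or a dict-shaped block.
        and all(isinstance(item, (str, dict)) for item in value)
    )
    if is_block_sequence:
        return list(value)
    # Invalid structured content falls back to stringification.
    return str(value)
```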
`handle_tool_error` callables can already return structured message
content at runtime, but the public typing only allowed strings. The tool
error handling API now reflects the existing output formatting path,
including clearer docs for how handled errors become
`ToolMessage(status="error")` results.
`SummarizationMiddleware._trigger_conditions` is now explicitly marked
as a temporary compatibility view for private consumers. The regression
test is tied to the package major version so the 2.0 release path fails
loudly until the legacy attr and test are removed.
`SummarizationMiddleware` now uses `_trigger_clauses` as the canonical
internal representation for AND/OR trigger evaluation while keeping
`_trigger_conditions` as a tuple-shaped compatibility view. This keeps
the new dict-style `TriggerClause` behavior intact without breaking
private consumers that still inspect the old tuple-normalized trigger
state.
## Changes
- Added `_trigger_clauses` as the source of truth for summarization
trigger evaluation, profile requirement checks, and compound AND clause
handling.
- Restored `_trigger_conditions` as a legacy compatibility projection
for tuple-expressible triggers, so tuple and single-key dict triggers
remain visible in the previous private shape.
- Avoided misrepresenting compound `TriggerClause` inputs like
`{"tokens": 1000, "messages": 5}` as independent OR-style tuple
conditions.
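One way to picture the compatibility projection is the sketch below. It assumes that compound clauses are simply left out of the legacy view, which the description implies but does not state; `legacy_conditions` is a hypothetical name.

```python
from typing import Union

Number = Union[int, float]
Clause = dict[str, Number]


def legacy_conditions(clauses: list[Clause]) -> list[tuple[str, Number]]:
    """Project single-key clauses to the old (kind, value) tuple shape."""
    view: list[tuple[str, Number]] = []
    for clause in clauses:
        if len(clause) == 1:
            ((key, value),) = clause.items()
            view.append((key, value))
        # Compound AND clauses have no faithful tuple form, so this sketch
        # omits them rather than splitting them into OR-style conditions.
    return view
```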
Closes #34442
[Docs](https://github.com/langchain-ai/docs/pull/4377)
---
Add parity with LangChain.js trigger semantics for Python
`SummarizationMiddleware`. `trigger` can now express AND conditions
within a single dict-style `TriggerClause` while preserving the existing
tuple and list-of-tuples behavior.
A simple user story: a support agent is helping debug an issue over a
long conversation. One tool call may return a large log snippet, briefly
pushing the token count over a limit, but the conversation is still only
a few messages long and the recent context is valuable. Separately, the
user may send many short follow-up messages that increase message count
without using much context.
With `trigger={"tokens": 4000, "messages": 10}`, both thresholds must be
met at the same time: at least 4,000 tokens and at least 10 messages.
This means 5,000 tokens across only 3 messages does not summarize, and
20 short messages totaling only 1,000 tokens does not summarize either.
Summarization waits until the conversation is large enough by both
measures, making it less likely to discard useful recent context too
early.
## Changes
- Add `TriggerClause` support so `trigger={"tokens": 4000, "messages":
10}` only summarizes when all configured thresholds are met
- Export `TriggerClause` from `langchain.agents.middleware` so users can
import and annotate dict-style trigger clauses from the public
middleware entrypoint
- Normalize tuple and mapping trigger inputs through
`_normalize_trigger`, preserving existing `ContextSize` tuple semantics
as single-condition clauses
- Defensively copy mutable trigger list and dict inputs during
initialization so caller-side mutations do not change the middleware's
stored public configuration after construction
- Keep list inputs as OR semantics across clauses, including mixed lists
like `[{"tokens": 4000, "messages": 10}, ("messages", 50)]`
- Update `_should_summarize` to evaluate AND within each clause and OR
across clauses for `tokens`, `messages`, and `fraction`
- Update the docs and API link map so `TriggerClause` resolves in the
Python middleware docs
- Preserve tuple-trigger compatibility while allowing message-based
`keep` configurations to summarize at least one message when a trigger
fires near the cutoff boundary
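The AND-within-clause, OR-across-clauses rule can be sketched as follows. `normalize` and `should_summarize` are hypothetical stand-ins for `_normalize_trigger` and `_should_summarize`, and the `usage` dict of current counts is an assumption about how the measurements are supplied.

```python
from typing import Union

Number = Union[int, float]
Clause = dict[str, Number]
Trigger = Union[Clause, tuple[str, Number]]


def normalize(trigger: Trigger) -> Clause:
    """Turn a (kind, value) tuple into a single-condition clause."""
    if isinstance(trigger, tuple):
        key, value = trigger
        return {key: value}
    return dict(trigger)  # copy so caller-side mutation has no effect


def should_summarize(triggers: list[Trigger], usage: dict[str, Number]) -> bool:
    """True when any clause has all of its thresholds met."""
    return any(
        all(usage.get(key, 0) >= threshold for key, threshold in normalize(t).items())
        for t in triggers
    )
```

With `[{"tokens": 4000, "messages": 10}]`, a conversation of 5,000 tokens across 3 messages does not trigger, matching the user story above.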
AI assistance was used to help draft and refine this contribution.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Clarifies how `get_buffer_string` treats multimodal message content
across output formats. The docs now make the default prefix format's
text-only behavior explicit and point users to XML when they need
structured multimodal block representations.
This behavior may change in future iterations.
## Summary
Follow-up to #37911 (released in `langchain-perplexity` 1.3.2). That PR
fixed the outbound `ToolMessage` / `AIMessage.tool_calls` serialization;
this one implements **`ChatPerplexity.bind_tools`**, which flips
`has_tool_calling` to `True` and lights up the full `langchain-tests`
standard tool-calling suite — the suite that would have caught #37911 in
the first place.
Verified live against the Perplexity Agent API (`openai/gpt-5.5`,
`use_responses_api=True`): a client-side function-tool round-trip
(invoke + stream) works end-to-end.
## Core change (the `bind_tools` work + the Responses-API follow-up)
- **`bind_tools`** mirrors `langchain-openai`: converts tools via
`convert_to_openai_tool`, normalizes `tool_choice`, and passes
Perplexity built-in tools (`web_search`, etc.) through unchanged.
- **`_to_responses_payload`** now translates tool turns into the
Responses (Agent) API's typed input items: `AIMessage.tool_calls` →
`function_call`, `ToolMessage` → `function_call_output`, and flattens
function tool specs. (The Responses API has no `tool` role, so this
translation is required for round-trips.)
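A rough sketch of the translation, using plain dicts. The field names (`call_id`, `arguments`, `output`) and the flattened tool layout are assumptions modeled on the OpenAI Responses API item shapes; the helper names are hypothetical and this is not the actual `_to_responses_payload` code.

```python
import json
from typing import Any


def tool_call_to_item(tool_call: dict[str, Any]) -> dict[str, Any]:
    """Map one AIMessage-style tool call to a `function_call` input item."""
    return {
        "type": "function_call",
        "call_id": tool_call["id"],
        "name": tool_call["name"],
        "arguments": json.dumps(tool_call["args"]),
    }


def tool_result_to_item(tool_call_id: str, content: str) -> dict[str, Any]:
    """Map a ToolMessage-style result to a `function_call_output` item."""
    return {"type": "function_call_output", "call_id": tool_call_id, "output": content}


def flatten_tool_spec(tool: dict[str, Any]) -> dict[str, Any]:
    """Lift the nested `function` spec to the top level; pass others through."""
    if tool.get("type") == "function" and "function" in tool:
        return {"type": "function", **tool["function"]}
    return tool
```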
## Changes required to make standard-suite tests pass on the Responses route
- Streaming: `_convert_responses_stream_event_to_chunk` emits a
`tool_call_chunk` on `response.output_item.done` function calls —
required by `test_tool_calling` (which streams and asserts tool calls).
- `_content_to_text` reduces list-shaped assistant content to text in
the tool-call branch — required by `test_agent_loop` and
`test_tool_message_histories_list_content`.
- `response_metadata["model_name"]` on the Responses route, mirroring
Chat Completions — required by `test_usage_metadata` /
`test_usage_metadata_streaming` (used by `langchain_core` usage
callbacks).
## Tests
- `sonar` standard class marked `has_tool_calling=False` (the family
returns 400 "Tool calling is not supported for this model").
- New `TestPerplexityResponsesStandard` runs the full suite on
`openai/gpt-5.5` + `use_responses_api` with `has_tool_choice=False`:
**35 passed, 13 skipped, 2 xfailed**.
- The 2 xfails (`test_unicode_tool_call_integration`,
`test_structured_few_shot_examples`) hard-code `tool_choice="any"`. The
Responses (Agent) API does not support `tool_choice` (verified: every
form returns HTTP 200 without forcing a call), which `ChatPerplexity`
surfaces as `ValueError` — **existing behavior, unchanged here.**
Softening that to a warning can be a separate change.
`make format lint` clean; unit + standard tests green.
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Automated refresh of model profile data for all in-monorepo partner
integrations via `langchain-profiles refresh`.
🤖 Generated by the `refresh_model_profiles` workflow.
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
`dict()` is a problematic method name as it clashes with the builtin
`dict` used as a type annotation.
This PR replaces it with an `asdict` method (inspired by dataclasses).
It also fixes a few places where `dict` must be replaced by
`builtins.dict` until the `dict()` method is removed.
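A short sketch of the clash and the workaround. `Settings` is a hypothetical class: once a method named `dict` exists in the class body, later annotations that say `dict[...]` can resolve to that method, so `builtins.dict` is spelled out explicitly.

```python
import builtins
from typing import Any


class Settings:
    """Hypothetical class showing `asdict` alongside a legacy `dict` method."""

    def __init__(self, name: str, retries: int) -> None:
        self.name = name
        self.retries = retries

    def asdict(self) -> builtins.dict[str, Any]:
        return {"name": self.name, "retries": self.retries}

    def dict(self) -> builtins.dict[str, Any]:
        # Legacy spelling kept for compatibility; delegates to `asdict`.
        return self.asdict()

    # From here on, a bare `dict` in this class body names the method above,
    # so annotations must use `builtins.dict` until the method is removed.
    def merged(self, extra: builtins.dict[str, Any]) -> builtins.dict[str, Any]:
        return {**self.asdict(), **extra}
```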
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
The custom VCR serializer pipes the cassette dict through
`yaml.safe_dump`, which raises on stream objects — so any request with
an `io.BytesIO` body (multipart/file-upload endpoints) couldn't be
recorded. A new `_coerce_bytesio` helper walks the cassette and replaces
each `BytesIO` with its raw bytes before dumping.
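A minimal sketch of such a walk, assuming the cassette is made of nested dicts and lists. This `coerce_bytesio` returns a converted copy; the actual `_coerce_bytesio` helper may differ in details such as in-place mutation.

```python
import io
from typing import Any


def coerce_bytesio(value: Any) -> Any:
    """Recursively replace io.BytesIO objects with their raw bytes."""
    if isinstance(value, io.BytesIO):
        # getvalue() returns the full buffer regardless of the read position.
        return value.getvalue()
    if isinstance(value, dict):
        return {key: coerce_bytesio(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        # Tuples come back as lists in this sketch.
        return [coerce_bytesio(item) for item in value]
    return value
```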
Automated refresh of model profile data for all in-monorepo partner
integrations via `langchain-profiles refresh`.
🤖 Generated by the `refresh_model_profiles` workflow.
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Fixes #37912
`ChatPerplexity._convert_message_to_dict` raises `TypeError` on
`ToolMessage` and drops `AIMessage.tool_calls`, which breaks
tool-message round-trips through `ChatPerplexity` — a client-side
tool-calling loop, or a shared message history across providers via
`RunnableWithFallbacks`.
Repro:
```python
from langchain_perplexity import ChatPerplexity
from langchain_core.messages import ToolMessage
ChatPerplexity(model="sonar")._convert_message_to_dict(
    ToolMessage(content="result", tool_call_id="call_1")
)
# TypeError: Got unknown type content='result' tool_call_id='call_1'
```
An `AIMessage` carrying `tool_calls` also serializes to `{"role":
"assistant", "content": ...}` with the `tool_calls` silently dropped.
This brings the converter to parity with `langchain-openai`: serialize
`tool_calls` / `invalid_tool_calls`, send `content` as `null` when
tool_calls are present, and add a `tool`-role branch for `ToolMessage`.
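The target wire shape can be sketched without the library. `AssistantTurn`, `ToolResult`, and `to_chat_dict` are hypothetical stand-ins for `AIMessage`, `ToolMessage`, and the converter, and the dict layout follows the OpenAI-style Chat Completions format that this PR aims to match.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class AssistantTurn:
    content: str
    tool_calls: list[dict[str, Any]] = field(default_factory=list)


@dataclass
class ToolResult:
    content: str
    tool_call_id: str


def to_chat_dict(message: Any) -> dict[str, Any]:
    """Serialize a stand-in message to a Chat Completions style dict."""
    if isinstance(message, ToolResult):
        return {
            "role": "tool",
            "tool_call_id": message.tool_call_id,
            "content": message.content,
        }
    if isinstance(message, AssistantTurn):
        payload: dict[str, Any] = {"role": "assistant", "content": message.content}
        if message.tool_calls:
            payload["tool_calls"] = [
                {
                    "id": call["id"],
                    "type": "function",
                    "function": {
                        "name": call["name"],
                        "arguments": json.dumps(call["args"]),
                    },
                }
                for call in message.tool_calls
            ]
            # Content is sent as null when tool calls are present.
            payload["content"] = None
        return payload
    raise TypeError(f"Got unknown type {message!r}")
```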
How I verified: added unit tests for the `ToolMessage` and
`AIMessage.tool_calls` / `invalid_tool_calls` cases; the perplexity
package unit tests, lint, and format all pass.
Scope: translating these to the Responses (Agent) API's `function_call`
/ `function_call_output` input items is a separate follow-up; this PR is
the Chat Completions serialization parity fix.
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Removes the built-in Bedrock class init validator from `load` so Bedrock
kwargs such as `base_url` and `endpoint_url` are no longer specially
rejected during deserialization.
This keeps provider-specific SSRF policy out of core; callers should
continue to avoid untrusted manifests or use restrictive
`allowed_objects`.
Verified with `make format`, `make lint`, and the focused serialization
load unit tests.
AI-assisted contribution by Open SWE.
Made by [Open SWE](https://openswe.vercel.app)
---------
Co-authored-by: open-swe[bot] <215916821+open-swe[bot]@users.noreply.github.com>
Automated refresh of model profile data for all in-monorepo partner
integrations via `langchain-profiles refresh`.
🤖 Generated by the `refresh_model_profiles` workflow.
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Rewrote the `ModelProfile` docstrings to point readers at canonical
docs. The class docstring now explains how profiles are accessed and
where the data comes from, and several terse field docstrings gain a
one-line clarification or a link to the relevant guide.
Test targets run with `-n auto`, which makes `pytest-benchmark` (present
via `langchain-tests`) auto-disable itself and emit a
`PytestBenchmarkWarning` once per xdist worker. Passing
`--benchmark-disable` turns the plugin off explicitly so the warning
never fires, matching what `core` and `langchain_v1` already do.
## Changes
- Add `--benchmark-disable` to the `-n auto` test targets across
`langchain` (unit) and 14 partner packages' integration targets:
`anthropic`, `chroma`, `deepseek`, `exa`, `fireworks`, `groq`,
`huggingface`, `mistralai`, `nomic`, `ollama`, `openai`, `openrouter`,
`qdrant`, `xai`.
- Deliberately excluded `text-splitters` and `model-profiles`: their
`test` group doesn't install `pytest-benchmark`, so the flag would fail
with `unrecognized arguments`. Verified by importing the plugin under
each package's actual dependency group before editing.
aiohttp **3.14.0** (released 2026-06-01) removed
`aiohttp.streams.AsyncStreamReaderMixin`. The current release of `vcrpy`
(8.1.1) still subclasses it in its aiohttp stub:
```python
class MockStream(asyncio.StreamReader, streams.AsyncStreamReaderMixin):
```
As a result, the VCR fixture fails to import during test setup, and the
scheduled integration tests for the `fireworks` and `xai` partners error
out at `test_stream_time`:
```
AttributeError: module 'aiohttp.streams' has no attribute 'AsyncStreamReaderMixin'
```
Only these two partners are affected because their SDKs use `aiohttp`,
so VCR loads its aiohttp stub; partners on `httpx` (e.g. `anthropic`,
`openai`) are unaffected.
This is an upstream incompatibility tracked at kevin1024/vcrpy#995, with
a fix in kevin1024/vcrpy#996 that has not yet been released.
Until a fixed `vcrpy` ships, this caps `aiohttp<3.14.0` via `[tool.uv]
constraint-dependencies` (the same mechanism already used for the
`pygments` CVE pin) in both packages. Using a constraint rather than
tightening the published `aiohttp>=3.9.1,<4.0.0` range means the change
only affects local/CI resolution — the package metadata users install
against is untouched. Both lockfiles re-resolve to `aiohttp` 3.13.5,
which still provides the removed attribute. A code comment points back
to the upstream fix so the pin can be removed once it lands.
Automated refresh of model profile data for all in-monorepo partner
integrations via `langchain-profiles refresh`.
🤖 Generated by the `refresh_model_profiles` workflow.
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
## Summary
- Bump `langsmith` in `libs/partners/huggingface/uv.lock` from 0.7.31 to
0.8.9
- Bump `langchain-classic` in `libs/langchain_v1/uv.lock` from 1.0.0 to
1.0.7
- Bump transitive `langsmith` in `libs/langchain_v1/uv.lock` from 0.8.0
to 0.8.9
## Notes
- The open Chroma alert currently has no patched version in GitHub
Dependabot metadata. `chromadb` is already at 1.5.9 on `master`, and
PyPI shows 1.5.9 as the latest available release.
## Testing
- `uv lock --check` in `libs/partners/huggingface`
- `uv lock --check` in `libs/langchain_v1`
- `uv lock --check` in `libs/partners/chroma`
- `git diff --check`
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: John Kennedy <65985482+jkennedyvz@users.noreply.github.com>