`Runnable.__or__`, `Runnable.__ror__`, and their `RunnableSequence` and
`StructuredPrompt` overrides previously erased composition types: the
right-hand operand was typed `Runnable[Any, Other]`, so piping two
runnables together always produced `RunnableSerializable[Input, Any]`.
Type information was lost at every `|`, which is why chains so often
needed a `chain: Runnable = ...` annotation just to recover usable
inference.
This adds `@overload`s so the `Output` of one step flows into the
`Input` of the next and the composed result carries the real `Output`
type through. `Runnable[int, str] | Runnable[str, float]` now infers
`RunnableSerializable[int, float]` instead of `[int, Any]`.
`coerce_to_runnable` gains overloads so a `Mapping` resolves to
`RunnableParallel` while everything else stays a `Runnable`. As a
knock-on effect, dozens of now-unnecessary `: Runnable` annotations were
dropped from the test suite.
Runtime behavior is unchanged — this is a typing-only change.
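
To illustrate the idea of threading `Output` into the next step's `Input`, here is a minimal sketch using a toy `Step` class. `Step`, `int_to_text`, and `text_to_float` are hypothetical names for illustration only; the sketch uses a single generic `__or__` signature and does not reproduce the actual `@overload`s added to `Runnable`.

```python
from typing import Callable, Generic, TypeVar

In = TypeVar("In")
Out = TypeVar("Out")
Other = TypeVar("Other")


class Step(Generic[In, Out]):
    """Toy stand-in for a runnable; not the langchain_core class."""

    def __init__(self, func: Callable[[In], Out]) -> None:
        self.func = func

    def invoke(self, value: In) -> Out:
        return self.func(value)

    def __or__(self, other: "Step[Out, Other]") -> "Step[In, Other]":
        # The right operand's input type is tied to this step's output type,
        # so the composed step keeps a precise output type instead of Any.
        return Step(lambda value: other.invoke(self.invoke(value)))


def int_to_text(number: int) -> str:
    return str(number)


def text_to_float(text: str) -> float:
    return float(text)


# A type checker should infer Step[int, float] here without an annotation.
chain = Step(int_to_text) | Step(text_to_float)
```

Calling `chain.invoke(3)` runs both steps in order; swapping the operands would be flagged by a checker because `float` does not match the declared `int` input.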
## Impact on type-checked code
Most users will simply get better inference. Two changes can require a
small adjustment if you run a type checker (`mypy`, `pyright`):
### Stricter operand matching in `|`
The right-hand side of `|` is now typed `Runnable[Output, Other]` rather
than `Runnable[Any, Other]`, so the right operand's declared **input**
must match the left operand's **output**. This is more accurate, but it
surfaces a common pattern that was previously silent: piping a step that
outputs a plain `dict` into a step whose declared input is a more
specific type (for example a `TypedDict`). It still works at runtime;
the checker now reports an `[operator]` error.
If you hit this, narrow the boundary with a `cast` (or an explicit
annotation):
```python
from typing import Any, cast
from langchain_core.runnables import Runnable
# `upstream`, `downstream`, and `MyInput` are placeholders for your own code:
# upstream outputs a dict; downstream declares a narrower input type
chain = cast("Runnable[Any, MyInput]", upstream) | downstream
```
### `list` → `Sequence` on `RunnableEach` / `map()`
`Runnable.map()` and the `invoke` / `ainvoke` methods of `RunnableEach`
now accept `Sequence[Input]` instead of `list[Input]`. Callers are
unaffected — a `list` is a `Sequence`, and tuples or other sequences now
type-check too. The only thing to adjust: if you **subclass**
`RunnableEach` (or `RunnableEachBase`) and override these methods with a
`list[...]` parameter, widen the annotation to `Sequence[...]` so the
override stays compatible with the base signature.
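
The subclass adjustment can be sketched with toy classes. `EachBase` and `UpperEach` are hypothetical stand-ins, not the real `RunnableEach` hierarchy; the point is only that an override must accept at least what the base accepts, so narrowing the parameter to `list[...]` would be reported as an incompatible override.

```python
from collections.abc import Sequence
from typing import Generic, TypeVar

T = TypeVar("T")


class EachBase(Generic[T]):
    """Hypothetical base class whose invoke accepts any Sequence."""

    def invoke(self, inputs: Sequence[T]) -> list[T]:
        return list(inputs)


class UpperEach(EachBase[str]):
    # Declaring Sequence[str] (rather than list[str]) keeps this override
    # compatible with the base signature.
    def invoke(self, inputs: Sequence[str]) -> list[str]:
        return [item.upper() for item in inputs]
```

Both `UpperEach().invoke(["a", "b"])` and `UpperEach().invoke(("a", "b"))` type-check and run with this signature.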
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
`handle_tool_error` callables can now return structured message content
as any valid sequence, not just a mutable `list`. Valid structured
sequences are normalized to the `ToolMessage` content shape at the tool
output boundary, while invalid content still falls back to
stringification.
## Changes
- Widened `ToolExceptionHandlerOutput` from `list[str | dict[str, Any]]`
to `Sequence[MessageContentBlock]` so handlers returning `list[dict[str,
Any]]` or tuple content blocks type-check cleanly.
- Added `_normalize_message_content` to validate structured message
content and convert valid non-string sequences to the `list` shape
expected by `ToolMessage`.
- Preserved existing stringification behavior for invalid structured
content blocks instead of treating failed normalization as `None`.
- Removed the now-unused `_is_message_content_type` helper; output
formatting validates content directly through
`_normalize_message_content`.
`handle_tool_error` callables can already return structured message
content at runtime, but the public typing only allowed strings. The tool
error handling API now reflects the existing output formatting path,
including clearer docs for how handled errors become
`ToolMessage(status="error")` results.
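
The normalize-or-stringify behavior described above might look roughly like the sketch below. It is not the real `_normalize_message_content`: the function names are hypothetical, and the rule for what counts as a valid block (a string, or a dict with a string `"type"` key) is an assumption made for illustration.

```python
from collections.abc import Sequence
from typing import Any, Union


def normalize_content(content: Any) -> Union[str, list[Any], None]:
    """Return content in a message-friendly shape, or None if it is invalid."""
    if isinstance(content, str):
        return content
    if isinstance(content, Sequence) and not isinstance(content, (bytes, bytearray)):
        # Assumed validity rule: every block is a str or a dict with a str "type".
        valid = all(
            isinstance(block, str)
            or (isinstance(block, dict) and isinstance(block.get("type"), str))
            for block in content
        )
        if valid:
            return list(content)  # tuples and other sequences become lists
    return None


def format_error_content(content: Any) -> Union[str, list[Any]]:
    """Fall back to stringification when normalization fails."""
    normalized = normalize_content(content)
    return normalized if normalized is not None else str(content)
```

A handler that returns a tuple of content blocks would come out as a `list`, while something like `[1, 2]` would be stringified rather than dropped.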
`dict()` is a problematic method name as it clashes with the builtin
`dict` used as a type annotation.
This PR replaces it with an `asdict` method (inspired by dataclasses).
It also fixes a few places where `dict` must be replaced by
`builtins.dict` until the `dict()` method is removed.
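
A small sketch of the name clash, using a hypothetical `Record` class rather than any real langchain-core type: once a method named `dict` is defined, a bare `dict[...]` annotation later in the class body refers to that method, so annotations reach for `builtins.dict` instead.

```python
import builtins
from typing import Any


class Record:
    """Hypothetical class carrying both the legacy and the replacement method."""

    def __init__(self, name: str, size: int) -> None:
        self.name = name
        self.size = size

    def asdict(self) -> builtins.dict[str, Any]:
        return {"name": self.name, "size": self.size}

    def dict(self) -> builtins.dict[str, Any]:
        # Legacy name kept for compatibility; it shadows the builtin `dict`
        # for the remainder of the class body.
        return self.asdict()

    def merged(self, extra: builtins.dict[str, Any]) -> builtins.dict[str, Any]:
        # A bare `dict[str, Any]` here would name the method defined above.
        return {**self.asdict(), **extra}
```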
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Removes the built-in Bedrock class init validator from `load` so Bedrock
kwargs such as `base_url` and `endpoint_url` are no longer specially
rejected during deserialization.
This keeps provider-specific SSRF policy out of core; callers should
continue to avoid untrusted manifests or use restrictive
`allowed_objects`.
Verified with `make format`, `make lint`, and the focused serialization
load unit tests.
AI-assisted contribution by Open SWE.
---------
Co-authored-by: open-swe[bot] <215916821+open-swe[bot]@users.noreply.github.com>
The v3 streaming path drops `additional_kwargs` from per-chunk
`AIMessageChunk`s during assembly: `chunks_to_events` emits no event
field for them, and `ChatModelStream._assemble_message` constructs the
final `AIMessage` without an `additional_kwargs` argument. Non-streaming
`ainvoke` returns the provider message unchanged, so streaming and
non-streaming diverge for any provider that uses `additional_kwargs` to
carry data outside the typed protocol blocks.
## How this surfaces
The concrete failure mode is Gemini's
`__gemini_function_call_thought_signatures__` — a per-tool-call
signature blob the Google GenAI integration places in
`additional_kwargs`, keyed by `tool_call_id`. Gemini requires that
signature on follow-up turns to replay the prior thought trace; without
it, multi-turn streaming flows lose thought continuity (and may
regenerate thinking, charging additional reasoning tokens, or in some
cases refuse). Other providers that use `additional_kwargs` (e.g. older
`function_call` accumulators, custom routing metadata) hit the same gap;
the fix is intentionally provider-agnostic.
## Fix
The fix is provider-agnostic and touches two seams:
- `_compat_bridge` accumulates `msg.additional_kwargs` across chunks
with `merge_dicts` (matching `AIMessageChunk`'s own merge semantics for
fields that accumulate, like `function_call`) and emits the merged dict
on the `message-finish` event as an off-spec extension. The bridge
already uses one such extension (`metadata` on `MessageFinishData`);
this PR follows the same pattern for `additional_kwargs`.
- `ChatModelStream._finish` reads the new field; `_assemble_message`
threads it onto the final `AIMessage` only when non-empty, preserving
today's behavior of leaving `additional_kwargs` empty when no provider
data needs to ride on it.
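
The accumulate-then-attach flow in the two bullets above can be sketched as follows. This uses plain dicts in place of `AIMessageChunk` / `AIMessage`, and `merge_kwargs` is a simplified stand-in written for this example, not the real `merge_dicts`; the chunk shape is also an assumption.

```python
from typing import Any


def merge_kwargs(left: dict[str, Any], right: dict[str, Any]) -> dict[str, Any]:
    """Simplified recursive merge: dicts merge, strings concatenate."""
    merged = dict(left)
    for key, value in right.items():
        current = merged.get(key)
        if current is None:
            merged[key] = value
        elif isinstance(current, dict) and isinstance(value, dict):
            merged[key] = merge_kwargs(current, value)
        elif isinstance(current, str) and isinstance(value, str):
            merged[key] = current + value
        else:
            merged[key] = value
    return merged


def assemble(chunks: list[dict[str, Any]]) -> dict[str, Any]:
    """Build a final message dict from per-chunk dicts (hypothetical shape)."""
    text = ""
    extras: dict[str, Any] = {}
    for chunk in chunks:
        text += chunk.get("content", "")
        extras = merge_kwargs(extras, chunk.get("additional_kwargs", {}))
    message: dict[str, Any] = {"content": text}
    if extras:
        # Only attach the field when a provider actually sent something.
        message["additional_kwargs"] = extras
    return message
```

With two chunks that each carry one signature keyed by tool call id, the assembled message ends up with both signatures under the same key; with no provider data, the field is left off entirely.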
Closes #37420
---
`stream_events(version="v3")` (and the `astream_events` async twin)
silently dropped reasoning content from the final assembled `AIMessage`
whenever the same message also produced a tool_call. The bug reproduces
against Gemini 2.5 Pro with `include_thoughts=True`: reasoning streams
correctly through `ChatModelStream.reasoning`, but the persisted message
in the final graph state carries only the `tool_call` block.
## Root cause
`_iter_protocol_blocks` in the compat bridge groups per-chunk content
blocks by source-side identifier. When a provider doesn't supply an
`index` field on its content blocks — which the Google GenAI translator
does not for either `reasoning` or `tool_call` blocks — the bridge falls
back to positional `i` as the bucket key. Because Gemini typically emits
one block per chunk, every reasoning chunk and the later tool_call chunk
all key to `0`, and the type mismatch trips `_accumulate`'s
self-contained `else` branch. That branch clears accumulated reasoning
state and replaces it with the incoming tool_call, so reasoning never
reaches `content-block-finish`.
## Fix
When a block has no source-side `index`, key it by `("__lc_no_index__",
block_type, positional_i)` instead of bare `i`. Same-type chunks at the
same position still share a bucket and merge cleanly (streaming text and
reasoning unchanged); different-type chunks at the same position now
occupy distinct wire blocks and both reach `content-block-finish`.
Providers that supply explicit indices (Anthropic, OpenAI Responses) are
unaffected.
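
A minimal sketch of the keying rule, assuming content blocks are plain dicts with optional `index` and `type` fields. `bucket_key` and `group_blocks` are hypothetical helpers, not the bridge's real functions; only the `"__lc_no_index__"` sentinel and the tuple shape come from the description above.

```python
from collections.abc import Hashable
from typing import Any

NO_INDEX = "__lc_no_index__"


def bucket_key(block: dict[str, Any], position: int) -> Hashable:
    """Prefer the provider's index; otherwise key by (sentinel, type, position)."""
    index = block.get("index")
    if index is not None:
        return index
    return (NO_INDEX, block.get("type"), position)


def group_blocks(
    chunks: list[list[dict[str, Any]]],
) -> dict[Hashable, list[dict[str, Any]]]:
    """Group per-chunk blocks into buckets using bucket_key."""
    buckets: dict[Hashable, list[dict[str, Any]]] = {}
    for chunk in chunks:
        for position, block in enumerate(chunk):
            buckets.setdefault(bucket_key(block, position), []).append(block)
    return buckets
```

With one block per chunk, two reasoning chunks share a bucket while a later tool call at the same position gets its own, so neither overwrites the other.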
## Verification
Unit-tested at the compat-bridge layer for both sync
(`chunks_to_events`) and async (`achunks_to_events`) paths.
Verified live against Gemini 2.5 Pro (`gemini-2.5-pro`) with
`thinking_budget=2048`, `include_thoughts=True`, and a single
`get_weather` tool. Pre-fix:
`final_state.messages[tool_calling_ai_message].content == [{type:
tool_call, ...}]`. Post-fix: `[..., {type: reasoning, reasoning: "..."},
{type: tool_call, ...}]`, matching the shape `ainvoke` returns on the
same input.
`langchain_core._api.deprecation` previously did `from
pydantic.v1.fields import FieldInfo as FieldInfoV1` at module scope,
which triggers Pydantic's `UserWarning("Core Pydantic V1 functionality
isn't compatible with Python 3.14 or greater.")` on every
`langchain_core` import under 3.14+. The v1 symbol is only needed inside
one runtime branch of `@deprecated`, so it's now resolved lazily.
## Changes
- Replace the top-level v1 `FieldInfo` import with
`_is_pydantic_v1_field_info`, which probes
`sys.modules.get("pydantic.v1.fields")` instead of forcing the import.
The reconstruction inside `deprecated`'s `finalize` closure imports
`FieldInfoV1` lazily, gated by the predicate — so the warning only fires
if a caller has already loaded `pydantic.v1` themselves.
- Add a subprocess-based regression test asserting that importing
`langchain_core._api.deprecation` does not pull any `pydantic.v1*`
module into `sys.modules`. Verified to fail when the eager import is
reintroduced.
- Add a v1 `FieldInfo` decoration test — the v1 branch of `@deprecated`
previously had zero direct coverage.
- Update the stale `# Last Any should be FieldInfoV1 but this leads to
circular imports` comment on `T`'s bound, which no longer reflects the
real reason (it's about the 3.14 warning, not circularity).
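
The lazy probe can be sketched like this. The function name mirrors the predicate described above but is written from scratch here as an illustration, so treat the body as an assumption about the approach rather than the actual implementation.

```python
import sys
from typing import Any


def is_pydantic_v1_field_info(obj: Any) -> bool:
    """Check for a pydantic.v1 FieldInfo without importing pydantic.v1."""
    # Look the module up instead of importing it: if nobody has loaded
    # pydantic.v1.fields yet, obj cannot be an instance of its FieldInfo.
    module = sys.modules.get("pydantic.v1.fields")
    if module is None:
        return False
    field_info_cls = getattr(module, "FieldInfo", None)
    return isinstance(field_info_cls, type) and isinstance(obj, field_info_cls)
```

Because the lookup never triggers an import, calling the predicate cannot itself cause the Python 3.14 warning.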
Tool runs in `_TracerCore._create_tool_run` were discarding the
structured `inputs` dict that `BaseTool.run` passes to `on_tool_start`,
replacing it with `{"input": str(filtered_tool_input)}`. Consequently,
every multi-arg tool (e.g. ones in `deepagents` like `execute`,
`edit_file`, `write_file`, `grep`, ...) appeared in LangSmith with a
stringified, escaped dump of its arguments — multi-line bash commands
rendered with `\n` and were effectively unreadable. Chain runs already
preserved dicts via `_get_chain_inputs`; tool runs are now symmetric.
## Changes
- Preserve `inputs` when it is already a `dict` in the `original` /
`original+chat` branch of `_TracerCore._create_tool_run`, falling back
to `{"input": input_str}` only when no structured payload was provided
- Add regression tests in the sync and async base-tracer suites that
pass a structured `inputs` to `on_tool_start` and assert the dict
survives onto the resulting `Run`
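
The fallback described in the first bullet amounts to a small branch. `tool_run_inputs` below is a hypothetical helper that isolates that decision; it is not the body of `_create_tool_run`.

```python
from typing import Any, Optional


def tool_run_inputs(input_str: str, inputs: Optional[Any] = None) -> dict[str, Any]:
    """Keep a structured payload when present, else wrap the string form."""
    if isinstance(inputs, dict):
        return inputs
    return {"input": input_str}
```

A multi-argument tool call such as `tool_run_inputs("{'command': 'ls'}", {"command": "ls"})` keeps its fields, while a call without a structured payload still produces the legacy `{"input": ...}` shape.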
## Breaking change
Custom `BaseTracer` subclasses that parsed `Run.inputs["input"]` as a
stringified dict for tool runs will need to read the structured fields
directly. The shape now matches what `on_tool_start(inputs=...)` has
always received — introduced alongside `_schema_format` in the
`astream_events` work — and what `streaming_events` consumers already
see.
Drop the `NotImplementedError` branch in `warn_deprecated` so callers
can pass `pending=False` without specifying a `removal` version. The
previous behavior contradicted the docstring (which claimed an empty
default would auto-compute a removal version) — no such computation
existed; the function just raised a placeholder "Need to determine which
default deprecation schedule to use" error.
When a LangSmith `@traceable` function invokes a LangChain Runnable or
LangGraph subgraph, the callback manager's `_configure` function injects
the `@traceable` RunTree into the `LangChainTracer`'s `run_map` so that
child runs can resolve their parent for trace nesting. However, since
the RunTree was created outside the tracer's callback lifecycle,
`_end_trace` never removes it. The entry persists in `run_map`
indefinitely, retaining the full RunTree and its entire child tree.
In applications with nested subgraph invocations (e.g. an outer
investigation graph delegating to skill agent subgraphs, each compiled
as their own `StateGraph`), this causes RunTree objects to accumulate
linearly with every call.
**Fix:** Track which `run_map` entries were injected externally via a
shared `_external_run_ids` refcount dict on `_TracerCore`. When
`_start_trace` adds a child under an external parent, it increments the
count. When `_end_trace` finishes a child, it decrements — and evicts
the external parent from `run_map` once the last child completes.
The refcount (rather than a simple set) is necessary because a single
external parent may have multiple sibling children in the callback chain
(e.g. a `prompt | llm` `RunnableSequence`). Only truly external runs are
tracked — the `_configure` guard `if run_id_str not in handler.run_map`
prevents tracer-managed runs from being misclassified.
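
The refcounting scheme can be sketched with a toy tracer. `TinyTracer` and its method names are hypothetical and heavily simplified (runs are plain dicts, and injection is a single explicit call); it only illustrates why a count, rather than a set, is needed when sibling children overlap.

```python
from typing import Any, Optional


class TinyTracer:
    """Toy tracer illustrating refcounted eviction; not the real `_TracerCore`."""

    def __init__(self) -> None:
        self.run_map: dict[str, dict[str, Any]] = {}
        # External parent run id -> number of in-flight children.
        self.external_run_ids: dict[str, int] = {}

    def inject_external(self, run_id: str) -> None:
        # Only runs the tracer does not already manage count as external.
        if run_id not in self.run_map:
            self.run_map[run_id] = {"id": run_id, "parent": None}
            self.external_run_ids[run_id] = 0

    def start_trace(self, run_id: str, parent_id: Optional[str] = None) -> None:
        self.run_map[run_id] = {"id": run_id, "parent": parent_id}
        if parent_id in self.external_run_ids:
            self.external_run_ids[parent_id] += 1

    def end_trace(self, run_id: str) -> None:
        run = self.run_map.pop(run_id)
        parent_id = run["parent"]
        if parent_id in self.external_run_ids:
            self.external_run_ids[parent_id] -= 1
            if self.external_run_ids[parent_id] <= 0:
                # Last child finished: drop the external parent as well.
                del self.external_run_ids[parent_id]
                self.run_map.pop(parent_id, None)
```

If two children are started under the same injected parent, ending the first leaves the parent in `run_map`; ending the second evicts it. A plain set would have evicted the parent as soon as the first child finished.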
Resolve symlinks before validating file extensions in the deprecated
`save()` method on prompt classes.
Credit to Jeff Ponte (@JDP-Security) for reporting the symlink
resolution issue.
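
A minimal sketch of resolving before validating, assuming the check is a suffix allowlist. `resolve_save_path` and `ALLOWED_SUFFIXES` are hypothetical names, and the exact set of accepted extensions is an assumption; the point is that the suffix check runs on the resolved target rather than on the link name.

```python
from pathlib import Path
from typing import Union

# Assumed allowlist for illustration.
ALLOWED_SUFFIXES = {".json", ".yaml", ".yml"}


def resolve_save_path(file_path: Union[str, Path]) -> Path:
    """Resolve symlinks first, then validate the extension of the real target."""
    resolved = Path(file_path).resolve()
    if resolved.suffix not in ALLOWED_SUFFIXES:
        msg = f"{resolved} must have one of {sorted(ALLOWED_SUFFIXES)}"
        raise ValueError(msg)
    return resolved
```

With this ordering, a link named `prompt.json` that points at a file with another extension is rejected, because the check sees the target's name.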
Adds serialization mappings for `ChatBedrockConverse` and `BedrockLLM`
to unblock standard tests on `langchain-core>=1.2.5` (context:
[langchain-aws#821](https://github.com/langchain-ai/langchain-aws/pull/821)).
Also introduces a class-specific validator system in
`langchain_core.load` that blocks deserialization of AWS Bedrock models
when `endpoint_url` or `base_url` parameters are present, preventing
SSRF attacks via crafted serialized payloads.
Closes #34645
## Changes
- Add `ChatBedrockConverse` and `BedrockLLM` entries to
`SERIALIZABLE_MAPPING` in `mapping.py`, mapping legacy paths to their
`langchain_aws` import locations
- Add `validators.py` with `_bedrock_validator` — rejects
deserialization kwargs containing `endpoint_url` or `base_url` for all
Bedrock-related classes (`ChatBedrock`, `BedrockChat`,
`ChatBedrockConverse`, `ChatAnthropicBedrock`, `BedrockLLM`, `Bedrock`)
- `CLASS_INIT_VALIDATORS` registry covers both serialized (legacy) keys
and resolved import paths from `ALL_SERIALIZABLE_MAPPINGS`, preventing
bypass via direct-path payloads
- Move kwargs extraction and all validator checks
(`CLASS_INIT_VALIDATORS` + `init_validator`) in `Reviver.__call__` to
run **before** `importlib.import_module()` — fail fast on security
violations before executing third-party code
- Class-specific validators are independent of `init_validator` and
cannot be disabled by passing `init_validator=None`
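
The validate-before-import ordering can be sketched with a toy reviver. Everything named here (`CLASS_VALIDATORS`, `reject_endpoint_overrides`, `revive`, and the registered paths) is hypothetical and uses standard-library classes as stand-ins; it is not the code in `langchain_core.load`.

```python
import importlib
from typing import Any, Callable

ClassPath = tuple[str, ...]


def reject_endpoint_overrides(class_path: ClassPath, kwargs: dict[str, Any]) -> None:
    """Refuse kwargs that would redirect network traffic."""
    blocked = [key for key in ("endpoint_url", "base_url") if key in kwargs]
    if blocked:
        msg = f"{'.'.join(class_path)} may not be loaded with {blocked}"
        raise ValueError(msg)


# Hypothetical registry keyed by import path.
CLASS_VALIDATORS: dict[ClassPath, Callable[[ClassPath, dict[str, Any]], None]] = {
    ("types", "SimpleNamespace"): reject_endpoint_overrides,
    ("module_that_does_not_exist", "Model"): reject_endpoint_overrides,
}


def revive(class_path: ClassPath, kwargs: dict[str, Any]) -> Any:
    # Validate first, so a rejected payload never triggers an import.
    validator = CLASS_VALIDATORS.get(class_path)
    if validator is not None:
        validator(class_path, kwargs)
    module = importlib.import_module(".".join(class_path[:-1]))
    cls = getattr(module, class_path[-1])
    return cls(**kwargs)
```

Because validation runs first, a payload with `base_url` aimed at the non-importable path fails with `ValueError` rather than an import error, which shows the module was never touched.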
## Testing
- `test_validator_registry_keys_in_serializable_mapping` — structural
invariant test ensuring every `CLASS_INIT_VALIDATORS` key exists in
`ALL_SERIALIZABLE_MAPPINGS`
- 10 end-to-end `load()` tests covering all Bedrock class paths (legacy
aliases, resolved import paths, `ChatAnthropicBedrock`,
`init_validator=None` bypass attempt)
- Unit tests for `_bedrock_validator` covering `endpoint_url`,
`base_url`, both params, and safe kwargs
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
PR #35788 added 7 new fields to the `langchain-profiles` CLI output
(`name`, `status`, `release_date`, `last_updated`, `open_weights`,
`attachment`, `temperature`) but didn't update `ModelProfile` in
`langchain-core`. Partner packages like `langchain-aws` that set
`extra="forbid"` on their Pydantic models hit `extra_forbidden`
validation errors when Pydantic encountered undeclared TypedDict keys at
construction time. This adds the missing fields, makes `ModelProfile`
forward-compatible, provides a base-class hook so partners can stop
duplicating model-profile validator boilerplate, migrates all in-repo
partners to the new hook, and adds runtime + CI-time warnings for schema
drift.
## Changes
### `langchain-core`
- Add `__pydantic_config__ = ConfigDict(extra="allow")` to
`ModelProfile` so unknown profile keys pass Pydantic validation even on
models with `extra="forbid"` — forward-compatibility for when the CLI
schema evolves ahead of core
- Declare the 7 missing fields on `ModelProfile`: `name`, `status`,
`release_date`, `last_updated`, `open_weights` (metadata) and
`attachment`, `temperature` (capabilities)
- Add `_warn_unknown_profile_keys()` in `model_profile.py` — emits a
`UserWarning` when a profile dict contains keys not in `ModelProfile`,
suggesting a core upgrade. Wrapped in a bare `except` so introspection
failures never crash model construction
- Add `BaseChatModel._resolve_model_profile()` hook that returns `None`
by default. Partners can override this single method instead of
redefining the full `_set_model_profile` validator — the base validator
calls it automatically
- Add `BaseChatModel._check_profile_keys` as a separate
`model_validator` that calls `_warn_unknown_profile_keys`. Uses a
distinct method name so partner overrides of `_set_model_profile` don't
inadvertently suppress the check
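
The unknown-key warning can be sketched with a small `TypedDict`. `Profile` and `warn_unknown_profile_keys` are hypothetical stand-ins with an abbreviated field list, and the warning text is made up for this example.

```python
import warnings
from typing import Any, TypedDict


class Profile(TypedDict, total=False):
    """Abbreviated, hypothetical profile schema."""

    name: str
    max_input_tokens: int
    temperature: bool


def warn_unknown_profile_keys(profile: dict[str, Any]) -> None:
    """Warn about keys the schema does not declare; never raise."""
    try:
        declared = set(Profile.__annotations__)
        unknown = sorted(set(profile) - declared)
    except Exception:
        # Introspection problems must not break model construction.
        return
    if unknown:
        warnings.warn(
            f"Unknown profile keys {unknown}; a newer core release may declare them.",
            UserWarning,
            stacklevel=2,
        )
```

A profile containing only declared keys stays silent, while an extra key produces a single `UserWarning` that names it.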
### `langchain-profiles` CLI
- Add `_warn_undeclared_profile_keys()` to the CLI (`cli.py`), called
after merging augmentations in `refresh()` — warns at profile-generation
time (not just runtime) when emitted keys aren't declared in
`ModelProfile`. Gracefully skips if `langchain-core` isn't installed
- Add guard test
`test_model_data_to_profile_keys_subset_of_model_profile` in
model-profiles — feeds a fully-populated model dict to
`_model_data_to_profile()` and asserts every emitted key exists in
`ModelProfile.__annotations__`. CI fails before any release if someone
adds a CLI field without updating the TypedDict
### Partner packages
- Migrate all 10 in-repo partners to the `_resolve_model_profile()`
hook, replacing duplicated `@model_validator` / `_set_model_profile`
overrides: anthropic, deepseek, fireworks, groq, huggingface, mistralai,
openai (base + azure), openrouter, perplexity, xai
- Anthropic retains custom logic (context-1m beta → `max_input_tokens`
override); all others reduce to a one-liner
- Add `pr_lint.yml` scope for the new `model-profiles` package
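
The hook pattern the partners migrated to can be sketched with plain classes. `BaseChat`, `PartnerChat`, and `_PROFILES` are hypothetical, and the rule that an explicitly supplied profile wins over the hook is an assumption made for this example; the real base class uses a Pydantic `model_validator`.

```python
from typing import Any, Optional

# Hypothetical per-model profile table a partner package might ship.
_PROFILES: dict[str, dict[str, Any]] = {"example-model": {"max_input_tokens": 8192}}


class BaseChat:
    """Toy base class: the validator lives here and calls an overridable hook."""

    def __init__(self, model: str, profile: Optional[dict[str, Any]] = None) -> None:
        self.model = model
        self.profile = profile
        self._set_profile()

    def _resolve_profile(self) -> Optional[dict[str, Any]]:
        # Default hook: no profile known.
        return None

    def _set_profile(self) -> None:
        # Assumed rule: an explicitly supplied profile is left untouched.
        if self.profile is None:
            self.profile = self._resolve_profile()


class PartnerChat(BaseChat):
    # The partner overrides only the hook, not the validator.
    def _resolve_profile(self) -> Optional[dict[str, Any]]:
        return _PROFILES.get(self.model)
```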
Closes #29530
---
Remove a stale BlockBuster allowlist entry in `conftest.py` referencing
`aconfig_with_context` — the function and its containing module
(`langchain_core/beta/runnables/context.py`) were deleted in `fded6c6b1`
(Sep 2025, #32850). Spotted by @antonio-mello-ai in #29530.
Fixes missing `run.metadata.usage_metadata` population in
`LangChainTracer` for real LLM/chat traces following #34414.
- Fix extraction to read usage from serialized tracer message shape:
`outputs.generations[*][*].message.kwargs.usage_metadata`
- Remove handling of the non-serialized direct message shape
(`message.usage_metadata`) from the extractor to match the real tracer
output path
- Clarify tracer docstrings around chat callback naming
(`on_chat_model_start` + shared `on_llm_end`) to reduce ambiguity
## Why
#34414 started copying usage metadata into `run.metadata.usage_metadata`,
but the extractor read `message.usage_metadata`.
In the real tracer flow, messages are serialized with `dumpd(...)` during
run completion, so usage metadata lives under
`message.kwargs.usage_metadata`. Because of this mismatch, the copy never
happened in real traces.
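
Reading usage from the serialized shape might look like the sketch below. `extract_usage` is a hypothetical helper that returns the first usage dict it finds; whether the real extractor aggregates across generations is not stated here, so that choice is an assumption.

```python
from typing import Any, Optional


def extract_usage(outputs: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Walk outputs.generations[*][*].message.kwargs.usage_metadata."""
    for batch in outputs.get("generations") or []:
        for generation in batch:
            message = generation.get("message")
            if not isinstance(message, dict):
                continue
            usage = (message.get("kwargs") or {}).get("usage_metadata")
            if isinstance(usage, dict):
                return usage
    return None
```

Given a serialized message, the usage dict is found under `kwargs`; a message that carries `usage_metadata` at the top level (the non-serialized shape) is deliberately not matched.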