- Bumps `langchain` to **1.3.2** (patch)
- Raises minimum `langgraph` requirement from `>=1.2.1` to `>=1.2.2`

langgraph 1.2.2 fixes a race condition where DeltaChannel checkpoint
writes could serialize `BaseMessage` objects with `id=None` before
`apply_writes()` ran the reducer, causing the same message to appear
with a different ID on every `get_state()` call and across resumed
invocations (langchain-ai/langgraph#7913).

The lockfile will be updated once langgraph 1.2.2 is published to PyPI
(langchain-ai/langgraph#7914).
`PIIMiddleware` previously scrubbed detected PII only at the state level
via its `after_model` / `before_model` hooks. Consumers reading the live
stream — `astream_events(version="v3")` or `run.messages` /
`run.tool_calls` / `run.values` — saw the raw model text, the raw
tool-call args, the raw tool outputs, and the raw state snapshots until
the run finished and the canonical conversation history was written.
This change registers a stream transformer ahead of
`MessagesTransformer` that redacts every wire surface of an agent run.
The transformer holds a sliding lookback buffer (default 128 characters)
per `(run_id, content-block index)` so PII patterns that straddle delta
boundaries are caught before the safe prefix is released downstream.
Anything older than the lookback is run through the configured detector
and emitted; the trailing tail stays buffered until a later delta
extends it past the cap or the block finishes. `_finalize_block` always
re-runs detection over the full block snapshot so the finalized content
lands fully redacted even when the in-flight buffer never released a
tail (short responses, or PII arriving in the final delta).
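
The lookback partition described above can be illustrated with a small, self-contained sketch. This is not the `_PIIStreamTransformer` implementation: the `LookbackRedactor` class, the `EMAIL` pattern, and the `redact` helper are hypothetical stand-ins, and a single email regex takes the place of the configured detector.

```python
import re
from dataclasses import dataclass

# Hypothetical detector: one email regex standing in for the configured detector.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(text: str) -> str:
    return EMAIL.sub("[REDACTED_EMAIL]", text)


@dataclass
class LookbackRedactor:
    """Toy sliding-lookback buffer for one content block (illustrative only)."""

    lookback: int = 128
    _buf: str = ""

    def feed(self, delta: str) -> str:
        """Buffer a delta and return the redacted prefix that is safe to emit."""
        self._buf += delta
        if len(self._buf) <= self.lookback:
            return ""  # everything is still inside the lookback window
        redacted = redact(self._buf)
        cut = len(redacted) - self.lookback
        if cut <= 0:
            self._buf = redacted
            return ""
        # Release everything older than the lookback; keep the tail buffered.
        safe, self._buf = redacted[:cut], redacted[cut:]
        return safe

    def finish(self) -> str:
        """Flush the remaining tail through detection when the block ends."""
        out, self._buf = redact(self._buf), ""
        return out


# An email address split across three deltas is still caught.
redactor = LookbackRedactor(lookback=16)
deltas = ["contact me at al", "ice@exam", "ple.com for details please"]
chunks = [redactor.feed(d) for d in deltas] + [redactor.finish()]
streamed = "".join(chunks)
```

The trade-off in a design like this is latency versus coverage: a larger `lookback` catches longer straddling patterns but delays the released prefix by that many characters.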
The `block` strategy is now supported on the streaming path via a
buffering mode that withholds every delta until the block resolves —
clean blocks release the full text at finalize, PII-bearing blocks zero
the wire and let `after_model` / `apply_to_tool_results` raise
`PIIDetectionError` on the original state message. Activation is gated
on `apply_to_output=True`, matching the existing post-hoc semantics. The
middleware's transformer factory is cloned by `StreamMux._make_child`
into every subgraph scope, so attaching `PIIMiddleware` at the outer
agent also redacts streamed deltas from sub-agents invoked inside tools.
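
As a rough illustration of the buffering mode for `block`, the sketch below withholds every delta and decides only at finalize time. The `BlockModeBuffer` class and `has_email` helper are hypothetical names, not part of the middleware; raising `PIIDetectionError` is left to the state-level hooks, as described above, and is not modeled here.

```python
import re
from typing import Callable

# Hypothetical detector used only for this sketch.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def has_email(text: str) -> bool:
    return EMAIL.search(text) is not None


class BlockModeBuffer:
    """Withhold all deltas of a block until it resolves (illustrative only)."""

    def __init__(self, has_pii: Callable[[str], bool]) -> None:
        self._has_pii = has_pii
        self._parts: list[str] = []

    def feed(self, delta: str) -> str:
        self._parts.append(delta)
        return ""  # nothing reaches the wire while the block is open

    def finish(self) -> str:
        text = "".join(self._parts)
        self._parts.clear()
        # Clean blocks release the full text; PII-bearing blocks emit nothing.
        return "" if self._has_pii(text) else text


def stream(deltas: list[str]) -> list[str]:
    buf = BlockModeBuffer(has_email)
    return [buf.feed(d) for d in deltas] + [buf.finish()]


clean = stream(["all ", "good"])
dirty = stream(["mail bob@", "example.org"])
```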
## Tool-call and tools-channel coverage
The transformer covers every wire surface of an agent run, not just AI
message text:
- **Streamed AI text deltas** (`content-block-delta` of type
`text-delta`) — lookback machinery, redacted in place.
- **Streamed tool-call args** (`content-block-delta` with
`tool_call_chunk` / `server_tool_call_chunk` fields) — each delta
carries the full cumulative args string; detection runs on the field
directly and redacts in place. Verified empirically against
`_compat_bridge.py` and the consumer-side
`_merge_block_delta_into_store` snapshot-replace semantics.
- **Finalized tool-call blocks** (`content-block-finish` with
`tool_call` / `server_tool_call` / `invalid_tool_call`) — `args` dict
walked recursively and each string leaf redacted.
- **Tool execution events on the `tools` channel** —
`tool-started.input`, `tool-output-delta`, `tool-finished.output`,
`tool-error.message` all run through detection. String deltas use the
same lookback machinery as text-deltas keyed by `tool_call_id`;
structured payloads walk recursively.
- **State snapshots on the `values` channel** — message lists are walked
and each message's `.content` is redacted on a fresh copy. Graph state
itself stays intact for the state-level enforcer
(`apply_to_tool_results` via `before_model`) to act on independently.
- **Legacy `(BaseMessage, metadata)` payloads** on the `messages`
channel (Python 3.10 path, where `langgraph`'s `ASYNCIO_ACCEPTS_CONTEXT
= sys.version_info >= (3, 11)` falls back to a code path that doesn't
propagate the streaming callback into the chat model) — `.content` and
`AIMessage.tool_calls[*].args` are scrubbed. For `block`, the event's
`data` tuple is replaced with an empty-content copy so the original
message stays in state for `after_model` to raise on.
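
The "walk recursively and redact each string leaf" behavior mentioned for finalized tool-call `args` and structured tool payloads can be sketched as follows. This is a minimal illustration under assumptions: `redact_leaves` and the email-only `redact` helper are hypothetical, and the real walker also handles `BaseMessage` values, which this sketch omits.

```python
import re
from typing import Any, Callable

# Hypothetical detector used only for this sketch.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(text: str) -> str:
    return EMAIL.sub("[REDACTED_EMAIL]", text)


def redact_leaves(value: Any, redact_fn: Callable[[str], str]) -> Any:
    """Return a copy of `value` with every string leaf passed through `redact_fn`."""
    if isinstance(value, str):
        return redact_fn(value)
    if isinstance(value, dict):
        return {k: redact_leaves(v, redact_fn) for k, v in value.items()}
    if isinstance(value, list):
        return [redact_leaves(v, redact_fn) for v in value]
    if isinstance(value, tuple):
        return tuple(redact_leaves(v, redact_fn) for v in value)
    return value  # numbers, booleans, None, and other objects pass through


args = {"to": "alice@example.com", "cc": ["bob@example.org", 3], "meta": {"note": "hi"}}
clean_args = redact_leaves(args, redact)
```

Building fresh containers rather than mutating in place keeps the original payload untouched, which matches the stated goal of leaving graph state intact for the state-level enforcer.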
## Worth a careful look
- `_PIIStreamTransformer._mutate_text_delta` — lookback partition.
Anything older than `lookback` characters is released after redaction;
the tail stays buffered. Intended to be robust against whitespace-permissive
detectors (notably `credit_card`, whose regex matches across spaces).
- `_PIIStreamTransformer._mutate_tool_call_chunk_delta` — direct
in-place redaction of the cumulative args string. No buffer; the wire
shape is cumulative-snapshot, the consumer-side merge is
replace-not-append.
- `_PIIStreamTransformer._mutate_legacy_payload` — the dual path:
mutate-in-place for non-`block` (idempotent with `after_model`),
replace-with-empty-copy for `block` (keeps original in graph state for
`after_model` to raise on).
- `_PIIStreamTransformer._redact_value` — the recursive walker.
`BaseMessage` branch returns a fresh `.content`-redacted copy via
`model_copy(update=...)` — never mutates in place — so tool-output
payloads that wrap a `ToolMessage` and message lists in state snapshots
flow through cleanly.
- The new `transformers` attribute on `PIIMiddleware`: this is what
makes `create_agent` pick the factory up. Multiple `PIIMiddleware`
instances each register one transformer; ordering is preserved within
the `before_builtins` lane.
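
To make the cumulative-snapshot point concrete, here is a toy model of why no buffer is needed when each delta carries the full args string and the consumer replaces rather than appends. The `SSN` pattern, the `snapshots` list, and the `store` dict are hypothetical; the sketch shows only the replace semantics, and how partially arrived values are treated is outside its scope.

```python
import re

# Hypothetical detector used only for this sketch.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(text: str) -> str:
    return SSN.sub("[REDACTED_SSN]", text)


# Each delta is the full args string so far, not an increment.
snapshots = [
    '{"ssn": "123-',
    '{"ssn": "123-45-',
    '{"ssn": "123-45-6789"}',
]

store: dict[str, str] = {}
wire: list[str] = []
for snapshot in snapshots:
    cleaned = redact(snapshot)  # detection runs on the whole field each time
    wire.append(cleaned)
    store["args"] = cleaned  # consumer-side merge replaces, it does not append
```

Because the last snapshot supersedes the earlier ones, redacting each snapshot independently is enough for the consumer's stored value to end up redacted.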
## Compatibility
Bumps `langgraph` to `>=1.2.1` for the `before_builtins` opt-in on
`StreamTransformer`.
Bumps the `langchain-tests` minimum across the monorepo from `1.0.0` to
`1.1.9` and adds a partner-level `Makefile` so partner lockfiles can be
regenerated in one command, matching the existing convention under
`libs/`.
Dependabot has been stripping upper/lower bounds from internal
`langchain-*` deps in partner `pyproject.toml` files (e.g. #37288
reduced `langchain-core>=1.3.2,<2.0.0` to bare `langchain-core`). Locks
down the config so bumps preserve existing specifiers, and restores the
bounds it already mangled across the monorepo.
## Changes
- Add `versioning-strategy: increase` to every `uv` ecosystem block in
`.github/dependabot.yml` so future bumps move the lower bound in place
instead of rewriting the constraint.
- Ignore workspace-internal packages (`langchain-core`, `langchain`,
`langchain-classic`, `langchain-text-splitters`, `langchain-tests`,
`langchain-model-profiles`) on every `uv` block — these are editable
installs from local paths and their published constraints are
hand-curated for release, not Dependabot's to bump.
- Restore stripped bounds across all `libs/` packages — runtime
`dependencies` and every dep group (`test`, `dev`, `test_integration`,
`typing`, `lint`) — to `>=1.4.0,<2.0.0` for `langchain-core` and
`>=1.0.0,<2.0.0` for the other internal packages.
Re-enable the `[community]`, `[azure-ai]`, and `[cohere]` extras on
`langchain-classic`, and the `[cohere]` extra on `langchain` (v1). These
had been commented out as a temporary workaround during the `langchain`
-> `langchain-classic` rename so the renamed package could ship before
downstream partners were re-released against it. Now that
`langchain-community` 0.4.1, `langchain-cohere` 0.5.1, and
`langchain-azure-ai` 1.2.3 are published with the correct dependency
targets, the extras can be restored.
Bumps `langchain` from 1.2.16 → 1.2.17.
Picks up:
- `respond` decision added to HITL middleware (#37095)
> This PR was opened with AI-agent assistance.
## Description
Updates package metadata and README badges so LangChain social links
point to the new `@langchain_oss` X handle. This was completed with
AI-agent assistance.
## Test Plan
- [ ] Validate README badges and package metadata links point to
`https://x.com/langchain_oss`
_Opened collaboratively by Mason Daugherty and open-swe._
---------
Co-authored-by: open-swe[bot] <open-swe@users.noreply.github.com>
Co-authored-by: Mason Daugherty <61371264+mdrxy@users.noreply.github.com>
## Summary
Stop inlining the full agent state into every tool-dispatch `Send` in
`create_agent`. Dispatch with the bare list form `Send("tools",
[tool_call])` and let `ToolNode` hydrate `ToolRuntime.state` from graph
channels at tool-execution time.
**Depends on**
[langchain-ai/langgraph#7594](https://github.com/langchain-ai/langgraph/pull/7594)
which teaches `ToolNode` to read channel state via `CONFIG_KEY_READ`
when given a bare tool-call list. `uv.lock` pins that branch for CI
while the langgraph PR is in flight — this pin will be reverted to a
published `langgraph` version before merge.
## What was happening
Before this change, every pending tool call produced a `Send` whose
payload was:
```python
ToolCallWithContext(
    __type="tool_call_with_context",
    tool_call=tool_call,
    state=state,  # ← the FULL agent state dict, including messages list
)
```
For any agent that runs many turns, `state["messages"]` grows linearly
with the conversation. Every super-step that dispatches tools serializes
that whole list into every `Send`, and those Sends live forever in the
checkpointer's `__pregel_tasks` writes. The result is **O(N²)
`__pregel_tasks` storage** across a run.
## What changed
- `libs/langchain_v1/langchain/agents/factory.py`:
- `_make_model_to_tools_edge` now returns `Send("tools", [tool_call])` —
no inlined state.
- Drops the `ToolCallWithContext` import.
- `libs/langchain_v1/pyproject.toml` + `libs/langchain_v1/uv.lock`:
- Temporary `[tool.uv.sources]` pin on `langgraph`,
`langgraph-prebuilt`, `langgraph-checkpoint` to the companion PR branch
so CI exercises both changes end-to-end. Revert after langgraph release.
## Why it's safe
- Same snapshot semantics as before. `Send` is emitted at the end of the
model super-step and consumed at the start of the tools super-step;
channels by that point reflect every write from the model super-step
(including the new AIMessage). Parallel tool tasks all see the same
values since sibling writes don't land until end-of-super-step.
- Legacy `ToolCallWithContext` input path is preserved in `ToolNode` —
no-op for any external caller still constructing it by hand.
## Test plan
- [x] `tests/unit_tests/agents/` — **738 passed, 2 skipped, 1 xfailed**
- [x] `ruff check .` / `ruff format .` — clean
- [x] `mypy langchain/agents/factory.py` — clean
- [x] Before/after benchmark (below)
## Benchmark
Script runs `create_agent` with a mock `GenericFakeChatModel` and two
tools (`write_file`, `edit_file`). Each of the N turns dispatches 2 tool
calls. After the run, the `InMemorySaver` is inspected for bytes stored
under `__pregel_tasks` — the channel that carries the tool-dispatch
`Send` payloads.
| N | TASKS before | TASKS after | ratio |
|---:|---:|---:|---:|
| 5 | 87.6 KB | **4.7 KB** | **18.6× smaller** |
| 10 | 335 KB | **9.4 KB** | **35.7× smaller** |
| 25 | 2.05 MB | **23.7 KB** | **86.5× smaller** |
| 50 | 8.14 MB | **47.6 KB** | **171× smaller** |
| 100 | 32.5 MB | **95.3 KB** | **341× smaller** |
| 200 | 130 MB | **192 KB** | **677× smaller** |
| 500 | 815 MB | **482 KB** | **1,691× smaller** |
**Growth shape:**
- **Before:** per-Send bytes scale with current `messages` length (full
state is inlined), so total TASKS across N turns = Σ(2 × k) for k=1..N ≈
O(N²).
- **After:** per-Send bytes are constant — just the `tool_call` dict.
Total TASKS is O(#dispatches), completely independent of conversation
length. In this bench with ~2 dispatches/turn: **940–964 bytes per turn
across N=5..500, essentially flat.**
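
The growth shapes above can be reproduced with a back-of-the-envelope model. All byte sizes below are made-up assumptions chosen for illustration, not measurements from the benchmark; the function names are hypothetical as well.

```python
def tasks_bytes_inlined(
    turns: int,
    sends_per_turn: int = 2,
    call_bytes: int = 470,        # assumed size of one tool_call dict
    messages_per_turn: int = 3,   # assumed messages appended per turn
    bytes_per_message: int = 100, # assumed average serialized message size
) -> int:
    """Total bytes when every Send inlines the full state (old behavior)."""
    total = 0
    for k in range(1, turns + 1):
        state_bytes = k * messages_per_turn * bytes_per_message
        total += sends_per_turn * (call_bytes + state_bytes)
    return total


def tasks_bytes_bare(turns: int, sends_per_turn: int = 2, call_bytes: int = 470) -> int:
    """Total bytes when each Send carries only the tool_call (new behavior)."""
    return turns * sends_per_turn * call_bytes


# Doubling the number of turns roughly quadruples the inlined total
# but only doubles the bare total.
inlined_growth = tasks_bytes_inlined(200) / tasks_bytes_inlined(100)
bare_growth = tasks_bytes_bare(200) / tasks_bytes_bare(100)
```

Under these assumptions the inlined total behaves like a sum of 1..N (quadratic), while the bare total is linear in the number of dispatches.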
An agent that makes 100 tool calls in a single turn pays the same TASKS
cost as one that makes 100 across 50 turns — which is the semantically
correct behavior.
Note: the `messages` channel is unchanged by this PR — it's still the
dominant storage term (growing O(N²) via `add_messages`). TASKS was a
second, compounding cost sitting on top of it; at N=100 it added 40% on
top of `messages`, at N=500 it added 67%. After the fix, TASKS is a
rounding error regardless of N.
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CVE-2025-71176 (medium severity)
All are dev-only (test dependency group) — no impact on published
packages.
### Why syrupy was also bumped
syrupy 4.x (`<5.0.0`) constrains pytest to `<9.0.0`, blocking the CVE
fix. Widening to `<6.0.0` allows syrupy 5.x which supports pytest 9.x.
## Summary
Bumps `pygments` to `>=2.20.0` across all 21 affected packages to
address [CVE-2026-4539](https://github.com/advisories/GHSA-XXXX) — ReDoS
via inefficient GUID regex in Pygments.
- **Severity:** Low
- **Fixed in:** 2.20.0 (was 2.19.2)
- **Change:** Added `pygments>=2.20.0` to `constraint-dependencies` in
`[tool.uv]` for each package, then ran `uv lock --upgrade-package
pygments` to regenerate lock files.
Closes Dependabot alerts #3435–#3455.
## Release Note
Patch deps
### Test Plan
- [x] CI Green 🙏
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Alternative to https://github.com/langchain-ai/langchain/pull/35024.
Paving the way for summarization in `wrap_model_call` (which requires
state updates).
---
Add `ExtendedModelResponse` dataclass that allows `wrap_model_call`
middleware to return a `Command` alongside the model response for
additional state updates.
```py
@dataclass
class ExtendedModelResponse(Generic[ResponseT]):
    model_response: ModelResponse[ResponseT]
    command: Command
```
## Motivation
Previously, `wrap_model_call` middleware could only return a
`ModelResponse` or `AIMessage` — there was no way to inject additional
state updates (e.g. custom state fields) from the model call middleware
layer. `ExtendedModelResponse` fills this gap by accepting an optional
`Command`.
This feature is needed by the summarization middleware, which needs to
track summarization trigger points calculated during `wrap_model_call`.
## Why `Command` instead of a plain `state_update` dict?
We chose `Command` rather than the raw `state_update: dict` approach
from the earlier iteration because `Command` is the established
LangGraph primitive for state updates from nodes. Using `Command` means:
- State updates flow through the graph's reducers (e.g. `add_messages`)
rather than being merged as raw dicts. This makes messages updates
additive alongside the model response instead of replacing them.
- Consistency with `wrap_tool_call`, which already returns `Command`.
- Future-proof: as `Command` gains new capabilities (e.g. `goto`,
`send`), middleware can leverage them without API changes.
## Why keep `model_response` separate instead of using `Command` directly?
The model node needs to distinguish the model's actual response
(messages + structured output) from supplementary middleware state
updates. If middleware returned only a `Command`, there would be no
clean way to extract the `ModelResponse` for structured output handling,
response validation, and the core model-to-tools routing logic. Keeping
`model_response` explicit preserves a clear boundary between "what the
model said" and "what middleware wants to update."
Also, to avoid a breaking change, the `handler` passed to
`wrap_model_call` must always return a `ModelResponse`. There's no
easy way to preserve this if we fold the response into a `Command`.
One nice thing about having this `ExtendedModelResponse` structure is
that it's extensible if we want to add more metadata in the future.
## Composition
When multiple middleware layers return `ExtendedModelResponse`, their
commands compose naturally:
- **Inner commands propagate outward:** At composition boundaries,
`ExtendedModelResponse` is unwrapped to its underlying `ModelResponse`
so outer middleware always sees a plain `ModelResponse` from
`handler()`. The inner command is captured and accumulated.
- **Commands are applied through reducers:** Each `Command` becomes a
separate state update applied through the graph's reducers. For
messages, this means they're additive (via `add_messages`), not
replacing.
- **Outer wins on conflicts:** For non-reducer state fields, commands
are applied inner-first then outer, so the outermost middleware's value
takes precedence on conflicting keys.
- **Retry-safe:** When outer middleware retries by calling `handler()`
again, accumulated inner commands are cleared and re-collected from the
fresh call.
```python
from langchain.agents.middleware.types import AgentMiddleware, ExtendedModelResponse
from langgraph.types import Command


class Outer(AgentMiddleware):
    def wrap_model_call(self, request, handler):
        response = handler(request)  # sees ModelResponse, not ExtendedModelResponse
        return ExtendedModelResponse(
            model_response=response,
            command=Command(update={"outer_key": "val"}),
        )


class Inner(AgentMiddleware):
    def wrap_model_call(self, request, handler):
        response = handler(request)
        return ExtendedModelResponse(
            model_response=response,
            command=Command(update={"inner_key": "val"}),
        )


# Final state merges both commands: {"inner_key": "val", "outer_key": "val"}
```
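
The ordering rules (additive through reducers, outer wins on plain keys) can also be shown without LangGraph at all. The following is a simplified model using plain dicts: `apply_updates` is a hypothetical helper, and `operator.add` on lists stands in for the `add_messages` reducer.

```python
from operator import add
from typing import Any, Callable

Reducer = Callable[[Any, Any], Any]


def apply_updates(
    state: dict[str, Any],
    updates: list[dict[str, Any]],
    reducers: dict[str, Reducer],
) -> dict[str, Any]:
    """Apply updates in order; reducer keys accumulate, other keys are overwritten."""
    result = dict(state)
    for update in updates:  # caller passes inner-first, then outer
        for key, value in update.items():
            if key in reducers and key in result:
                result[key] = reducers[key](result[key], value)
            else:
                result[key] = value  # later (outer) update wins
    return result


state = {"messages": ["model reply"]}
inner = {"messages": ["inner note"], "mode": "inner"}
outer = {"messages": ["outer note"], "mode": "outer"}

final = apply_updates(state, [inner, outer], {"messages": add})
```

Here `messages` ends up containing all three entries, while `mode` takes the outer middleware's value.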
## Backwards compatibility
Fully backwards compatible. The `ModelCallResult` type alias is widened
from `ModelResponse | AIMessage` to `ModelResponse | AIMessage |
ExtendedModelResponse`, but existing middleware returning
`ModelResponse` or `AIMessage` continues to work identically.
## Internals
- `model_node` / `amodel_node` now return `list[Command]` instead of
`dict[str, Any]`
- `_build_commands` converts the model response + accumulated middleware
commands into a list of `Command` objects for LangGraph
- `_ComposedExtendedModelResponse` is the internal type that accumulates
commands across layers during composition
Closes https://github.com/langchain-ai/langchain/issues/33956
* Making `ModelRequest` generic on `ContextT` and `ResponseT` so that we
can thread type information through to `wrap_model_call`
* Making builtin middlewares generic on `ContextT` and `ResponseT` so
their context and response types can be inferred from the `create_agent`
signature
See new tests that verify backwards compatibility (for cases where folks
use custom middleware that wasn't parametrized).
This fixes:
1. Lack of access to context and response types in `wrap_model_call`
2. Lack of cohesion between middleware context + response types with
those specified in `create_agent`
See examples below:
### Type-safe context and response access
```python
class MyMiddleware(AgentMiddleware[AgentState[AnalysisResult], UserContext, AnalysisResult]):
    def wrap_model_call(
        self,
        request: ModelRequest[UserContext],
        handler: Callable[[ModelRequest[UserContext]], ModelResponse[AnalysisResult]],
    ) -> ModelResponse[AnalysisResult]:
        # ✅ Now type-safe: IDE knows user_id exists and is str
        user_id: str = request.runtime.context["user_id"]

        # ❌ mypy error: "session_id" doesn't exist on UserContext
        request.runtime.context["session_id"]

        response = handler(request)

        if response.structured_response is not None:
            # ✅ Now type-safe: IDE knows sentiment exists and is str
            sentiment: str = response.structured_response.sentiment

            # ❌ mypy error: "summary" doesn't exist on AnalysisResult
            response.structured_response.summary

        return response
```
### Mismatched middleware/schema caught at `create_agent`
```python
class SessionMiddleware(AgentMiddleware[AgentState[Any], SessionContext, Any]):
    ...


# ❌ mypy error: SessionMiddleware expects SessionContext, not UserContext
create_agent(
    model=model,
    middleware=[SessionMiddleware()],
    context_schema=UserContext,  # mismatch!
)


class AnalysisMiddleware(AgentMiddleware[AgentState[AnalysisResult], ContextT, AnalysisResult]):
    ...


# ❌ mypy error: AnalysisMiddleware expects AnalysisResult, not SummaryResult
create_agent(
    model=model,
    middleware=[AnalysisMiddleware()],
    response_format=SummaryResult,  # mismatch!
)
```
Depends on https://github.com/langchain-ai/langgraph/pull/6711
1. Relax the constraint in `factory.py` to allow tools that are not
pre-registered in the `ModelRequest.tools` list
2. Always add the tool node if `wrap_tool_call` or `awrap_tool_call` is
implemented
3. Add tests confirming that new tools can be registered at runtime in
`wrap_model_call` and executed via `wrap_tool_call`
This allows the following pattern:
```py
from langchain_core.messages import HumanMessage
from langchain_core.tools import tool

from langchain.agents import create_agent
from langchain.agents.middleware.types import (
    AgentMiddleware,
    ModelRequest,
    ToolCallRequest,
)


@tool
def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    return f"The weather in {location} is sunny and 72°F."


@tool
def calculate_tip(bill_amount: float, tip_percentage: float = 20.0) -> str:
    """Calculate the tip amount for a bill."""
    tip = bill_amount * (tip_percentage / 100)
    return f"Tip: ${tip:.2f}, Total: ${bill_amount + tip:.2f}"


class DynamicToolMiddleware(AgentMiddleware):
    """Middleware that adds and handles a dynamic tool."""

    def wrap_model_call(self, request: ModelRequest, handler):
        # Expose calculate_tip to the model even though it was not
        # registered with create_agent.
        updated = request.override(tools=[*request.tools, calculate_tip])
        return handler(updated)

    def wrap_tool_call(self, request: ToolCallRequest, handler):
        # Supply the implementation for the dynamically added tool.
        if request.tool_call["name"] == "calculate_tip":
            return handler(request.override(tool=calculate_tip))
        return handler(request)


agent = create_agent(
    model="openai:gpt-4o-mini",
    tools=[get_weather],
    middleware=[DynamicToolMiddleware()],
)
result = agent.invoke({
    "messages": [HumanMessage("What's the weather in NYC? Also calculate a 20% tip on a $85 bill")]
})
for msg in result["messages"]:
    msg.pretty_print()
```