Adds an optional `when` predicate to `InterruptOnConfig`, allowing
dynamic per-tool-call control over whether a HITL interrupt fires.
---
**`when` predicate in `InterruptOnConfig`**
```python
class InterruptOnConfig(TypedDict):
    allowed_decisions: list[DecisionType]
    description: NotRequired[str | _DescriptionFactory]
    args_schema: NotRequired[dict[str, Any]]
    when: NotRequired[Callable[[ToolCallRequest], bool]]  # new
```
When provided, `when` is called before a tool call is added to the batch
interrupt. If it returns `False`, the call is auto-approved and
excluded from the interrupt. If it returns `True` (or `when` is absent),
existing behaviour is unchanged.
The predicate receives a `ToolCallRequest` with:
- `tool_call` — the raw tool call dict (name, args, id)
- `tool` — `None` (no `BaseTool` instance is available at the
`after_model` stage)
- `state` — current agent state
- `runtime` — a `ToolRuntime` constructed from the node-level `Runtime`,
with `tool_call_id` populated
Example:
```python
HumanInTheLoopMiddleware(
    interrupt_on={
        "delete_file": InterruptOnConfig(
            allowed_decisions=["approve", "reject"],
            # Interrupt only for paths under /etc; other calls are auto-approved.
            when=lambda req: req.tool_call["args"].get("path", "").startswith("/etc"),
        )
    }
)
```
This change is fully backwards-compatible — `when` is `NotRequired` and
existing configs without it behave identically.
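The filtering step described above can be sketched in plain Python. This is a minimal illustration, not the middleware's actual code: `partition_tool_calls` is a hypothetical helper, the predicate receives the raw tool call dict rather than a full `ToolCallRequest`, and a value of `None` stands in for a config that has no `when` key.

```python
from typing import Any, Callable, Optional

# Hypothetical stand-ins: the real predicate receives a ToolCallRequest;
# here only the raw tool call dict is modeled.
ToolCall = dict[str, Any]
Predicate = Callable[[ToolCall], bool]


def partition_tool_calls(
    tool_calls: list[ToolCall],
    interrupt_on: dict[str, Optional[Predicate]],
) -> tuple[list[ToolCall], list[ToolCall]]:
    """Split tool calls into (needs_review, auto_approved)."""
    needs_review: list[ToolCall] = []
    auto_approved: list[ToolCall] = []
    for call in tool_calls:
        if call["name"] not in interrupt_on:
            # Tool is not configured for human review at all.
            auto_approved.append(call)
            continue
        when = interrupt_on[call["name"]]
        # No predicate, or predicate returns True: existing behaviour.
        if when is None or when(call):
            needs_review.append(call)
        else:
            auto_approved.append(call)
    return needs_review, auto_approved


calls = [
    {"name": "delete_file", "args": {"path": "/etc/passwd"}, "id": "1"},
    {"name": "delete_file", "args": {"path": "/tmp/scratch"}, "id": "2"},
]
config = {"delete_file": lambda call: call["args"].get("path", "").startswith("/etc")}
needs_review, auto_approved = partition_tool_calls(calls, config)
```

In this sketch only the `/etc/passwd` call would reach the interrupt batch; the `/tmp/scratch` call is approved without review.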
> This PR was developed with AI-agent assistance.
Automated refresh of model profile data for all in-monorepo partner
integrations via `langchain-profiles refresh`.
🤖 Generated by the `refresh_model_profiles` workflow.
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Bumps the uv pin in `.github/actions/uv_setup/action.yml` from `0.11.15`
to [`0.11.17`](https://github.com/astral-sh/uv/releases/tag/0.11.17).
Opened automatically by `bump_uv_pin.yml`. Mirror availability on
`releases.astral.sh` was verified before this PR was created, so CI
should not race the fallback.
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Local pre-commit already verified `pyproject.toml`/version-file sync for
`core` and `langchain_v1`, but not `anthropic` — even though the
`check_core_versions.yml` CI workflow already gates all three. This
wires the existing `anthropic` check into pre-commit so the mismatch
gets caught before commit rather than only in CI.
Anthropic's API rejects `tool_use`/`tool_result` IDs that don't match
`^[a-zA-Z0-9_-]+$` with a 400. When a conversation thread is replayed
across providers — e.g. a user switches a running thread from Kimi (via
Fireworks) to Claude — the prior turns carry tool-call IDs minted by the
other provider (Kimi emits `functions.write_todos:0`, whose `.` and `:`
are invalid), and the request fails. Tool-call IDs are now normalized to
an Anthropic-compatible form during request formatting, with the
original `tool_use.id` and its paired `tool_use_id` mapped identically
so they stay linked.
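One way to picture the normalization is a deterministic character substitution applied to both sides of the pairing. The sketch below is an assumption about the approach, not the integration's actual code: `normalize_tool_call_id` and `normalize_messages` are hypothetical names, and replacing invalid characters with `_` is just one possible mapping (it could in principle collide for IDs that differ only in punctuation).

```python
import re

# Anything outside the character set Anthropic accepts for tool-call IDs.
_INVALID = re.compile(r"[^a-zA-Z0-9_-]")


def normalize_tool_call_id(tool_call_id: str) -> str:
    """Replace every character outside [a-zA-Z0-9_-] with an underscore."""
    return _INVALID.sub("_", tool_call_id)


def normalize_messages(messages: list[dict]) -> list[dict]:
    """Apply the same mapping to tool_use.id and tool_result.tool_use_id."""
    out = []
    for message in messages:
        blocks = []
        for block in message["content"]:
            block = dict(block)  # shallow copy; the input is left untouched
            if block.get("type") == "tool_use":
                block["id"] = normalize_tool_call_id(block["id"])
            elif block.get("type") == "tool_result":
                block["tool_use_id"] = normalize_tool_call_id(block["tool_use_id"])
            blocks.append(block)
        out.append({**message, "content": blocks})
    return out
```

Because the mapping is a pure function of the original ID, the `tool_use` block and its `tool_result` end up with the same normalized value without any shared lookup table.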
## Description
Bumps `langchain-perplexity` to require the Perplexity SDK release with
fixed Responses streaming and removes the temporary SSE shim workaround.
## Release Note
`langchain-perplexity` now requires `perplexityai>=0.34.1` for Responses
API streaming.
## Test Plan
- [x] `NO_COLOR=1 uv run --group test pytest
tests/unit_tests/test_chat_models_responses.py --disable-socket
--allow-unix-socket`
_Opened collaboratively by Mason Daugherty and open-swe._
---------
Co-authored-by: open-swe[bot] <open-swe@users.noreply.github.com>
Co-authored-by: Mason Daugherty <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Ports langchain-ai/deepagents#3626. Adds a workflow that clears the
`waiting-on-author` label from an issue or PR as soon as the original
author posts a follow-up comment, closing the manual loop where
maintainers had to remember to strip the label after a reply landed.
`_sanitize_chat_completions_content` now filters every content-part dict
against an allowlist derived at import time from the `fireworks-ai`
SDK's `ContentUnionMember1` TypedDict, and runs on every message role —
not just `ToolMessage`. Fixes 400s of the form `Extra inputs are not
permitted, field:
'messages[N].content.list[ChatMessageContent][i].<key>'` when
cross-provider history (e.g. an Anthropic-shaped `AIMessage` carrying
the v1 streaming-reassembly `index` marker) is forwarded to a
Fireworks-hosted model.
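The allowlist idea can be illustrated with a small, self-contained sketch. Here `TextPart` is a hypothetical stand-in for the SDK's content-part TypedDict, and `sanitize_content` is an illustrative name; the real helper may handle several part types and derive its allowlist differently.

```python
from typing import Any, TypedDict


# Hypothetical stand-in for the SDK's content-part TypedDict.
class TextPart(TypedDict, total=False):
    type: str
    text: str


# Derived once at import time from the TypedDict's declared keys.
ALLOWED_KEYS = frozenset(TextPart.__annotations__)


def sanitize_content(content: Any) -> Any:
    """Drop keys the API does not accept from every content-part dict."""
    if not isinstance(content, list):
        return content  # plain string content passes through unchanged
    return [
        {k: v for k, v in part.items() if k in ALLOWED_KEYS}
        if isinstance(part, dict)
        else part
        for part in content
    ]
```

With this shape, a part such as `{"type": "text", "text": "hi", "index": 0}` loses only the `index` key before the request is sent.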
Closes #37360
Adds a `use_responses_api` flag to `ChatPerplexity` so requests can be
routed through Perplexity's Agent API (the Perplexity-flavored Responses
API) in addition to the existing Chat Completions endpoint. This mirrors
the `use_responses_api` flag on `ChatOpenAI`.
## Motivation
Perplexity exposes two HTTP surfaces from the same SDK client object:
`client.chat.completions.create()` (Chat Completions) and
`client.responses.create()` (Agent API, OpenAI-compatible Responses
shape). The Agent API supports built-in tools (`web_search`,
`fetch_url`, `finance_search`, `people_search`), `instructions`,
`input`, `previous_response_id`, and `include` — none of which exist on
Chat Completions. Today `ChatPerplexity` only calls Chat Completions, so
users who want the Agent API have to drop down to the raw SDK.
## What this changes
- New field `use_responses_api: bool | None = None` on `ChatPerplexity`.
- New module-level helper `_use_responses_api(payload)` that returns
`True` when the payload contains a built-in tool (any `tools[*]` whose
`type` is not `"function"`) or any of the Responses-only fields
`previous_response_id`, `instructions`, `input`, `include`.
- New instance method `ChatPerplexity._use_responses_api(payload)` that
honors `self.use_responses_api` when it is a `bool`, otherwise delegates
to the module helper.
- New converters `_convert_responses_to_chat_result(response)` and
`_convert_responses_stream_event_to_chunk(event)` that translate Agent
API objects/events into `AIMessage` and `AIMessageChunk` (preserving
`usage_metadata`, `response_metadata`, citations, images, related
questions, search results, and `function_call` tool calls).
- A surgical `_to_responses_payload(...)` helper that renames `messages`
→ `input` and `max_tokens` → `max_output_tokens`, passes through
Responses-supported fields, and parks anything Perplexity-specific under
`extra_body`.
- Each of the four API call sites (`_stream`, `_astream`, `_generate`,
`_agenerate`) now branches on `self._use_responses_api(payload)`. The
Chat Completions path is untouched.
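The payload translation in the list above might look roughly like the following. This is a sketch under stated assumptions: `to_responses_payload` is an illustrative name, and the `_RESPONSES_FIELDS` pass-through set is a guess for demonstration purposes, not the list the integration actually uses.

```python
from typing import Any

# Assumed pass-through set; the real list of Responses-supported fields may differ.
_RESPONSES_FIELDS = {
    "model", "input", "instructions", "tools", "include",
    "previous_response_id", "max_output_tokens", "stream",
}
# Renames named in the description above.
_RENAMES = {"messages": "input", "max_tokens": "max_output_tokens"}


def to_responses_payload(payload: dict[str, Any]) -> dict[str, Any]:
    """Rename Chat Completions keys and park unknown keys under extra_body."""
    out: dict[str, Any] = {}
    extra: dict[str, Any] = {}
    for key, value in payload.items():
        key = _RENAMES.get(key, key)
        if key in _RESPONSES_FIELDS:
            out[key] = value
        else:
            extra[key] = value  # treated as provider-specific
    if extra:
        out["extra_body"] = extra
    return out
```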
## Auto-detection rules
When `use_responses_api` is unset (the default), routing is decided per
call from the outgoing payload:
- Has a built-in tool? → Responses
- Has `previous_response_id`, `instructions`, `input`, or `include`? →
Responses
- Otherwise → Chat Completions
Explicit `use_responses_api=True` or `=False` always overrides
auto-detection.
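The routing rules can be expressed as a short sketch. The function names below are illustrative rather than the module's actual helpers, and treating a field as "present" whenever its key appears in the payload is an assumption.

```python
from typing import Any, Optional

_RESPONSES_ONLY_FIELDS = ("previous_response_id", "instructions", "input", "include")


def detect_responses_api(payload: dict[str, Any]) -> bool:
    """Return True when the payload needs the Responses-style endpoint."""
    tools = payload.get("tools") or []
    # A built-in tool is any tool whose type is not "function".
    if any(tool.get("type") != "function" for tool in tools):
        return True
    return any(field in payload for field in _RESPONSES_ONLY_FIELDS)


def should_use_responses_api(
    payload: dict[str, Any], override: Optional[bool] = None
) -> bool:
    """An explicit True/False wins; None falls back to auto-detection."""
    if isinstance(override, bool):
        return override
    return detect_responses_api(payload)
```

Keeping the override check separate from detection means the per-call payload is only inspected when the user has not made an explicit choice.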
## Backwards compatibility
Existing usage is unchanged.
`ChatPerplexity(model="sonar").invoke("hi")` still calls
`client.chat.completions.create()`. No public field was renamed or
removed; the new field is purely additive.
## Tests
Adds `tests/unit_tests/test_chat_models_responses.py` covering the
helper, auto-detect routing, explicit overrides in both directions,
response-to-`AIMessage` conversion (content, `usage_metadata`,
`response_metadata.id`), `function_call` → `tool_calls` conversion, and
sync + async streaming of `response.output_text.delta` and
`response.completed` events. All mocks use `MagicMock`/`AsyncMock`; no
network calls.
## Notes for reviewers
This was implemented with help from an AI agent. The shape mirrors
`langchain-openai`'s `use_responses_api` — same field name, same helper
name, same docstring style — so the diff should be familiar.
Closes nothing — net new feature.
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
- Bumps `langchain` to **1.3.2** (patch)
- Raises minimum `langgraph` requirement from `>=1.2.1` to `>=1.2.2`
langgraph 1.2.2 fixes a race condition where DeltaChannel checkpoint
writes could serialize `BaseMessage` objects with `id=None` before
`apply_writes()` ran the reducer, causing the same message to appear
with a different ID on every `get_state()` call and across resumed
invocations (langchain-ai/langgraph#7913).
The lockfile will be updated once langgraph 1.2.2 is published to PyPI
(langchain-ai/langgraph#7914).
Refresh `langchain-core`'s lockfile so the dev/CI environment resolves
`uuid-utils` to a release that ships free-threading wheels (`cp313t`,
`cp314t`). Unblocks `pip install` on Python 3.14 free-threaded builds —
previously the lock pinned `0.14.1`, which had no FT wheel and forced an
sdist build. Related to #34870.
Enable `pytest-retry` on the `integration_tests` Make target so live API
flakes (e.g. `test_batch` timing out against `api.anthropic.com`) no
longer fail the job on first miss. Matches the existing convention in
`libs/partners/groq`.
Add job-level `concurrency` to the scheduled integration tests so
per-package shards from overlapping workflow dispatches don't hit the
same live API credentials at once — e.g. a manually triggered
`anthropic` run colliding with the daily `all libs` run.
Scheduled integration runs set `LANGSMITH_TAGS` and `LANGSMITH_METADATA`
in `$GITHUB_ENV` (per #37615), but the LangSmith SDK does not read those
env vars natively, so the tags/metadata were silently dropped. A new
pytest plugin in `langchain-tests` bridges that gap by entering
`langsmith.run_helpers.tracing_context` for the duration of each
session.
Wire LangSmith tracing into the scheduled integration test workflow so
partner test runs emit traces to a shared project with GitHub Actions
metadata attached. Makes failures traceable back to the originating
workflow run, sha, and matrix shard without hunting through CI logs.
## Usage
- Filter the `oss-python-integration-tests` project by the
`github-actions` tag to see only CI runs; add a `sha-<sha>` or
`working_directory=<partner>` filter to narrow to a specific commit or
partner shard.
- From a failing trace, open the `github_run_url` metadata field to jump
directly to the originating workflow run; `github_run_attempt`
disambiguates reruns.
- Tags are flat strings (good for facets); structured fields like
`python_version` and `working_directory` live in metadata for richer
querying.
Fast-track companion to #37643.
GitHub's `workflow_dispatch` event is only discoverable when the
workflow file exists on the default branch — even though the workflow
code that runs comes from the `ref` passed to the dispatch. This PR
lands the `Middleware Evals` workflow file on master so that #37643
(which adds `libs/langchain_v1/tests/evals/`) can be dispatched against
the feature branch via:
```bash
gh workflow run middleware_evals.yml \
  --ref nh/todo-middleware-loop-contract \
  --field models='claude-sonnet-4-6,...'
```
without first merging the full eval framework.
## Caveats
- The workflow's pytest invocation depends on
`libs/langchain_v1/tests/evals/` and the partner SDK list, neither of
which exists on master yet. Dispatching with `--ref master` before
#37643 lands will fail at pytest collection. That's the intended
behavior — the workflow's purpose is to dispatch against branches that
ship the eval suite.
- Once #37643 merges to master, this workflow file already matches what
#37643 adds. The merge will be a no-op for `middleware_evals.yml`
itself.
`PIIMiddleware` previously scrubbed detected PII only at the state level
via its `after_model` / `before_model` hooks. Consumers reading the live
stream — `astream_events(version="v3")` or `run.messages` /
`run.tool_calls` / `run.values` — saw the raw model text, the raw
tool-call args, the raw tool outputs, and the raw state snapshots until
the run finished and the canonical conversation history was written.
This change registers a stream transformer ahead of
`MessagesTransformer` that redacts every wire surface of an agent run.
The transformer holds a sliding lookback buffer (default 128 characters)
per `(run_id, content-block index)` so PII patterns that straddle delta
boundaries are caught before the safe prefix is released downstream.
Anything older than the lookback is run through the configured detector
and emitted; the trailing tail stays buffered until a later delta
extends it past the cap or the block finishes. `_finalize_block` always
re-runs detection over the full block snapshot so the finalized content
lands fully redacted even when the in-flight buffer never released a
tail (short responses, or PII arriving in the final delta).
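The lookback idea can be illustrated with a simplified, standalone sketch. It is not the transformer's implementation: `LookbackRedactor` is a hypothetical class, a single email regex stands in for the configured detector, and the approach only holds when the lookback is longer than the longest pattern it needs to catch.

```python
import re

# Simplified stand-in for a configured PII detector.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


class LookbackRedactor:
    """Release redacted text while holding back the last `lookback` chars."""

    def __init__(self, pattern: re.Pattern = EMAIL, lookback: int = 128,
                 mask: str = "[REDACTED]") -> None:
        self.pattern = pattern
        self.lookback = lookback
        self.mask = mask
        self._buf = ""

    def feed(self, delta: str) -> str:
        """Add a delta; return the redacted prefix that is safe to emit."""
        self._buf += delta
        cut = len(self._buf) - self.lookback
        if cut <= 0:
            return ""  # everything is still inside the lookback window
        # Never cut through a match: move the cut back to the match start.
        for match in self.pattern.finditer(self._buf):
            if match.start() < cut < match.end():
                cut = match.start()
                break
        released, self._buf = self._buf[:cut], self._buf[cut:]
        return self.pattern.sub(self.mask, released)

    def finish(self) -> str:
        """Flush the buffered tail, redacted, when the block ends."""
        tail, self._buf = self._buf, ""
        return self.pattern.sub(self.mask, tail)
```

A consumer would call `feed` for every delta and `finish` once at block end; concatenating the returned pieces yields the redacted text even when a pattern is split across two deltas.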
The `block` strategy is now supported on the streaming path via a
buffering mode that withholds every delta until the block resolves —
clean blocks release the full text at finalize, PII-bearing blocks zero
the wire and let `after_model` / `apply_to_tool_results` raise
`PIIDetectionError` on the original state message. Activation is gated
on `apply_to_output=True`, matching the existing post-hoc semantics. The
middleware's transformer factory is cloned by `StreamMux._make_child`
into every subgraph scope, so attaching `PIIMiddleware` at the outer
agent also redacts streamed deltas from sub-agents invoked inside tools.
## Tool-call and tools-channel coverage
The transformer covers every wire surface of an agent run, not just AI
message text:
- **Streamed AI text deltas** (`content-block-delta` of type
`text-delta`) — lookback machinery, redacted in place.
- **Streamed tool-call args** (`content-block-delta` with
`tool_call_chunk` / `server_tool_call_chunk` fields) — each delta
carries the full cumulative args string; detection runs on the field
directly and redacts in place. Verified empirically against
`_compat_bridge.py` and the consumer-side
`_merge_block_delta_into_store` snapshot-replace semantics.
- **Finalized tool-call blocks** (`content-block-finish` with
`tool_call` / `server_tool_call` / `invalid_tool_call`) — `args` dict
walked recursively and each string leaf redacted.
- **Tool execution events on the `tools` channel** —
`tool-started.input`, `tool-output-delta`, `tool-finished.output`,
`tool-error.message` all run through detection. String deltas use the
same lookback machinery as text-deltas keyed by `tool_call_id`;
structured payloads walk recursively.
- **State snapshots on the `values` channel** — message lists are walked
and each message's `.content` is redacted on a fresh copy. Graph state
itself stays intact for the state-level enforcer
(`apply_to_tool_results` via `before_model`) to act on independently.
- **Legacy `(BaseMessage, metadata)` payloads** on the `messages`
channel (Python 3.10 path, where `langgraph`'s `ASYNCIO_ACCEPTS_CONTEXT
= sys.version_info >= (3, 11)` falls back to a code path that doesn't
propagate the streaming callback into the chat model) — `.content` and
`AIMessage.tool_calls[*].args` are scrubbed. For `block`, the event's
`data` tuple is replaced with an empty-content copy so the original
message stays in state for `after_model` to raise on.
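The recursive walk over structured payloads, as used for finalized tool-call `args` and structured tool outputs, can be sketched as follows. `redact_value` here is a simplified illustration with an email regex standing in for the configured detector; it does not model the `BaseMessage` branch.

```python
import re
from typing import Any

# Simplified stand-in for a configured PII detector.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_value(value: Any, pattern: re.Pattern = EMAIL,
                 mask: str = "[REDACTED]") -> Any:
    """Return a copy of `value` with every string leaf redacted."""
    if isinstance(value, str):
        return pattern.sub(mask, value)
    if isinstance(value, dict):
        return {k: redact_value(v, pattern, mask) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        # Rebuild the same container type from redacted items.
        return type(value)(redact_value(v, pattern, mask) for v in value)
    return value  # numbers, None, and other leaves pass through
```

Building new containers instead of mutating in place mirrors the copy-on-redact behaviour described for state snapshots, so the original payload remains available to state-level enforcement.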
## Worth a careful look
- `_PIIStreamTransformer._mutate_text_delta` — lookback partition.
Anything older than `lookback` characters is released after redaction;
the tail stays buffered. Bulletproof against whitespace-permissive
detectors (notably `credit_card`, whose regex matches across spaces).
- `_PIIStreamTransformer._mutate_tool_call_chunk_delta` — direct
in-place redaction of the cumulative args string. No buffer; the wire
shape is cumulative-snapshot, the consumer-side merge is
replace-not-append.
- `_PIIStreamTransformer._mutate_legacy_payload` — the dual path:
mutate-in-place for non-`block` (idempotent with `after_model`),
replace-with-empty-copy for `block` (keeps original in graph state for
`after_model` to raise on).
- `_PIIStreamTransformer._redact_value` — the recursive walker.
`BaseMessage` branch returns a fresh `.content`-redacted copy via
`model_copy(update=...)` — never mutates in place — so tool-output
payloads that wrap a `ToolMessage` and message lists in state snapshots
flow through cleanly.
- The new `transformers` attribute on `PIIMiddleware`: this is what
makes `create_agent` pick the factory up. Multiple `PIIMiddleware`
instances each register one transformer; ordering is preserved within
the `before_builtins` lane.
## Compatibility
Bumps `langgraph` to `>=1.2.1` for the `before_builtins` opt-in on
`StreamTransformer`.
Adds `aws`, `google-genai`, and `google-vertexai` to the manual-run
`working-directory` dropdown on the `⏰ Integration Tests` workflow.
These partners live in external repos (`langchain-ai/langchain-google`,
`langchain-ai/langchain-aws`) and were previously only reachable via the
free-form `working-directory-override` input despite the job already
checking them out into `libs/partners/`.
`test_no_overrides_DO_NOT_OVERRIDE` only treated an override as valid
when the method itself carried an `@pytest.mark.xfail(reason=...)`.
Overrides that re-parametrize a standard test and xfail only a subset of
cases via `pytest.param(..., marks=pytest.mark.xfail(...))` were
rejected.
Loosen the xAI integration tests to handle two recent provider
behaviors: `web_search` responses may now include block types beyond the
core trio, and streaming aggregation under `output_version="v1"` does
not produce a `tool_call` content block (tool calls are only available
on `.tool_calls`).
Bumps the `langchain-tests` minimum across the monorepo from `1.0.0` to
`1.1.9` and adds a partner-level `Makefile` so partner lockfiles can be
regenerated in one command, matching the existing convention under
`libs/`.
Reasoning-emitting chat models return `[reasoning, text]` content blocks
where vanilla models return `[text]`. The shared streaming integration
tests asserted exactly one block, which fails whenever a reasoning block
is present, even though streaming is otherwise correct.
The assertion is now relaxed to check for text presence without touching the lifecycle,
`chunk_position`, or `output_version` checks.
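A relaxed check along these lines could look like the sketch below. `has_text_block` is a hypothetical helper, not the name used in the shared tests, and it assumes content blocks are dicts with `type` and `text` keys.

```python
from typing import Any


def has_text_block(content: Any) -> bool:
    """True if the content carries non-empty text, whatever else it carries."""
    if isinstance(content, str):
        return bool(content)
    return any(
        isinstance(block, dict)
        and block.get("type") == "text"
        and bool(block.get("text"))
        for block in content
    )
```

Both `[text]` and `[reasoning, text]` responses satisfy this check, whereas an exact-length assertion accepts only the first.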
Adds a `transformers` attribute to `AgentMiddleware` so middleware can
declare scope-aware `StreamTransformer` factories alongside their
`tools` and lifecycle hooks. `create_agent` merges middleware-registered
factories with any caller-supplied ones at compile time.
## API
```python
class MyMiddleware(AgentMiddleware):
    transformers = (MyTransformer,)  # factory: (scope,) -> StreamTransformer
```
When the agent compiles, the final transformer order on the run mux is:
1. Built-in `ToolCallTransformer`
2. Middleware-registered factories, in middleware order
3. Caller-supplied `transformers=` from `create_agent`
This ordering keeps the built-in tool-call projection in front of any
consumer transformers and gives caller-supplied entries the final word.
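The merge order can be pictured with a small sketch. `merge_transformer_factories` is a hypothetical function for illustration; the actual merging happens inside `create_agent` and may differ in detail.

```python
from typing import Any, Callable, Iterable, Sequence

# A factory takes a scope and returns a transformer; both are opaque here.
Factory = Callable[[Any], Any]


def merge_transformer_factories(
    builtin: Sequence[Factory],
    middleware: Iterable[Any],
    caller_supplied: Sequence[Factory],
) -> list[Factory]:
    """Built-ins first, then middleware in order, then caller-supplied."""
    ordered: list[Factory] = list(builtin)
    for mw in middleware:
        # Middleware without a `transformers` attribute contributes nothing.
        ordered.extend(getattr(mw, "transformers", ()))
    ordered.extend(caller_supplied)
    return ordered
```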
`ChatFireworks` previously left `max_retries` at `None` (single attempt)
and only retried `APITimeoutError`. When the Fireworks edge drops TCP
without an HTTP response, the SDK wraps `httpx.RequestError` as a bare
`APIConnectionError`, which slipped past the retry decorator. Aligning
the default with the Fireworks SDK and `langchain-openai`
(`max_retries=2`) and broadening the retryable parent class closes both
gaps.
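The effect of broadening the retryable class can be shown with stand-in exceptions. The classes below are local placeholders, not the SDK's, and the assumption that the timeout error subclasses the connection error is inferred from the description above. A real retry policy would also add backoff between attempts, which is omitted here.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


# Local placeholders for the SDK's exception hierarchy (assumed shape).
class APIConnectionError(Exception):
    pass


class APITimeoutError(APIConnectionError):
    pass


def call_with_retries(
    fn: Callable[[], T],
    max_retries: int = 2,
    retry_on: tuple = (APIConnectionError,),
) -> T:
    """Call fn, retrying up to max_retries times on the given exceptions."""
    attempt = 0
    while True:
        try:
            return fn()
        except retry_on:
            if attempt >= max_retries:
                raise  # retries exhausted
            attempt += 1
```

Retrying on the parent class covers both timeouts and bare connection drops, while retrying only on the timeout subclass lets a dropped connection fail on the first attempt.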
Fireworks integration tests have been flaky against the live API with
429s. Adds a shared, xdist-aware rate limiter and a global retry policy
so transient rate-limit errors no longer fail the suite. Mirrors the
same fix recently applied to `langchain-mistralai`.
Addresses two unrelated nightly-CI failures rooted in upstream API drift:
OpenAI retired `gpt-4o-audio-preview` (now 404), and Azure embedding
deployments running `text-embedding-3-*` with truncated `dimensions` no
longer return unit-norm vectors.
Mistral integration tests have been flaky against the live API. This
adds a shared, xdist-aware rate limiter and a global retry policy so
transient 429s no longer fail the suite.