Automated refresh of model profile data for all in-monorepo partner
integrations via `langchain-profiles refresh`.
🤖 Generated by the `refresh_model_profiles` workflow.
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Bumps the uv pin in `.github/actions/uv_setup/action.yml` from `0.11.15`
to [`0.11.17`](https://github.com/astral-sh/uv/releases/tag/0.11.17).
Opened automatically by `bump_uv_pin.yml`. Mirror availability on
`releases.astral.sh` was verified before this PR was created, so CI
should not race the fallback.
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Automated refresh of model profile data for all in-monorepo partner
integrations via `langchain-profiles refresh`.
🤖 Generated by the `refresh_model_profiles` workflow.
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Automated refresh of model profile data for all in-monorepo partner
integrations via `langchain-profiles refresh`.
🤖 Generated by the `refresh_model_profiles` workflow.
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Local pre-commit already verified `pyproject.toml`/version-file sync for
`core` and `langchain_v1`, but not `anthropic` — even though the
`check_core_versions.yml` CI workflow already gates all three. This
wires the existing `anthropic` check into pre-commit so the mismatch
gets caught before commit rather than only in CI.
Anthropic's API rejects `tool_use`/`tool_result` IDs that don't match
`^[a-zA-Z0-9_-]+$` with a 400. When a conversation thread is replayed
across providers — e.g. a user switches a running thread from Kimi (via
Fireworks) to Claude — the prior turns carry tool-call IDs minted by the
other provider (Kimi emits `functions.write_todos:0`, whose `.` and `:`
are invalid), and the request fails. Tool-call IDs are now normalized to
an Anthropic-compatible form during request formatting, with the
original `tool_use.id` and its paired `tool_use_id` mapped identically
so they stay linked.
Automated refresh of model profile data for all in-monorepo partner
integrations via `langchain-profiles refresh`.
🤖 Generated by the `refresh_model_profiles` workflow.
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
## Description
Bumps `langchain-perplexity` to require the Perplexity SDK release with
fixed Responses streaming and removes the temporary SSE shim workaround.
## Release Note
`langchain-perplexity` now requires `perplexityai>=0.34.1` for Responses
API streaming.
## Test Plan
- [x] `NO_COLOR=1 uv run --group test pytest
tests/unit_tests/test_chat_models_responses.py --disable-socket
--allow-unix-socket`
_Opened collaboratively by Mason Daugherty and open-swe._
---------
Co-authored-by: open-swe[bot] <open-swe@users.noreply.github.com>
Co-authored-by: Mason Daugherty <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Ports langchain-ai/deepagents#3626. Adds a workflow that clears the
`waiting-on-author` label from an issue or PR as soon as the original
author posts a follow-up comment, closing the manual loop where
maintainers had to remember to strip the label after a reply landed.
`_sanitize_chat_completions_content` now filters every content-part dict
against an allowlist derived at import time from the `fireworks-ai`
SDK's `ContentUnionMember1` TypedDict, and runs on every message role —
not just `ToolMessage`. Fixes 400s of the form `Extra inputs are not
permitted, field:
'messages[N].content.list[ChatMessageContent][i].<key>'` when
cross-provider history (e.g. an Anthropic-shaped `AIMessage` carrying
the v1 streaming-reassembly `index` marker) is forwarded to a
Fireworks-hosted model.
Automated refresh of model profile data for all in-monorepo partner
integrations via `langchain-profiles refresh`.
🤖 Generated by the `refresh_model_profiles` workflow.
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Closes#37360
Adds a `use_responses_api` flag to `ChatPerplexity` so requests can be
routed through Perplexity's Agent API (the Perplexity-flavored Responses
API) in addition to the existing Chat Completions endpoint. This mirrors
the `use_responses_api` flag on `ChatOpenAI`.
## Motivation
Perplexity exposes two HTTP surfaces from the same SDK client object:
`client.chat.completions.create()` (Chat Completions) and
`client.responses.create()` (Agent API, OpenAI-compatible Responses
shape). The Agent API supports built-in tools (`web_search`,
`fetch_url`, `finance_search`, `people_search`), `instructions`,
`input`, `previous_response_id`, and `include` — none of which exist on
Chat Completions. Today `ChatPerplexity` only calls Chat Completions, so
users who want the Agent API have to drop down to the raw SDK.
## What this changes
- New field `use_responses_api: bool | None = None` on `ChatPerplexity`.
- New module-level helper `_use_responses_api(payload)` that returns
`True` when the payload contains a built-in tool (any `tools[*]` whose
`type` is not `"function"`) or any of the Responses-only fields
`previous_response_id`, `instructions`, `input`, `include`.
- New instance method `ChatPerplexity._use_responses_api(payload)` that
honors `self.use_responses_api` when it is a `bool`, otherwise delegates
to the module helper.
- New converters `_convert_responses_to_chat_result(response)` and
`_convert_responses_stream_event_to_chunk(event)` that translate Agent
API objects/events into `AIMessage` and `AIMessageChunk` (preserving
`usage_metadata`, `response_metadata`, citations, images, related
questions, search results, and `function_call` tool calls).
- A surgical `_to_responses_payload(...)` helper that renames `messages`
→ `input` and `max_tokens` → `max_output_tokens`, passes through
Responses-supported fields, and parks anything Perplexity-specific under
`extra_body`.
- Each of the four API call sites (`_stream`, `_astream`, `_generate`,
`_agenerate`) now branches on `self._use_responses_api(payload)`. The
Chat Completions path is untouched.
## Auto-detection rules
When `use_responses_api` is unset (the default), routing is decided per
call from the outgoing payload:
- Has a built-in tool? → Responses
- Has `previous_response_id`, `instructions`, `input`, or `include`? →
Responses
- Otherwise → Chat Completions
Explicit `use_responses_api=True` or `=False` always overrides
auto-detection.
## Backwards compatibility
Existing usage is unchanged.
`ChatPerplexity(model="sonar").invoke("hi")` still calls
`client.chat.completions.create()`. No public field was renamed or
removed; the new field is purely additive.
## Tests
Adds `tests/unit_tests/test_chat_models_responses.py` covering the
helper, auto-detect routing, explicit overrides in both directions,
response-to-`AIMessage` conversion (content, `usage_metadata`,
`response_metadata.id`), `function_call` → `tool_calls` conversion, and
sync + async streaming of `response.output_text.delta` and
`response.completed` events. All mocks use `MagicMock`/`AsyncMock`; no
network calls.
## Notes for reviewers
This was implemented with help from an AI agent. The shape mirrors
`langchain-openai`'s `use_responses_api` — same field name, same helper
name, same docstring style — so the diff should be familiar.
Closes nothing — net new feature.
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
- Bumps `langchain` to **1.3.2** (patch)
- Raises minimum `langgraph` requirement from `>=1.2.1` to `>=1.2.2`
langgraph 1.2.2 fixes a race condition where DeltaChannel checkpoint
writes
could serialize `BaseMessage` objects with `id=None` before
`apply_writes()`
ran the reducer, causing the same message to appear with a different ID
on
every `get_state()` call and across resumed invocations
(langchain-ai/langgraph#7913).
The lockfile will be updated once langgraph 1.2.2 is published to PyPI
(langchain-ai/langgraph#7914).
Refresh `langchain-core`'s lockfile so the dev/CI environment resolves
`uuid-utils` to a release that ships free-threading wheels (`cp313t`,
`cp314t`). Unblocks `pip install` on Python 3.14 free-threaded builds —
previously the lock pinned `0.14.1`, which had no FT wheel and forced an
sdist build. Related to #34870.
Enable `pytest-retry` on the `integration_tests` Make target so live API
flakes (e.g. `test_batch` timing out against `api.anthropic.com`) no
longer fail the job on first miss. Matches the existing convention in
`libs/partners/groq`.
Automated refresh of model profile data for all in-monorepo partner
integrations via `langchain-profiles refresh`.
🤖 Generated by the `refresh_model_profiles` workflow.
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Automated refresh of model profile data for all in-monorepo partner
integrations via `langchain-profiles refresh`.
🤖 Generated by the `refresh_model_profiles` workflow.
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Add job-level `concurrency` to the scheduled integration tests so
per-package shards from overlapping workflow dispatches don't hit the
same live API credentials at once — e.g. a manually triggered
`anthropic` run colliding with the daily `all libs` run.
Scheduled integration runs set `LANGSMITH_TAGS` and `LANGSMITH_METADATA`
in `$GITHUB_ENV` (per #37615), but the LangSmith SDK does not read those
env vars natively, so the tags/metadata were silently dropped. A new
pytest plugin in `langchain-tests` bridges that gap by entering
`langsmith.run_helpers.tracing_context` for the duration of each
session.
Wire LangSmith tracing into the scheduled integration test workflow so
partner test runs emit traces to a shared project with GitHub Actions
metadata attached. Makes failures traceable back to the originating
workflow run, sha, and matrix shard without hunting through CI logs.
## Usage
- Filter the `oss-python-integration-tests` project by the
`github-actions` tag to see only CI runs; add a `sha-<sha>` or
`working_directory=<partner>` filter to narrow to a specific commit or
partner shard.
- From a failing trace, open the `github_run_url` metadata field to jump
directly to the originating workflow run; `github_run_attempt`
disambiguates reruns.
- Tags are flat strings (good for facets); structured fields like
`python_version` and `working_directory` live in metadata for richer
querying.
Fast-track companion to #37643.
GitHub's `workflow_dispatch` event is only discoverable when the
workflow file exists on the default branch — even though the workflow
code that runs comes from the `ref` passed to the dispatch. This PR
lands the `Middleware Evals` workflow file on master so that #37643
(which adds `libs/langchain_v1/tests/evals/`) can be dispatched against
the feature branch via:
```bash
gh workflow run middleware_evals.yml \
--ref nh/todo-middleware-loop-contract \
--field models='claude-sonnet-4-6,...'
```
without first merging the full eval framework.
## Caveats
- The workflow's pytest invocation depends on
`libs/langchain_v1/tests/evals/` and the partner SDK list, neither of
which exists on master yet. Dispatching with `--ref master` before
#37643 lands will fail at pytest collection. That's the intended
behavior — the workflow's purpose is to dispatch against branches that
ship the eval suite.
- Once #37643 merges to master, this workflow file already matches what
#37643 adds. The merge will be a no-op for `middleware_evals.yml`
itself.
`PIIMiddleware` previously scrubbed detected PII only at the state level
via its `after_model` / `before_model` hooks. Consumers reading the live
stream — `astream_events(version="v3")` or `run.messages` /
`run.tool_calls` / `run.values` — saw the raw model text, the raw
tool-call args, the raw tool outputs, and the raw state snapshots until
the run finished and the canonical conversation history was written.
This change registers a stream transformer ahead of
`MessagesTransformer` that redacts every wire surface of an agent run.
The transformer holds a sliding lookback buffer (default 128 characters)
per `(run_id, content-block index)` so PII patterns that straddle delta
boundaries are caught before the safe prefix is released downstream.
Anything older than the lookback is run through the configured detector
and emitted; the trailing tail stays buffered until a later delta
extends it past the cap or the block finishes. `_finalize_block` always
re-runs detection over the full block snapshot so the finalized content
lands fully redacted even when the in-flight buffer never released a
tail (short responses, or PII arriving in the final delta).
The `block` strategy is now supported on the streaming path via a
buffering mode that withholds every delta until the block resolves —
clean blocks release the full text at finalize, PII-bearing blocks zero
the wire and let `after_model` / `apply_to_tool_results` raise
`PIIDetectionError` on the original state message. Activation is gated
on `apply_to_output=True`, matching the existing post-hoc semantics. The
middleware's transformer factory is cloned by `StreamMux._make_child`
into every subgraph scope, so attaching `PIIMiddleware` at the outer
agent also redacts streamed deltas from sub-agents invoked inside tools.
## Tool-call and tools-channel coverage
The transformer covers every wire surface of an agent run, not just AI
message text:
- **Streamed AI text deltas** (`content-block-delta` of type
`text-delta`) — lookback machinery, redacted in place.
- **Streamed tool-call args** (`content-block-delta` with
`tool_call_chunk` / `server_tool_call_chunk` fields) — each delta
carries the full cumulative args string; detection runs on the field
directly and redacts in place. Verified empirically against
`_compat_bridge.py` and the consumer-side
`_merge_block_delta_into_store` snapshot-replace semantics.
- **Finalized tool-call blocks** (`content-block-finish` with
`tool_call` / `server_tool_call` / `invalid_tool_call`) — `args` dict
walked recursively and each string leaf redacted.
- **Tool execution events on the `tools` channel** —
`tool-started.input`, `tool-output-delta`, `tool-finished.output`,
`tool-error.message` all run through detection. String deltas use the
same lookback machinery as text-deltas keyed by `tool_call_id`;
structured payloads walk recursively.
- **State snapshots on the `values` channel** — message lists are walked
and each message's `.content` is redacted on a fresh copy. Graph state
itself stays intact for the state-level enforcer
(`apply_to_tool_results` via `before_model`) to act on independently.
- **Legacy `(BaseMessage, metadata)` payloads** on the `messages`
channel (Python 3.10 path, where `langgraph`'s `ASYNCIO_ACCEPTS_CONTEXT
= sys.version_info >= (3, 11)` falls back to a code path that doesn't
propagate the streaming callback into the chat model) — `.content` and
`AIMessage.tool_calls[*].args` are scrubbed. For `block`, the event's
`data` tuple is replaced with an empty-content copy so the original
message stays in state for `after_model` to raise on.
## Worth a careful look
- `_PIIStreamTransformer._mutate_text_delta` — lookback partition.
Anything older than `lookback` characters is released after redaction;
the tail stays buffered. Bulletproof against whitespace-permissive
detectors (notably `credit_card`, whose regex matches across spaces).
- `_PIIStreamTransformer._mutate_tool_call_chunk_delta` — direct
in-place redaction of the cumulative args string. No buffer; the wire
shape is cumulative-snapshot, the consumer-side merge is
replace-not-append.
- `_PIIStreamTransformer._mutate_legacy_payload` — the dual path:
mutate-in-place for non-`block` (idempotent with `after_model`),
replace-with-empty-copy for `block` (keeps original in graph state for
`after_model` to raise on).
- `_PIIStreamTransformer._redact_value` — the recursive walker.
`BaseMessage` branch returns a fresh `.content`-redacted copy via
`model_copy(update=...)` — never mutates in place — so tool-output
payloads that wrap a `ToolMessage` and message lists in state snapshots
flow through cleanly.
- The new `transformers` attribute on `PIIMiddleware`: this is what
makes `create_agent` pick the factory up. Multiple `PIIMiddleware`
instances each register one transformer; ordering is preserved within
the `before_builtins` lane.
## Compatibility
Bumps `langgraph` to `>=1.2.1` for the `before_builtins` opt-in on
`StreamTransformer`.
Automated refresh of model profile data for all in-monorepo partner
integrations via `langchain-profiles refresh`.
🤖 Generated by the `refresh_model_profiles` workflow.
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Adds `aws`, `google-genai`, and `google-vertexai` to the manual-run
`working-directory` dropdown on the `⏰ Integration Tests` workflow.
These partners live in external repos (`langchain-ai/langchain-google`,
`langchain-ai/langchain-aws`) and were previously only reachable via the
free-form `working-directory-override` input despite the job already
checking them out into `libs/partners/`.
`test_no_overrides_DO_NOT_OVERRIDE` only treated an override as valid
when the method itself carried an `@pytest.mark.xfail(reason=...)`.
Overrides that re-parametrize a standard test and xfail only a subset of
cases via `pytest.param(..., marks=pytest.mark.xfail(...))` were
rejected.
Loosen the xAI integration tests to handle two recent provider
behaviors: `web_search` responses may now include block types beyond the
core trio, and streaming aggregation under `output_version="v1"` does
not produce a `tool_call` content block (tool calls are only available
on `.tool_calls`).
Bumps the `langchain-tests` minimum across the monorepo from `1.0.0` to
`1.1.9` and adds a partner-level `Makefile` so partner lockfiles can be
regenerated in one command, matching the existing convention under
`libs/`.
Reasoning-emitting chat models return `[reasoning, text]` content blocks
where vanilla models return `[text]`. The shared streaming integration
tests asserted exactly one block, which fails when reasoning blocks are
returned when streaming is otherwise correct.
Relaxed to assert text presence without touching the lifecycle,
`chunk_position`, or `output_version` checks.
Adds a `transformers` attribute to `AgentMiddleware` so middleware can
declare scope-aware `StreamTransformer` factories alongside their
`tools` and lifecycle hooks. `create_agent` merges middleware-registered
factories with any caller-supplied ones at compile time.
## API
```python
class MyMiddleware(AgentMiddleware):
transformers = (MyTransformer,) # factory: (scope,) -> StreamTransformer
```
When the agent compiles, the final transformer order on the run mux is:
1. Built-in ``ToolCallTransformer``
2. Middleware-registered factories, in middleware order
3. Caller-supplied ``transformers=`` from ``create_agent``
This ordering keeps the built-in tool-call projection in front of any
consumer transformers and gives caller-supplied entries the final word.
`ChatFireworks` previously left `max_retries` at `None` (single attempt)
and only retried `APITimeoutError`. When the Fireworks edge drops TCP
without an HTTP response, the SDK wraps `httpx.RequestError` as a bare
`APIConnectionError`, which slipped past the retry decorator. Aligning
the default with the Fireworks SDK and `langchain-openai`
(`max_retries=2`) and broadening the retryable parent class closes both
gaps.
Fireworks integration tests have been flaky against the live API with
429s. Adds a shared, xdist-aware rate limiter and a global retry policy
so transient rate-limit errors no longer fail the suite. Mirrors the
same fix recently applied to `langchain-mistralai`.
Two unrelated nightly-CI failures rooted in upstream API drift. OpenAI
retired `gpt-4o-audio-preview` (now 404) and Azure embedding deployments
running `text-embedding-3-*` with truncated `dimensions` no longer
return unit-norm vectors.
Mistral integration tests have been flaky against the live API. This
adds a shared, xdist-aware rate limiter and a global retry policy so
transient 429s no longer fail the suite.
Closes#37172
---
Bumps `langchain-fireworks` to the rewritten `fireworks-ai` 1.x SDK
(currently 1.2.0a*; Stainless-generated, pure-httpx, no
`grpcio`/`protobuf`/`googleapis-common-protos`).
The motivating bug is a startup crash in self-hosted LangGraph
environments that also import `langchain-google-vertexai`. Importing
`fireworks-ai` 0.19.x eagerly loads vendored grpcio protobuf modules
under `fireworks.control_plane.generated.protos_grpcio.*`, which
register `google/rpc/status.proto`, `google/api/*.proto`, and
`google/longrunning/*.proto` in the default protobuf descriptor pool.
When `langchain-google-vertexai` later triggers
`google.api_core.exceptions` → `grpc_status.rpc_status` →
`google.rpc.status_pb2`, the pool already holds a byte-different
descriptor for `google/rpc/status.proto` and startup dies with:
```
TypeError: Couldn't build proto file into descriptor pool:
duplicate file name google/rpc/status.proto
```
Fleet has been pinning around this by routing Fireworks through
`ChatOpenAI` against the OpenAI-compat endpoint, which works for
inference but means Fireworks `ModelProfile` data never loads — so Kimi
K2.6's ~262k context window goes unrecognized and summarization triggers
below limit.
The 1.x SDK does not vendor protobuf at all. The control-plane gRPC code
path is gone; chat inference goes over httpx. Verified locally that
`import langchain_fireworks` and `from langchain_fireworks import
ChatFireworks` load zero `_pb2` / `google.*` modules.
## What changed in `ChatFireworks`
- Imports switch from `fireworks.client` to the top-level `fireworks`
package.
- Async path now `await client.chat.completions.create(...)`; the 0.x
`acreate` shim is no longer used.
- Error classes remapped to the 1.x hierarchy. `InvalidRequestError` →
`BadRequestError`. `BadGatewayError` and `ServiceUnavailableError` no
longer exist (1.x maps all `>=500` to `InternalServerError`) and were
dropped from the retryable set with no loss of coverage.
`FireworksContextOverflowError`'s parent class becomes
`BadRequestError`.
- `stream_options` is moved into the SDK's `extra_body` because the
Stainless-generated `create()` signature does not model it as a typed
kwarg. Top-level `stream_options` is preserved as a caller convenience;
if a caller supplies both `extra_body["stream_options"]` and a top-level
value, `extra_body` wins and the discarded value is logged.
- The 0.x `(connect, read)` tuple form of `request_timeout` is
normalized to an `httpx.Timeout` so existing user code keeps working.
- The SDK's built-in retry layer is suppressed via `max_retries=0` on
client construction so retries remain owned by
`create_base_retry_decorator` and surface through the LangChain
`run_manager`.
## Lifecycle methods
Adds `close()` and `aclose()` on `ChatFireworks`. The 1.x
`AsyncFireworks` client defaults to `httpx_aiohttp.HttpxAiohttpClient`,
whose underlying aiohttp `ClientSession` is created lazily on first
request. Sync-only paths therefore never open a session — which fixes
the "Unclosed client session" warnings from #37172 at the source.
Callers using async paths can now release the connector
deterministically rather than relying on GC after the event loop has
stopped. An autouse fixture in the integration `conftest.py` calls
`aclose()` between tests to silence the corresponding `Unclosed
connector` warning that surfaces under `pytest-asyncio`.
## Relation to #37227
Supersedes #37227. That PR monkey-patched
`fireworks._util.is_running_in_async_context` and
`fireworks.client.api_client.is_running_in_async_context` to suppress
the 0.x SDK's eager `aiohttp.ClientSession` creation in async contexts.
Both module paths are removed in 1.x; the SDK's lazy-session behavior
makes the suppression unnecessary, and the explicit `aclose()` provides
the cleaner long-term lifecycle hook. Thanks to @keenborder786 for
surfacing the failure mode.
## Installation note
`fireworks-ai` 1.x is currently published as an alpha (`1.2.0a*`); a
stable 1.x is not yet out. `pip install langchain-fireworks` / `uv pip
install langchain-fireworks` will need `--pre` (or `--prerelease=allow`)
until Fireworks GAs 1.x. The `pyproject.toml` adds `[tool.uv] prerelease
= "allow"` so the in-repo dev environment resolves cleanly. The package
version is bumped to `1.4.0` — the public surface (`ChatFireworks`,
`Fireworks`, `FireworksEmbeddings`) is unchanged; the breakage is
confined to internal error classes and the transitive SDK.