Commit Graph

762 Commits

Author SHA1 Message Date
Nick Hollon
da380bccf8 chore(infra): merge v1.4 into master (#37350) 2026-05-11 11:39:25 -07:00
Mason Daugherty
8b21400627 fix(core): avoid eager pydantic.v1 import in @deprecated (#37308)
`langchain_core._api.deprecation` previously did `from
pydantic.v1.fields import FieldInfo as FieldInfoV1` at module scope,
which triggers Pydantic's `UserWarning("Core Pydantic V1 functionality
isn't compatible with Python 3.14 or greater.")` on every
`langchain_core` import under 3.14+. The v1 symbol is only needed inside
one runtime branch of `@deprecated`, so it's now resolved lazily.

## Changes
- Replace the top-level v1 `FieldInfo` import with
`_is_pydantic_v1_field_info`, which probes
`sys.modules.get("pydantic.v1.fields")` instead of forcing the import.
The reconstruction inside `deprecated`'s `finalize` closure imports
`FieldInfoV1` lazily, gated by the predicate — so the warning only fires
if a caller has already loaded `pydantic.v1` themselves.
- Add a subprocess-based regression test asserting that importing
`langchain_core._api.deprecation` does not pull any `pydantic.v1*`
module into `sys.modules`. Verified to fail when the eager import is
reintroduced.
- Add a v1 `FieldInfo` decoration test — the v1 branch of `@deprecated`
previously had zero direct coverage.
- Update the stale `# Last Any should be FieldInfoV1 but this leads to
circular imports` comment on `T`'s bound, which no longer reflects the
real reason (it's about the 3.14 warning, not circularity).
2026-05-09 20:35:17 -04:00
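
The lazy-probe idea described in 8b21400627 can be sketched roughly as below. This is a minimal illustration and not the code from the commit: `is_instance_of_loaded_class` is a hypothetical helper (the commit's own predicate is `_is_pydantic_v1_field_info`). It looks the module up in `sys.modules` rather than importing it, so the check itself never triggers an import.

```python
import sys
from typing import Any


def is_instance_of_loaded_class(obj: Any, module_name: str, class_name: str) -> bool:
    """Return True only if `module_name` is already imported and `obj` is an
    instance of its `class_name`. The module is never imported here.

    Hypothetical helper for illustration; not part of langchain_core.
    """
    module = sys.modules.get(module_name)
    if module is None:
        # Nobody has loaded the module, so obj cannot be an instance of its class.
        return False
    cls = getattr(module, class_name, None)
    return isinstance(cls, type) and isinstance(obj, cls)
```

Under these assumptions, a call such as `is_instance_of_loaded_class(field, "pydantic.v1.fields", "FieldInfo")` returns `False` without importing anything when no caller has loaded `pydantic.v1`.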
Nick Hollon
c979c6187b fix(core, langchain): harden load() against untrusted manifests (#37197) 2026-05-05 14:36:58 -04:00
Mason Daugherty
a1f336fdc7 fix(core): preserve structured inputs on tool runs in tracers (#37108)
Tool runs in `_TracerCore._create_tool_run` were discarding the
structured `inputs` dict that `BaseTool.run` passes to `on_tool_start`,
replacing it with `{"input": str(filtered_tool_input)}`. Consequently,
every multi-arg tool (e.g. ones in `deepagents` like `execute`,
`edit_file`, `write_file`, `grep`, ...) appeared in LangSmith with a
stringified, escaped dump of its arguments — multi-line bash commands
rendered with `\n` and were effectively unreadable. Chain runs already
preserved dicts via `_get_chain_inputs`; tool runs are now symmetric.

## Changes
- Preserve `inputs` when it is already a `dict` in the `original` /
`original+chat` branch of `_TracerCore._create_tool_run`, falling back
to `{"input": input_str}` only when no structured payload was provided
- Add regression tests in the sync and async base-tracer suites that
pass a structured `inputs` to `on_tool_start` and assert the dict
survives onto the resulting `Run`

## Breaking change
Custom `BaseTracer` subclasses that parsed `Run.inputs["input"]` as a
stringified dict for tool runs will need to read the structured fields
directly. The shape now matches what `on_tool_start(inputs=...)` has
always received — introduced alongside `_schema_format` in the
`astream_events` work — and what `streaming_events` consumers already
see.
2026-04-30 14:56:14 -04:00
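
The fallback rule described in a1f336fdc7 can be pictured with a small sketch. `build_tool_run_inputs` is a hypothetical helper, not the body of `_TracerCore._create_tool_run`; it only shows the "keep the dict, otherwise wrap the string" decision.

```python
from typing import Any, Dict, Optional


def build_tool_run_inputs(input_str: str, inputs: Optional[Any] = None) -> Dict[str, Any]:
    """Hypothetical helper: keep a structured payload when one was provided."""
    if isinstance(inputs, dict):
        # Structured arguments survive as-is, so multi-line values stay readable.
        return inputs
    # No structured payload: fall back to the stringified input.
    return {"input": input_str}
```

With this shape, a consumer reads `run_inputs["command"]` directly rather than parsing a stringified dict out of `run_inputs["input"]`.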
Mason Daugherty
37be34be82 fix(core): make removal optional in warn_deprecated (#37056)
Drop the `NotImplementedError` branch in `warn_deprecated` so callers
can pass `pending=False` without specifying a `removal` version. The
previous behavior contradicted the docstring (which claimed an empty
default would auto-compute a removal version) — no such computation
existed; the function just raised a placeholder "Need to determine which
default deprecation schedule to use" error.
2026-04-28 11:05:31 -04:00
Sharvil Saxena
78546e9242 fix(core): validate batch_size in _batch and _abatch to prevent infinite loop (#36663) 2026-04-26 15:13:20 -04:00
Nick Hollon
9ce72eba9f feat(core): add content-block-centric streaming (v2) (#36834) 2026-04-24 11:36:17 -04:00
Hunter Lovell
9a671d7919 feat(core): allow _format_output to pass through list of ToolOutputMixin instances (#36963) 2026-04-23 13:49:46 -04:00
Jacob Lee
40026a7282 feat(core): Update inheritance behavior for tracer metadata for special keys (#36900)
JS equivalent: https://github.com/langchain-ai/langchainjs/pull/10733
2026-04-20 14:58:01 -07:00
Eugene Yurtsev
b00646d882 chore(core): keep checkpoint_ns behavior in streaming metadata for backwards compat (#36828)
minor buglet
2026-04-16 15:17:20 -04:00
Jacob Lee
c04e05feb1 feat(core): Add chat model and LLM invocation params to traceable metadata (#36771)
Equivalent to: https://github.com/langchain-ai/langchainjs/pull/10711/

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2026-04-16 18:30:54 +00:00
ccurme
338aa8131a fix(core): restore cloud metadata IPs and link-local range in SSRF policy (#36816) 2026-04-16 09:15:42 -04:00
ccurme
7d601dc2c6 chore(core): harden private SSRF utilities (#36768) 2026-04-15 16:13:20 -04:00
Jacob Lee
a6eb829701 fix(core): Use reference counting for storing inherited run trees to support garbage collection (#36660)
When a langsmith `@traceable` function invokes a LangChain Runnable or
LangGraph subgraph, the callback manager's `_configure` function injects
the `@traceable` RunTree into the `LangChainTracer`'s `run_map` so that
child runs can resolve their parent for trace nesting. However, since
the RunTree was created outside the tracer's callback lifecycle,
`_end_trace` never removes it. The entry persists in `run_map`
indefinitely, retaining the full RunTree and its entire child tree.

In applications with nested subgraph invocations (e.g. an outer
investigation graph delegating to skill agent subgraphs, each compiled
as their own `StateGraph`), this causes RunTree objects to accumulate
linearly with every call.

**Fix:** Track which `run_map` entries were injected externally via a
shared `_external_run_ids` refcount dict on `_TracerCore`. When
`_start_trace` adds a child under an external parent, it increments the
count. When `_end_trace` finishes a child, it decrements — and evicts
the external parent from `run_map` once the last child completes.

The refcount (rather than a simple set) is necessary because a single
external parent may have multiple sibling children in the callback chain
(e.g. a `prompt | llm` `RunnableSequence`). Only truly external runs are
tracked — the `_configure` guard `if run_id_str not in handler.run_map`
prevents tracer-managed runs from being misclassified.
2026-04-13 09:50:37 -04:00
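
The refcounting scheme described in a6eb829701 might look roughly like the sketch below. `RunRegistry` and its method names are hypothetical and greatly simplified; the commit itself keeps the counts in `_external_run_ids` on `_TracerCore` and updates them from `_start_trace` and `_end_trace`.

```python
from typing import Any, Dict, Optional


class RunRegistry:
    """Hypothetical, simplified stand-in for a tracer's run bookkeeping."""

    def __init__(self) -> None:
        self.run_map: Dict[str, Any] = {}
        # External parent id -> number of children still in flight.
        self.external_refs: Dict[str, int] = {}

    def inject_external(self, run_id: str, run: Any) -> None:
        """Register a parent that was created outside the tracer's lifecycle."""
        if run_id not in self.run_map:  # tracer-managed runs are never reclassified
            self.run_map[run_id] = run
            self.external_refs[run_id] = 0

    def start_child(self, run_id: str, run: Any, parent_id: Optional[str]) -> None:
        self.run_map[run_id] = run
        if parent_id in self.external_refs:
            self.external_refs[parent_id] += 1

    def end_child(self, run_id: str, parent_id: Optional[str]) -> None:
        self.run_map.pop(run_id, None)
        if parent_id in self.external_refs:
            self.external_refs[parent_id] -= 1
            if self.external_refs[parent_id] <= 0:
                # Last child finished: evict the external parent so it can be collected.
                del self.external_refs[parent_id]
                self.run_map.pop(parent_id, None)
```

A plain set could not tell "one of two children finished" apart from "the last child finished", which is why a count is kept. In this sketch, a parent that has been evicted must be injected again before a later child can attach to it.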
Eugene Yurtsev
af4d711a2f chore(core): reduce streaming metadata / perf (#36588)
- Look into reducing streaming metadata and improving performance

---------

Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
2026-04-10 10:47:54 -04:00
Eugene Yurtsev
af2ed47c6f fix(core): add more sanitization to templates (#36612)
add more sanitization to templates
2026-04-08 14:10:10 -04:00
ccurme
7629c74726 fix(core): handle symlinks in deprecated prompt save path (#36585)
Resolve symlinks before validating file extensions in the deprecated
`save()` method on prompt classes.

Credit to Jeff Ponte (@JDP-Security) for reporting the symlink
resolution issue.
2026-04-07 10:45:42 -04:00
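
The ordering described in 7629c74726 (resolve first, then validate) can be sketched as follows. `resolve_and_check_suffix` and the `ALLOWED_SUFFIXES` set are assumptions made for illustration; they are not taken from the `save()` implementation.

```python
from pathlib import Path

# Assumed allow-list, for illustration only.
ALLOWED_SUFFIXES = {".json", ".yaml", ".yml"}


def resolve_and_check_suffix(file_path: str) -> Path:
    """Hypothetical helper: validate the extension of the symlink-resolved path."""
    resolved = Path(file_path).resolve()  # follows symlinks to the real target
    if resolved.suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported file extension: {resolved.suffix!r}")
    return resolved
```

Checking the suffix of the unresolved path would accept a link named `prompt.json` that points at a file with a different extension; checking after `resolve()` closes that gap.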
Michael Chin
ebecdddb1b fix(core): add init validator and serialization mappings for Bedrock models (#34510)
Adds serialization mappings for `ChatBedrockConverse` and `BedrockLLM`
to unblock standard tests on `langchain-core>=1.2.5` (context:
[langchain-aws#821](https://github.com/langchain-ai/langchain-aws/pull/821)).
Also introduces a class-specific validator system in
`langchain_core.load` that blocks deserialization of AWS Bedrock models
when `endpoint_url` or `base_url` parameters are present, preventing
SSRF attacks via crafted serialized payloads.

Closes #34645

## Changes
- Add `ChatBedrockConverse` and `BedrockLLM` entries to
`SERIALIZABLE_MAPPING` in `mapping.py`, mapping legacy paths to their
`langchain_aws` import locations
- Add `validators.py` with `_bedrock_validator` — rejects
deserialization kwargs containing `endpoint_url` or `base_url` for all
Bedrock-related classes (`ChatBedrock`, `BedrockChat`,
`ChatBedrockConverse`, `ChatAnthropicBedrock`, `BedrockLLM`, `Bedrock`)
- `CLASS_INIT_VALIDATORS` registry covers both serialized (legacy) keys
and resolved import paths from `ALL_SERIALIZABLE_MAPPINGS`, preventing
bypass via direct-path payloads
- Move kwargs extraction and all validator checks
(`CLASS_INIT_VALIDATORS` + `init_validator`) in `Reviver.__call__` to
run **before** `importlib.import_module()` — fail fast on security
violations before executing third-party code
- Class-specific validators are independent of `init_validator` and
cannot be disabled by passing `init_validator=None`

## Testing
- `test_validator_registry_keys_in_serializable_mapping` — structural
invariant test ensuring every `CLASS_INIT_VALIDATORS` key exists in
`ALL_SERIALIZABLE_MAPPINGS`
- 10 end-to-end `load()` tests covering all Bedrock class paths (legacy
aliases, resolved import paths, `ChatAnthropicBedrock`,
`init_validator=None` bypass attempt)
- Unit tests for `_bedrock_validator` covering `endpoint_url`,
`base_url`, both params, and safe kwargs

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-04-03 19:22:39 -04:00
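
The "validate before import" ordering described in ebecdddb1b can be illustrated with a small sketch. Everything here is hypothetical: `VALIDATORS`, `revive`, and the `("example_pkg", "ExampleModel")` key are placeholders and do not reproduce `CLASS_INIT_VALIDATORS` or `Reviver.__call__`.

```python
import importlib
from typing import Any, Callable, Dict, Tuple

ClassPath = Tuple[str, ...]


def _reject_endpoint_overrides(class_path: ClassPath, kwargs: Dict[str, Any]) -> None:
    """Refuse kwargs that would redirect requests to a caller-chosen URL."""
    dotted = ".".join(class_path)
    for key in ("endpoint_url", "base_url"):
        if key in kwargs:
            raise ValueError(f"{dotted}: {key!r} is not allowed when deserializing")


# Illustrative registry; the key is a placeholder, not a real langchain path.
VALIDATORS: Dict[ClassPath, Callable[[ClassPath, Dict[str, Any]], None]] = {
    ("example_pkg", "ExampleModel"): _reject_endpoint_overrides,
}


def revive(class_path: ClassPath, kwargs: Dict[str, Any]) -> Any:
    """Run the class-specific validator, then import and instantiate."""
    validator = VALIDATORS.get(class_path)
    if validator is not None:
        validator(class_path, kwargs)  # fail fast: nothing has been imported yet
    *module_parts, name = class_path
    module = importlib.import_module(".".join(module_parts))
    return getattr(module, name)(**kwargs)
```

Because the validator runs before `importlib.import_module()`, a rejected payload never causes third-party module code to execute.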
ccurme
0b5f2c08ee fix(core): harden check for txt files in deprecated prompt loading functions (#36471) 2026-04-02 16:42:48 -04:00
jasiecky
c9f51aef85 fix(core): fixed typos in the documentation (#36459)
Fixes #36458 

Fixed typos in the documentation in the core module.
2026-04-02 11:32:12 -04:00
Weiguang Li
e6c1b29e80 fix(core): add "computer" to _WellKnownOpenAITools (#36261) 2026-03-29 08:54:42 -04:00
Jacob Lee
389f7ad1bc revert: Revert "fix(core): trace invocation params in metadata" (#36322) 2026-03-27 19:14:02 -04:00
ccurme
27add91347 fix(core): validate paths in prompt.save and load_prompt, deprecate methods (#36200) 2026-03-24 14:27:14 -04:00
Mason Daugherty
2f64d80cc6 fix(core,model-profiles): add missing ModelProfile fields, warn on schema drift (#36129)
PR #35788 added 7 new fields to the `langchain-profiles` CLI output
(`name`, `status`, `release_date`, `last_updated`, `open_weights`,
`attachment`, `temperature`) but didn't update `ModelProfile` in
`langchain-core`. Partner packages like `langchain-aws` that set
`extra="forbid"` on their Pydantic models hit `extra_forbidden`
validation errors when Pydantic encountered undeclared TypedDict keys at
construction time. This adds the missing fields, makes `ModelProfile`
forward-compatible, provides a base-class hook so partners can stop
duplicating model-profile validator boilerplate, migrates all in-repo
partners to the new hook, and adds runtime + CI-time warnings for schema
drift.

## Changes

### `langchain-core`
- Add `__pydantic_config__ = ConfigDict(extra="allow")` to
`ModelProfile` so unknown profile keys pass Pydantic validation even on
models with `extra="forbid"` — forward-compatibility for when the CLI
schema evolves ahead of core
- Declare the 7 missing fields on `ModelProfile`: `name`, `status`,
`release_date`, `last_updated`, `open_weights` (metadata) and
`attachment`, `temperature` (capabilities)
- Add `_warn_unknown_profile_keys()` in `model_profile.py` — emits a
`UserWarning` when a profile dict contains keys not in `ModelProfile`,
suggesting a core upgrade. Wrapped in a bare `except` so introspection
failures never crash model construction
- Add `BaseChatModel._resolve_model_profile()` hook that returns `None`
by default. Partners can override this single method instead of
redefining the full `_set_model_profile` validator — the base validator
calls it automatically
- Add `BaseChatModel._check_profile_keys` as a separate
`model_validator` that calls `_warn_unknown_profile_keys`. Uses a
distinct method name so partner overrides of `_set_model_profile` don't
inadvertently suppress the check

### `langchain-profiles` CLI
- Add `_warn_undeclared_profile_keys()` to the CLI (`cli.py`), called
after merging augmentations in `refresh()` — warns at profile-generation
time (not just runtime) when emitted keys aren't declared in
`ModelProfile`. Gracefully skips if `langchain-core` isn't installed
- Add guard test
`test_model_data_to_profile_keys_subset_of_model_profile` in
model-profiles — feeds a fully-populated model dict to
`_model_data_to_profile()` and asserts every emitted key exists in
`ModelProfile.__annotations__`. CI fails before any release if someone
adds a CLI field without updating the TypedDict

### Partner packages
- Migrate all 10 in-repo partners to the `_resolve_model_profile()`
hook, replacing duplicated `@model_validator` / `_set_model_profile`
overrides: anthropic, deepseek, fireworks, groq, huggingface, mistralai,
openai (base + azure), openrouter, perplexity, xai
- Anthropic retains custom logic (context-1m beta → `max_input_tokens`
override); all others reduce to a one-liner
- Add `pr_lint.yml` scope for the new `model-profiles` package
2026-03-23 00:44:27 -04:00
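
The schema-drift warning described in 2f64d80cc6 can be sketched as below. `ExampleProfile` and `warn_unknown_profile_keys` are hypothetical stand-ins for `ModelProfile` and `_warn_unknown_profile_keys()`, and the sketch catches `Exception` where the commit message mentions a bare `except`.

```python
import warnings
from typing import Any, Dict, Set, TypedDict


class ExampleProfile(TypedDict, total=False):
    """Hypothetical stand-in for a profile schema."""

    name: str
    max_input_tokens: int


def warn_unknown_profile_keys(profile: Dict[str, Any]) -> Set[str]:
    """Warn about keys the schema does not declare; never raise."""
    try:
        unknown = set(profile) - set(ExampleProfile.__annotations__)
    except Exception:
        # Introspection problems must not break model construction.
        return set()
    if unknown:
        warnings.warn(
            f"Unrecognized profile keys: {sorted(unknown)}. "
            "Consider upgrading the package that defines the schema.",
            UserWarning,
            stacklevel=2,
        )
    return unknown
```

The return value is only there to make the sketch easy to check; a warning-only helper could return `None`.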
Mason Daugherty
5ffece5c03 chore(core): remove stale blockbuster allowlist for deleted context module (#36168)
Closes #29530

---

Remove a stale BlockBuster allowlist entry in `conftest.py` referencing
`aconfig_with_context` — the function and its containing module
(`langchain_core/beta/runnables/context.py`) were deleted in `fded6c6b1`
(Sep 2025, #32850). Spotted by @antonio-mello-ai in #29530.
2026-03-22 20:39:55 -04:00
ccurme
70c88c0e72 fix(core): trace invocation params in metadata (#36080) 2026-03-18 13:20:18 -04:00
Eugene Yurtsev
dd136337d7 feat(core): harden anti-ssrf (#35960)
harden anti-ssrf

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-03-18 10:41:43 -04:00
Mohammad Mohtashim
b21c0a8062 fix(core): preserve default_factory when generating tool call schema (#35550) 2026-03-08 15:34:21 -04:00
Mason Daugherty
61fd90a2f3 fix(core): extract usage metadata from serialized tracer message outputs (#35526)
Fixes missing `run.metadata.usage_metadata` population in
`LangChainTracer` for real LLM/chat traces following #34414.

- Fix extraction to read usage from serialized tracer message shape:
`outputs.generations[*][*].message.kwargs.usage_metadata`
- Remove non-serialized direct message shape handling
(`message.usage_metadata`) from extractor to match real tracer output
path
- Clarify tracer docstrings around chat callback naming
(`on_chat_model_start` + shared `on_llm_end`) to reduce ambiguity

## Why

#34414 introduced usage duplication into `run.metadata.usage_metadata`,
but the extractor read `message.usage_metadata`.

In real tracer flow, messages are serialized with `dumpd(...)` during
run completion, so usage metadata lives under
`message.kwargs.usage_metadata`. Because of this mismatch, duplication
did not trigger in real traces.
2026-03-02 17:43:33 -05:00
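
The serialized path named in 61fd90a2f3, `outputs.generations[*][*].message.kwargs.usage_metadata`, can be walked with a sketch like the one below. `iter_usage_metadata` is a hypothetical helper, and the sample payload shape is assumed from that path alone.

```python
from typing import Any, Dict, Iterator


def iter_usage_metadata(outputs: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield each usage dict found at generations[*][*].message.kwargs.usage_metadata."""
    for batch in outputs.get("generations") or []:
        for generation in batch or []:
            message = generation.get("message") if isinstance(generation, dict) else None
            if not isinstance(message, dict):
                continue
            kwargs = message.get("kwargs")
            usage = kwargs.get("usage_metadata") if isinstance(kwargs, dict) else None
            if isinstance(usage, dict):
                yield usage
```

The sketch deliberately ignores the non-serialized shape (`message.usage_metadata`), mirroring the commit's decision to handle only the serialized form.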
Guofang.Tang
78678534f9 fix(core): treat empty tool chunk ids as missing in merge (#35414) 2026-02-24 18:12:49 -05:00
Tanzim Hossain Romel
2d1492a864 fix(core): improve error message for non-JSON-serializable tool schemas (#34376) 2026-02-22 17:32:00 -05:00
Balaji Seshadri
d6e46bb4b0 fix(core): improve typing/docs for on_chat_model_start to clarify required positional args (#35324) 2026-02-22 14:46:32 -05:00
yaowubarbara
5053436dcf fix(core): fix merge_lists incorrectly merging parallel tool calls (#35281) 2026-02-18 20:33:17 -05:00
Shivangi Sharma
3686bcbd96 fix(core): accept int temperature in _get_ls_params for LangSmith tracing (#35302) 2026-02-18 18:30:02 -05:00
ccurme
9c160e2368 revert: accept integer temperature values in _get_ls_params (#35319) 2026-02-18 18:19:28 -05:00
Balaji Seshadri
a9f3627229 fix(core): accept integer temperature values in _get_ls_params (#35317) 2026-02-18 17:52:13 -05:00
KarthikRed2000
a565cf85eb fix(core): prevent recursion error when args_schema is dict (#35260) 2026-02-17 17:47:17 -05:00
Mason Daugherty
ba3ad67328 fix(core): preserve index and timestamp fields when merging (#34731)
Porting https://github.com/langchain-ai/langchainjs/pull/9781
2026-02-17 11:29:41 -05:00
Christophe Bornet
b97c629f9a style: bump ruff version to 0.15 (#35042) 2026-02-12 19:34:02 -05:00
ccurme
16cabfa212 fix(core): fix setting ChatGeneration.text (#35191) 2026-02-12 15:43:53 -05:00
ccurme
2b4b1dc29a fix(openai): sanitize urls when counting tokens in images (#35143) 2026-02-10 15:25:10 -05:00
ccurme
e8e47b083e feat(core): count tokens from tool schemas in count_tokens_approximately (#35098) 2026-02-09 14:10:44 -05:00
ccurme
032d01dd0f fix(core): adjust cap when scaling approximate token counts (#35017) 2026-02-04 19:02:48 -05:00
ccurme
5981ee142c fix(core): apply cap when scaling approximate token counts (#35005) 2026-02-03 21:20:48 -05:00
ccurme
09654f4382 feat(core): allow scaling by reported usage when counting tokens approximately (#34996) 2026-02-03 15:19:18 -05:00
Mason Daugherty
ae5b50f37f test(core): increase delta_time for flaky test (#34982)
This regularly flaked
2026-02-02 13:33:15 -05:00
Nandana Dileep
ef067078b8 fix(core): fix nested mustache variable extraction and update docs (#34872) 2026-01-31 21:30:57 -05:00
Akshaya Shanbhogue
84864d77ed fix(core): allow base model annotations for empty model (#34932) 2026-01-31 20:49:02 -05:00
Louis Auneau
f5252b438e fix(core): google docstring parsing with no arguments/reserved arguments (#34861) 2026-01-30 22:48:58 -05:00
Lewis Whitehill
0c9d392d41 test(core): add tests for approximate token counting with multimodal messages (#34898) 2026-01-30 12:35:16 -08:00