Commit Graph

756 Commits

Author SHA1 Message Date
Sydney Runkle
63dd915ae8 perf(core): cache _create_subset_model results with lru_cache
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 18:00:35 -04:00
Sydney Runkle
f54de97145 perf(core): invalidate cached tool_call_schema and args on field mutation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 18:00:35 -04:00
Jacob Lee
40026a7282 feat(core): Update inheritance behavior for tracer metadata for special keys (#36900)
JS equivalent: https://github.com/langchain-ai/langchainjs/pull/10733
2026-04-20 14:58:01 -07:00
Eugene Yurtsev
b00646d882 chore(core): keep checkpoint_ns behavior in streaming metadata for backwards compat (#36828)
minor buglet
2026-04-16 15:17:20 -04:00
Jacob Lee
c04e05feb1 feat(core): Add chat model and LLM invocation params to traceable metadata (#36771)
Equivalent to: https://github.com/langchain-ai/langchainjs/pull/10711/

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2026-04-16 18:30:54 +00:00
ccurme
338aa8131a fix(core): restore cloud metadata IPs and link-local range in SSRF policy (#36816) 2026-04-16 09:15:42 -04:00
ccurme
7d601dc2c6 chore(core): harden private SSRF utilities (#36768) 2026-04-15 16:13:20 -04:00
Jacob Lee
a6eb829701 fix(core): Use reference counting for storing inherited run trees to support garbage collection (#36660)
When a langsmith `@traceable` function invokes a LangChain Runnable or
LangGraph subgraph, the callback manager's `_configure` function injects
the `@traceable` RunTree into the `LangChainTracer`'s `run_map` so that
child runs can resolve their parent for trace nesting. However, since
the RunTree was created outside the tracer's callback lifecycle,
`_end_trace` never removes it. The entry persists in `run_map`
indefinitely, retaining the full RunTree and its entire child tree.

In applications with nested subgraph invocations (e.g. an outer
investigation graph delegating to skill agent subgraphs, each compiled
as their own `StateGraph`), this causes RunTree objects to accumulate
linearly with every call.

**Fix:** Track which `run_map` entries were injected externally via a
shared `_external_run_ids` refcount dict on `_TracerCore`. When
`_start_trace` adds a child under an external parent, it increments the
count. When `_end_trace` finishes a child, it decrements — and evicts
the external parent from `run_map` once the last child completes.

The refcount (rather than a simple set) is necessary because a single
external parent may have multiple sibling children in the callback chain
(e.g. a `prompt | llm` `RunnableSequence`). Only truly external runs are
tracked — the `_configure` guard `if run_id_str not in handler.run_map`
prevents tracer-managed runs from being misclassified.
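
A minimal sketch of the refcount idea described above. `TracerSketch` and its methods are hypothetical stand-ins for illustration, not the real `_TracerCore` API:

```python
from __future__ import annotations


class TracerSketch:
    """Toy stand-in for a tracer; names here are illustrative only."""

    def __init__(self) -> None:
        self.run_map: dict[str, dict] = {}
        # external parent run id -> number of children still running
        self.external_run_ids: dict[str, int] = {}

    def inject_external(self, run_id: str) -> None:
        # Mirrors the guard in the text: only runs the tracer does not
        # already manage are classified as external.
        if run_id not in self.run_map:
            self.run_map[run_id] = {"id": run_id, "parent": None}
            self.external_run_ids.setdefault(run_id, 0)

    def start_trace(self, run_id: str, parent_id: str | None = None) -> None:
        self.run_map[run_id] = {"id": run_id, "parent": parent_id}
        if parent_id in self.external_run_ids:
            self.external_run_ids[parent_id] += 1

    def end_trace(self, run_id: str) -> None:
        run = self.run_map.pop(run_id)
        parent_id = run["parent"]
        if parent_id in self.external_run_ids:
            self.external_run_ids[parent_id] -= 1
            if self.external_run_ids[parent_id] <= 0:
                # Last child finished: evict the external parent as well.
                del self.external_run_ids[parent_id]
                self.run_map.pop(parent_id, None)
```

With two sibling children under one external parent, the parent stays in `run_map` after the first child ends and is evicted only after the second. A plain set would have dropped it too early.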
2026-04-13 09:50:37 -04:00
Eugene Yurtsev
af4d711a2f chore(core): reduce streaming metadata / perf (#36588)
- looking into reducing streaming metadata / perf

---------

Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
2026-04-10 10:47:54 -04:00
Eugene Yurtsev
af2ed47c6f fix(core): add more sanitization to templates (#36612)
add more sanitization to templates
2026-04-08 14:10:10 -04:00
ccurme
7629c74726 fix(core): handle symlinks in deprecated prompt save path (#36585)
Resolve symlinks before validating file extensions in the deprecated
`save()` method on prompt classes.
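
A rough illustration of why the order matters, assuming a hypothetical helper `checked_save_path` and an assumed suffix allowlist (neither is taken from the actual change):

```python
from pathlib import Path

# Assumed allowlist, for illustration only.
ALLOWED_SUFFIXES = {".json", ".yaml", ".yml"}


def checked_save_path(file_path: str) -> Path:
    """Resolve symlinks first, then validate the extension of the real target."""
    resolved = Path(file_path).resolve()
    if resolved.suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported file extension: {resolved.suffix!r}")
    return resolved
```

If the check ran on the unresolved path, a symlink named `prompt.json` that points at a file with another extension would pass validation.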

Credit to Jeff Ponte (@JDP-Security) for reporting the symlink
resolution issue.
2026-04-07 10:45:42 -04:00
Michael Chin
ebecdddb1b fix(core): add init validator and serialization mappings for Bedrock models (#34510)
Adds serialization mappings for `ChatBedrockConverse` and `BedrockLLM`
to unblock standard tests on `langchain-core>=1.2.5` (context:
[langchain-aws#821](https://github.com/langchain-ai/langchain-aws/pull/821)).
Also introduces a class-specific validator system in
`langchain_core.load` that blocks deserialization of AWS Bedrock models
when `endpoint_url` or `base_url` parameters are present, preventing
SSRF attacks via crafted serialized payloads.

Closes #34645

## Changes
- Add `ChatBedrockConverse` and `BedrockLLM` entries to
`SERIALIZABLE_MAPPING` in `mapping.py`, mapping legacy paths to their
`langchain_aws` import locations
- Add `validators.py` with `_bedrock_validator` — rejects
deserialization kwargs containing `endpoint_url` or `base_url` for all
Bedrock-related classes (`ChatBedrock`, `BedrockChat`,
`ChatBedrockConverse`, `ChatAnthropicBedrock`, `BedrockLLM`, `Bedrock`)
- `CLASS_INIT_VALIDATORS` registry covers both serialized (legacy) keys
and resolved import paths from `ALL_SERIALIZABLE_MAPPINGS`, preventing
bypass via direct-path payloads
- Move kwargs extraction and all validator checks
(`CLASS_INIT_VALIDATORS` + `init_validator`) in `Reviver.__call__` to
run **before** `importlib.import_module()` — fail fast on security
violations before executing third-party code
- Class-specific validators are independent of `init_validator` and
cannot be disabled by passing `init_validator=None`
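
The fail-fast ordering above could look roughly like the following. All names (`CLASS_VALIDATORS_SKETCH`, `revive_sketch`, the `legacy_pkg` / `resolved_pkg` paths) are hypothetical and only illustrate the pattern, not the code in `langchain_core.load`:

```python
from typing import Any, Callable

BLOCKED_KWARGS = frozenset({"endpoint_url", "base_url"})


def bedrock_validator_sketch(class_path: tuple, kwargs: dict) -> None:
    """Reject kwargs that could redirect requests to an arbitrary host."""
    blocked = sorted(BLOCKED_KWARGS.intersection(kwargs))
    if blocked:
        raise ValueError(f"{'.'.join(class_path)}: disallowed kwargs {blocked}")


# Register both the legacy key and the resolved import path (both made up
# here) so a payload cannot bypass the check by naming the class directly.
CLASS_VALIDATORS_SKETCH: dict = {
    ("legacy_pkg", "ChatBedrock"): bedrock_validator_sketch,
    ("resolved_pkg", "chat_models", "ChatBedrock"): bedrock_validator_sketch,
}


def revive_sketch(payload: dict, import_class: Callable[[tuple], Any]) -> Any:
    class_path = tuple(payload["id"])
    kwargs = payload.get("kwargs", {})
    validator = CLASS_VALIDATORS_SKETCH.get(class_path)
    if validator is not None:
        validator(class_path, kwargs)  # runs before any import happens
    cls = import_class(class_path)  # third-party code executes only after checks
    return cls(**kwargs)
```

Because the lookup does not consult any caller-supplied validator, there is nothing for a caller to switch off in this sketch.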

## Testing
- `test_validator_registry_keys_in_serializable_mapping` — structural
invariant test ensuring every `CLASS_INIT_VALIDATORS` key exists in
`ALL_SERIALIZABLE_MAPPINGS`
- 10 end-to-end `load()` tests covering all Bedrock class paths (legacy
aliases, resolved import paths, `ChatAnthropicBedrock`,
`init_validator=None` bypass attempt)
- Unit tests for `_bedrock_validator` covering `endpoint_url`,
`base_url`, both params, and safe kwargs

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-04-03 19:22:39 -04:00
ccurme
0b5f2c08ee fix(core): harden check for txt files in deprecated prompt loading functions (#36471) 2026-04-02 16:42:48 -04:00
jasiecky
c9f51aef85 fix(core): fixed typos in the documentation (#36459)
Fixes #36458 

Fixed typos in the documentation in the core module.
2026-04-02 11:32:12 -04:00
Weiguang Li
e6c1b29e80 fix(core): add "computer" to _WellKnownOpenAITools (#36261) 2026-03-29 08:54:42 -04:00
Jacob Lee
389f7ad1bc revert: Revert "fix(core): trace invocation params in metadata" (#36322) 2026-03-27 19:14:02 -04:00
ccurme
27add91347 fix(core): validate paths in prompt.save and load_prompt, deprecate methods (#36200) 2026-03-24 14:27:14 -04:00
Mason Daugherty
2f64d80cc6 fix(core,model-profiles): add missing ModelProfile fields, warn on schema drift (#36129)
PR #35788 added 7 new fields to the `langchain-profiles` CLI output
(`name`, `status`, `release_date`, `last_updated`, `open_weights`,
`attachment`, `temperature`) but didn't update `ModelProfile` in
`langchain-core`. Partner packages like `langchain-aws` that set
`extra="forbid"` on their Pydantic models hit `extra_forbidden`
validation errors when Pydantic encountered undeclared TypedDict keys at
construction time. This adds the missing fields, makes `ModelProfile`
forward-compatible, provides a base-class hook so partners can stop
duplicating model-profile validator boilerplate, migrates all in-repo
partners to the new hook, and adds runtime + CI-time warnings for schema
drift.

## Changes

### `langchain-core`
- Add `__pydantic_config__ = ConfigDict(extra="allow")` to
`ModelProfile` so unknown profile keys pass Pydantic validation even on
models with `extra="forbid"` — forward-compatibility for when the CLI
schema evolves ahead of core
- Declare the 7 missing fields on `ModelProfile`: `name`, `status`,
`release_date`, `last_updated`, `open_weights` (metadata) and
`attachment`, `temperature` (capabilities)
- Add `_warn_unknown_profile_keys()` in `model_profile.py` — emits a
`UserWarning` when a profile dict contains keys not in `ModelProfile`,
suggesting a core upgrade. Wrapped in a bare `except` so introspection
failures never crash model construction
- Add `BaseChatModel._resolve_model_profile()` hook that returns `None`
by default. Partners can override this single method instead of
redefining the full `_set_model_profile` validator — the base validator
calls it automatically
- Add `BaseChatModel._check_profile_keys` as a separate
`model_validator` that calls `_warn_unknown_profile_keys`. Uses a
distinct method name so partner overrides of `_set_model_profile` don't
inadvertently suppress the check
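
As a simplified sketch of the unknown-key warning, using a made-up `ProfileSketch` TypedDict rather than the real `ModelProfile` (and `except Exception` where the description mentions a bare `except`):

```python
import warnings
from typing import Any, TypedDict


class ProfileSketch(TypedDict, total=False):
    """Hypothetical stand-in with only two declared fields."""

    name: str
    max_input_tokens: int


def warn_unknown_profile_keys(profile: dict[str, Any]) -> None:
    """Warn, but never raise, when a profile has undeclared keys."""
    try:
        declared = set(ProfileSketch.__annotations__)
        unknown = sorted(set(profile) - declared)
    except Exception:
        # Introspection problems must not break model construction.
        return
    if unknown:
        warnings.warn(
            f"Unrecognized profile keys: {unknown}. Consider upgrading core.",
            UserWarning,
            stacklevel=2,
        )
```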

### `langchain-profiles` CLI
- Add `_warn_undeclared_profile_keys()` to the CLI (`cli.py`), called
after merging augmentations in `refresh()` — warns at profile-generation
time (not just runtime) when emitted keys aren't declared in
`ModelProfile`. Gracefully skips if `langchain-core` isn't installed
- Add guard test
`test_model_data_to_profile_keys_subset_of_model_profile` in
model-profiles — feeds a fully-populated model dict to
`_model_data_to_profile()` and asserts every emitted key exists in
`ModelProfile.__annotations__`. CI fails before any release if someone
adds a CLI field without updating the TypedDict

### Partner packages
- Migrate all 10 in-repo partners to the `_resolve_model_profile()`
hook, replacing duplicated `@model_validator` / `_set_model_profile`
overrides: anthropic, deepseek, fireworks, groq, huggingface, mistralai,
openai (base + azure), openrouter, perplexity, xai
- Anthropic retains custom logic (context-1m beta → `max_input_tokens`
override); all others reduce to a one-liner
- Add `pr_lint.yml` scope for the new `model-profiles` package
2026-03-23 00:44:27 -04:00
Mason Daugherty
5ffece5c03 chore(core): remove stale blockbuster allowlist for deleted context module (#36168)
Closes #29530

---

Remove a stale BlockBuster allowlist entry in `conftest.py` referencing
`aconfig_with_context` — the function and its containing module
(`langchain_core/beta/runnables/context.py`) were deleted in `fded6c6b1`
(Sep 2025, #32850). Spotted by @antonio-mello-ai in #29530.
2026-03-22 20:39:55 -04:00
ccurme
70c88c0e72 fix(core): trace invocation params in metadata (#36080) 2026-03-18 13:20:18 -04:00
Eugene Yurtsev
dd136337d7 feat(core): harden anti-ssrf (#35960)
harden anti-ssrf

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-03-18 10:41:43 -04:00
Mohammad Mohtashim
b21c0a8062 fix(core): preserve default_factory when generating tool call schema (#35550) 2026-03-08 15:34:21 -04:00
Mason Daugherty
61fd90a2f3 fix(core): extract usage metadata from serialized tracer message outputs (#35526)
Fixes missing `run.metadata.usage_metadata` population in
`LangChainTracer` for real LLM/chat traces following #34414.

- Fix extraction to read usage from serialized tracer message shape:
`outputs.generations[*][*].message.kwargs.usage_metadata`
- Remove non-serialized direct message shape handling
(`message.usage_metadata`) from extractor to match real tracer output
path
- Clarify tracer docstrings around chat callback naming
(`on_chat_model_start` + shared `on_llm_end`) to reduce ambiguity

## Why

#34414 introduced usage duplication into `run.metadata.usage_metadata`,
but the extractor read `message.usage_metadata`.

In real tracer flow, messages are serialized with `dumpd(...)` during
run completion, so usage metadata lives under
`message.kwargs.usage_metadata`. Because of this mismatch, duplication
did not trigger in real traces.
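
To make the shape mismatch concrete, here is a small sketch that walks the serialized path named above. `extract_usage_sketch` is a hypothetical helper, and returning a flat list is an assumption of this example, not a description of the tracer's behavior:

```python
from typing import Any


def extract_usage_sketch(outputs: dict[str, Any]) -> list[dict[str, Any]]:
    """Collect usage dicts from generations[*][*].message.kwargs.usage_metadata."""
    found: list[dict[str, Any]] = []
    for batch in outputs.get("generations", []):
        for generation in batch:
            message = generation.get("message")
            if not isinstance(message, dict):
                continue
            # Serialized messages keep their fields under "kwargs".
            usage = message.get("kwargs", {}).get("usage_metadata")
            if usage:
                found.append(usage)
    return found
```

A payload that stores usage directly at `message.usage_metadata` yields nothing here, which is the mismatch the commit describes, seen from the other side.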
2026-03-02 17:43:33 -05:00
Guofang.Tang
78678534f9 fix(core): treat empty tool chunk ids as missing in merge (#35414) 2026-02-24 18:12:49 -05:00
Tanzim Hossain Romel
2d1492a864 fix(core): improve error message for non-JSON-serializable tool schemas (#34376) 2026-02-22 17:32:00 -05:00
Balaji Seshadri
d6e46bb4b0 fix(core): improve typing/docs for on_chat_model_start to clarify required positional args (#35324) 2026-02-22 14:46:32 -05:00
yaowubarbara
5053436dcf fix(core): fix merge_lists incorrectly merging parallel tool calls (#35281) 2026-02-18 20:33:17 -05:00
Shivangi Sharma
3686bcbd96 fix(core): accept int temperature in _get_ls_params for LangSmith tracing (#35302) 2026-02-18 18:30:02 -05:00
ccurme
9c160e2368 revert: accept integer temperature values in _get_ls_params (#35319) 2026-02-18 18:19:28 -05:00
Balaji Seshadri
a9f3627229 fix(core): accept integer temperature values in _get_ls_params (#35317) 2026-02-18 17:52:13 -05:00
KarthikRed2000
a565cf85eb fix(core): prevent recursion error when args_schema is dict (#35260) 2026-02-17 17:47:17 -05:00
Mason Daugherty
ba3ad67328 fix(core): preserve index and timestamp fields when merging (#34731)
Porting https://github.com/langchain-ai/langchainjs/pull/9781
2026-02-17 11:29:41 -05:00
Christophe Bornet
b97c629f9a style: bump ruff version to 0.15 (#35042) 2026-02-12 19:34:02 -05:00
ccurme
16cabfa212 fix(core): fix setting ChatGeneration.text (#35191) 2026-02-12 15:43:53 -05:00
ccurme
2b4b1dc29a fix(openai): sanitize urls when counting tokens in images (#35143) 2026-02-10 15:25:10 -05:00
ccurme
e8e47b083e feat(core): count tokens from tool schemas in count_tokens_approximately (#35098) 2026-02-09 14:10:44 -05:00
ccurme
032d01dd0f fix(core): adjust cap when scaling approximate token counts (#35017) 2026-02-04 19:02:48 -05:00
ccurme
5981ee142c fix(core): apply cap when scaling approximate token counts (#35005) 2026-02-03 21:20:48 -05:00
ccurme
09654f4382 feat(core): allow scaling by reported usage when counting tokens approximately (#34996) 2026-02-03 15:19:18 -05:00
Mason Daugherty
ae5b50f37f test(core): increase delta_time for flaky test (#34982)
This regularly flaked
2026-02-02 13:33:15 -05:00
Nandana Dileep
ef067078b8 fix(core): fix nested mustache variable extraction and update docs (#34872) 2026-01-31 21:30:57 -05:00
Akshaya Shanbhogue
84864d77ed fix(core): allow base model annotations for empty model (#34932) 2026-01-31 20:49:02 -05:00
Louis Auneau
f5252b438e fix(core): google docstring parsing with no arguments/reserved arguments (#34861) 2026-01-30 22:48:58 -05:00
Lewis Whitehill
0c9d392d41 test(core): add tests for approximate token counting with multimodal messages (#34898) 2026-01-30 12:35:16 -08:00
zer0
6ff8436fb0 fix(core): raise outputparserexception for unknown tools (#34923) 2026-01-30 09:35:31 -08:00
cc
585b691c1d feat(core): add multimodal support to count_tokens_approximately (#34883) 2026-01-26 15:04:25 -08:00
Mason Daugherty
11df1bedc3 style(core): lint (#34862)
It looks scary, but I promise it is not.

Improves documentation consistency across core. Primarily updates
docstrings and comments for better formatting, readability, and
accuracy, and adds minor clarifications and formatting
improvements to user-facing documentation.
2026-01-23 23:07:48 -05:00
Weichen Zhao
d899681040 feat(core): add XML format option for get_buffer_string (#34802)
## Summary

Add XML format option for `get_buffer_string()` to provide unambiguous
message serialization. This fixes role prefix ambiguity when message
content contains strings like "Human:" or "AI:".

Fixes #34786

## Changes

- Add `format="xml"` parameter with proper XML escaping using
`quoteattr()` for attributes
- Add explicit validation for format parameter (raises `ValueError` for
invalid values)
- Add comprehensive tests for XML format edge cases
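
A hedged sketch of the ambiguity and the XML remedy. The element name `message`, the attribute name `type`, and the `"prefix"` default are assumptions made for this example; the real output of `get_buffer_string()` may differ:

```python
from xml.sax.saxutils import escape, quoteattr


def buffer_string_sketch(messages: list, format: str = "prefix") -> str:
    """Serialize (role, content) pairs as prefixed lines or as XML elements."""
    if format not in ("prefix", "xml"):
        raise ValueError(f"Unsupported format: {format!r}")
    lines = []
    for role, content in messages:
        if format == "xml":
            # quoteattr() quotes and escapes the attribute value;
            # escape() handles &, < and > in the element body.
            lines.append(f"<message type={quoteattr(role)}>{escape(content)}</message>")
        else:
            lines.append(f"{role.capitalize()}: {content}")
    return "\n".join(lines)
```

In prefix mode, content that itself begins with `AI:` is indistinguishable from a role prefix. In XML mode the role lives in an attribute, so the content cannot be mistaken for one.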

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-22 13:33:08 -05:00
David Fernandez
5b401fa414 refactor(core): generalize comma_list utility to support any Iterable (#34714)
Updates `comma_list` in `libs/core/langchain_core/utils/strings.py` to
accept `Iterable[Any]` instead of `list[Any]`, making the utility more
flexible.
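
For illustration, a utility of this kind might look as follows. `comma_list_sketch` is a stand-in written for this example, not a copy of the function in `strings.py`:

```python
from collections.abc import Iterable
from typing import Any


def comma_list_sketch(items: Iterable[Any]) -> str:
    """Join the string form of each item with ', '."""
    return ", ".join(str(item) for item in items)
```

Typing the parameter as `Iterable[Any]` lets callers pass tuples, generators, or dict keys without first building a list.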

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-01-12 20:26:59 -05:00
Mason Daugherty
0b99ca4fcd docs(core): enhance docstrings for ToolCall and ToolCallChunk (#34719) 2026-01-12 15:50:28 -05:00