PR #35788 added 7 new fields to the `langchain-profiles` CLI output
(`name`, `status`, `release_date`, `last_updated`, `open_weights`,
`attachment`, `temperature`) but didn't update `ModelProfile` in
`langchain-core`. Partner packages like `langchain-aws` that set
`extra="forbid"` on their Pydantic models hit `extra_forbidden`
validation errors when Pydantic encountered undeclared TypedDict keys at
construction time. This adds the missing fields, makes `ModelProfile`
forward-compatible, provides a base-class hook so partners can stop
duplicating model-profile validator boilerplate, migrates all in-repo
partners to the new hook, and adds runtime + CI-time warnings for schema
drift.
## Changes
### `langchain-core`
- Add `__pydantic_config__ = ConfigDict(extra="allow")` to
`ModelProfile` so unknown profile keys pass Pydantic validation even on
models with `extra="forbid"` — forward-compatibility for when the CLI
schema evolves ahead of core
- Declare the 7 missing fields on `ModelProfile`: `name`, `status`,
`release_date`, `last_updated`, `open_weights` (metadata) and
`attachment`, `temperature` (capabilities)
- Add `_warn_unknown_profile_keys()` in `model_profile.py` — emits a
`UserWarning` when a profile dict contains keys not in `ModelProfile`,
suggesting a core upgrade. Wrapped in a bare `except` so introspection
failures never crash model construction
- Add `BaseChatModel._resolve_model_profile()` hook that returns `None`
by default. Partners can override this single method instead of
redefining the full `_set_model_profile` validator — the base validator
calls it automatically
- Add `BaseChatModel._check_profile_keys` as a separate
`model_validator` that calls `_warn_unknown_profile_keys`. Uses a
distinct method name so partner overrides of `_set_model_profile` don't
inadvertently suppress the check
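To illustrate the hook, here is a minimal sketch of how a partner package might override `_resolve_model_profile()` instead of redefining the full validator. The class name, profile data, and import paths below are assumptions for illustration, not code from this PR:

```python
from typing import Optional

from langchain_core.language_models import ModelProfile  # assumed import path
from langchain_core.language_models.chat_models import BaseChatModel


class ChatExample(BaseChatModel):
    """Hypothetical partner chat model; only the override below is the new pattern."""

    model: str = "example-model-1"

    def _resolve_model_profile(self) -> Optional[ModelProfile]:
        # The base `_set_model_profile` validator calls this hook and applies the
        # returned profile; returning None keeps the default (no profile set).
        profiles: dict[str, ModelProfile] = {
            "example-model-1": {"max_input_tokens": 128_000},
        }
        return profiles.get(self.model)
```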
### `langchain-profiles` CLI
- Add `_warn_undeclared_profile_keys()` to the CLI (`cli.py`), called
after merging augmentations in `refresh()` — warns at profile-generation
time (not just runtime) when emitted keys aren't declared in
`ModelProfile`. Gracefully skips if `langchain-core` isn't installed
- Add guard test
`test_model_data_to_profile_keys_subset_of_model_profile` in
model-profiles — feeds a fully-populated model dict to
`_model_data_to_profile()` and asserts every emitted key exists in
`ModelProfile.__annotations__`. CI fails before any release if someone
adds a CLI field without updating the TypedDict
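A sketch of what that guard test could look like; the import paths and the fixture dict are assumptions, while `_model_data_to_profile` and `ModelProfile.__annotations__` are the names used above:

```python
from langchain_core.language_models import ModelProfile  # assumed import path
from langchain_profiles.cli import _model_data_to_profile  # assumed import path

# Stand-in for a fixture with every field the CLI knows how to emit.
FULLY_POPULATED_MODEL_DATA: dict = {}


def test_model_data_to_profile_keys_subset_of_model_profile() -> None:
    profile = _model_data_to_profile(FULLY_POPULATED_MODEL_DATA)
    unknown = set(profile) - set(ModelProfile.__annotations__)
    assert not unknown, f"CLI emits keys not declared on ModelProfile: {sorted(unknown)}"
```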
### Partner packages
- Migrate all 10 in-repo partners to the `_resolve_model_profile()`
hook, replacing duplicated `@model_validator` / `_set_model_profile`
overrides: anthropic, deepseek, fireworks, groq, huggingface, mistralai,
openai (base + azure), openrouter, perplexity, xai
- Anthropic retains custom logic (context-1m beta → `max_input_tokens`
override); all others reduce to a one-liner
- Add `pr_lint.yml` scope for the new `model-profiles` package
Closes #29530
---
Remove a stale BlockBuster allowlist entry in `conftest.py` referencing
`aconfig_with_context` — the function and its containing module
(`langchain_core/beta/runnables/context.py`) were deleted in `fded6c6b1`
(Sep 2025, #32850). Spotted by @antonio-mello-ai in #29530.
Fixes missing `run.metadata.usage_metadata` population in
`LangChainTracer` for real LLM/chat traces following #34414
- Fix extraction to read usage from serialized tracer message shape:
`outputs.generations[*][*].message.kwargs.usage_metadata`
- Remove handling of the non-serialized direct message shape
(`message.usage_metadata`) from the extractor, matching the real tracer
output path
- Clarify tracer docstrings around chat callback naming
(`on_chat_model_start` + shared `on_llm_end`) to reduce ambiguity
## Why
#34414 introduced duplicating usage data into `run.metadata.usage_metadata`,
but the extractor read `message.usage_metadata`.
In the real tracer flow, messages are serialized with `dumpd(...)` during
run completion, so usage metadata lives under
`message.kwargs.usage_metadata`. Because of this mismatch, the duplication
never triggered in real traces.
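For reference, a sketch (not the tracer's actual code) of walking the serialized shape described above:

```python
from typing import Any


def collect_usage_metadata(outputs: dict[str, Any]) -> list[dict[str, Any]]:
    """Collect usage metadata from dumpd-serialized chat generations."""
    collected = []
    for generation_list in outputs.get("generations", []):
        for generation in generation_list:
            message = generation.get("message") or {}
            # After `dumpd(...)`, message fields live under "kwargs".
            usage = (message.get("kwargs") or {}).get("usage_metadata")
            if usage:
                collected.append(usage)
    return collected
```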
It looks scary, but I promise it is not.
Improves documentation consistency across core: primarily updates
docstrings and comments for better formatting, readability, and accuracy,
and adds minor clarifications and formatting improvements to user-facing
documentation.
## Summary
Add XML format option for `get_buffer_string()` to provide unambiguous
message serialization. This fixes role prefix ambiguity when message
content contains strings like "Human:" or "AI:".
Fixes #34786
## Changes
- Add `format="xml"` parameter with proper XML escaping using
`quoteattr()` for attributes
- Add explicit validation for format parameter (raises `ValueError` for
invalid values)
- Add comprehensive tests for XML format edge cases
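A usage sketch, assuming the parameter is spelled `format="xml"` as described; the exact XML output is not shown since it depends on the implementation:

```python
from langchain_core.messages import AIMessage, HumanMessage, get_buffer_string

messages = [
    HumanMessage(content="The transcript contains the literal text 'AI: hello'"),
    AIMessage(content="Understood."),
]

# Prefix-based rendering is ambiguous when content itself contains "Human:" /
# "AI:"; the XML format escapes content so roles stay unambiguous.
print(get_buffer_string(messages, format="xml"))
```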
<img width="1952" height="706" alt="image"
src="https://github.com/user-attachments/assets/1cd6f887-9365-43cf-a532-72d7addd8bad"
/>
<img width="2786" height="776" alt="image"
src="https://github.com/user-attachments/assets/a07b0db0-519c-46d7-b34b-b404237d812b"
/>
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Updates `comma_list` in `libs/core/langchain_core/utils/strings.py` to
accept `Iterable[Any]` instead of `list[Any]`, making the utility more
flexible.
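For example (a small sketch, assuming the implementation still joins stringified items), a generator is now accepted as well as a list:

```python
from langchain_core.utils.strings import comma_list

print(comma_list(["apples", "oranges"]))      # "apples, oranges"
print(comma_list(str(n) for n in range(3)))   # "0, 1, 2"
```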
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
# Add `tool_call_id` to `on_tool_error` event data
## Summary
This PR addresses issue #33597 by adding `tool_call_id` to the
`on_tool_error` callback event data. This enables users to link tool
errors to specific tool calls in stateless agent implementations, which
is essential for building OpenAI-compatible APIs and tracking tool
execution flows.
## Problem
When streaming events using `astream_events` with `version="v2"`, the
`on_tool_error` event only included the error and input data, but lacked
the `tool_call_id`. This made it difficult to:
- Link errors to specific tool calls in stateless agent scenarios
- Implement OpenAI-compatible APIs that require tool call tracking
- Track tool execution flows when `run_id` alone is not sufficient
## Solution
The fix adds `tool_call_id` propagation through the callback chain:
1. **Pass `tool_call_id` to callbacks**: Updated `BaseTool.run()` and
`BaseTool.arun()` to pass `tool_call_id` to both `on_tool_start` and
`on_tool_error` callbacks
2. **Store in event stream handler**: Modified
`_AstreamEventsCallbackHandler` to store `tool_call_id` in run info
during `on_tool_start`
3. **Include in error events**: Updated `on_tool_error` handler to
extract and include `tool_call_id` in the event data
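To illustrate what this propagation enables, here is a hedged sketch of a custom handler that links errors to tool calls; the handler and its storage are illustrative, and only the presence of `tool_call_id` in the callback kwargs is what this PR adds:

```python
from typing import Any
from uuid import UUID

from langchain_core.callbacks import BaseCallbackHandler


class ToolErrorLinker(BaseCallbackHandler):
    """Remember which tool_call_id each tool run belongs to and report failures."""

    def __init__(self) -> None:
        self._tool_call_ids: dict[UUID, Any] = {}

    def on_tool_start(self, serialized, input_str, *, run_id: UUID, **kwargs: Any) -> None:
        # With this PR, `tool_call_id` is forwarded here when it is known.
        self._tool_call_ids[run_id] = kwargs.get("tool_call_id")

    def on_tool_error(self, error: BaseException, *, run_id: UUID, **kwargs: Any) -> None:
        tool_call_id = kwargs.get("tool_call_id") or self._tool_call_ids.get(run_id)
        print(f"Tool call {tool_call_id!r} failed: {error}")
```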
## Changes
- **`libs/core/langchain_core/tools/base.py`**:
- Pass `tool_call_id` to `on_tool_start` in both sync and async methods
- Pass `tool_call_id` to `on_tool_error` when errors occur
- **`libs/core/langchain_core/tracers/event_stream.py`**:
- Store `tool_call_id` in run info during `on_tool_start`
- Extract `tool_call_id` from kwargs or run info in `on_tool_error`
- Include `tool_call_id` in the `on_tool_error` event data
## Testing
The fix was verified by:
1. Direct tool invocation: Confirmed `tool_call_id` appears in
`on_tool_error` event data when calling tools directly
2. Agent integration: Tested with `create_agent` to ensure
`tool_call_id` is present in error events during agent execution
```python
# Example verification (agent built via `create_agent` with a tool that raises)
async for event in agent.astream_events(
    {"messages": "Please demonstrate a tool error"},
    version="v2",
):
    if event["event"] == "on_tool_error":
        assert "tool_call_id" in event["data"]  # ✓ Now passes
        print(event["data"]["tool_call_id"])
```
## Backward Compatibility
- ✅ Fully backward compatible: `tool_call_id` is optional (can be
`None`)
- ✅ No breaking changes: All changes are additive
- ✅ Existing code continues to work without modification
## Related Issues
Fixes #33597
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
## Changes
I have fixed the issue where a generic and misleading error message was
displayed when a JSON schema was missing the top-level `title` key.
[Fix: Improve error message for missing title in JSON schema
functions](https://github.com/Bhavesh007Sharma/langchain/tree/fix-json-schema-title-error)
**File modified:** `libs/core/langchain_core/utils/function_calling.py`

I updated the `convert_to_openai_function` validation logic to specifically
check for `dict` inputs that look like schemas (`type` or `properties` keys
present) but are missing the `title` key.
```python
# Before (Generic Error)
raise ValueError(
    f"Unsupported function\n\n{function}\n\nFunctions must be passed in"
    " as Dict, pydantic.BaseModel, or Callable. If they're a dict they must"
    " either be in OpenAI function format or valid JSON schema with top-level"
    " 'title' and 'description' keys."
)

# After (Specific Error)
if isinstance(function, dict) and ("type" in function or "properties" in function):
    msg = (
        "Unsupported function\n\nTo use a JSON schema as a function, "
        "it must have a top-level 'title' key to be used as the function name."
    )
    raise ValueError(msg)
```
### Verification Results

**Automated tests:** I created a reproduction script, `reproduce_issue.py`,
to confirm the behavior.
- Before the fix: the script would have raised the generic "Unsupported
function" error, claiming `description` was also required.
- After the fix: the script now confirms that the new, specific error
message is raised when `title` is missing.

(Note: verification was performed by inspecting the code logic and running
a lightweight reproduction script locally, as full-suite verification had
environment dependency issues.)
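A hedged sketch of what such a reproduction could look like; the original `reproduce_issue.py` is not part of the diff, and the schema below is illustrative:

```python
import pytest

from langchain_core.utils.function_calling import convert_to_openai_function

# Looks like a JSON schema (has "type"/"properties") but lacks a "title" key.
schema_missing_title = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
}

with pytest.raises(ValueError, match="top-level 'title' key"):
    convert_to_openai_function(schema_missing_title)
```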
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
**Description:**
*Closes [#33883](https://github.com/langchain-ai/langchain/issues/33883)*
Chat model cache keys are generated by serializing messages via
`dumps(messages)`. The optional `BaseMessage.id` field (a UUID used
solely for tracing/threading) is included in this serialization, causing
functionally identical messages to produce different cache keys. This
results in repeated API calls, cache bloat, and degraded performance in
production workloads (e.g., agents, RAG chains, long conversations).
This change normalizes messages **only for cache key generation** by
stripping the nonsemantic `id` field using Pydantic V2’s
`model_copy(update={"id": None})`. The normalization is applied in both
synchronous and asynchronous cache paths (`_generate_with_cache` /
`_agenerate_with_cache`) immediately before `dumps()`.
```python
normalized_messages = [
    msg.model_copy(update={"id": None})
    if getattr(msg, "id", None) is not None
    else msg
    for msg in messages
]
prompt = dumps(normalized_messages)
```
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
### Description
`ChatPromptTemplate.from_messages` supports multiple tuple formats for
defining message templates. One documented format is `(message class,
template)`, which allows users to specify the message type using the
class directly:
```python
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.prompts import ChatPromptTemplate

ChatPromptTemplate.from_messages([
    (SystemMessage, "You are a helpful assistant named {name}."),
    (HumanMessage, "{input}"),
])
```
However, this syntax was broken. Passing a tuple like `(HumanMessage,
"{input}")` would raise a Pydantic validation error because the
conversion logic in `_convert_to_message_template` didn't handle
`BaseMessage` subclasses—it only recognized string-based role
identifiers like `"human"` or `"system"`.
This PR adds the missing branch to detect when the first element of a
tuple is a message class (by checking for the `type` class attribute)
and routes it through `_create_template_from_message_type`, which
already knows how to create the appropriate `MessagePromptTemplate` for
each message type.
### Changes
- Updated `_convert_to_message_template` to properly support `(message
class, template)` tuples
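A usage example of the now-supported syntax (values are illustrative):

```python
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    (SystemMessage, "You are a helpful assistant named {name}."),
    (HumanMessage, "{input}"),
])

messages = prompt.invoke({"name": "Ada", "input": "What is LangChain?"}).to_messages()
# messages[0] is a SystemMessage and messages[1] a HumanMessage,
# with the template variables substituted.
```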
### Testing
Added 16 comprehensive unit tests covering:
- Basic usage with `HumanMessage`, `AIMessage`, and `SystemMessage`
classes
- Integration with `invoke()` method
- Mixed syntax (message class tuples alongside string tuples)
- Multiple template variables
- Edge cases: empty templates, static text (no variables)
- Correct extraction of `input_variables`
- Partial variables support
- Combination with `MessagesPlaceholder`
- Mustache template format
- Template operations: `append()`, `extend()`, concatenation, and
slicing
- Special characters and unicode in templates
### Issue
Fixes #33791
### Dependencies
None
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>