langchain

mirror of https://github.com/hwchase17/langchain.git synced 2026-03-18 02:53:16 +00:00

Author	SHA1	Message	Date
ccurme	725d204b95	fix(langchain): tag messages generated from summarization (#34693)	2026-01-12 09:26:09 -05:00
Shreyansh Singh Gautam	2ef23882d2	fix(core): add `tool_call_id` to `on_tool_error` event data (#33731) # Add `tool_call_id` to `on_tool_error` event data ## Summary This PR addresses issue #33597 by adding `tool_call_id` to the `on_tool_error` callback event data. This enables users to link tool errors to specific tool calls in stateless agent implementations, which is essential for building OpenAI-compatible APIs and tracking tool execution flows. ## Problem When streaming events using `astream_events` with `version="v2"`, the `on_tool_error` event only included the error and input data, but lacked the `tool_call_id`. This made it difficult to: - Link errors to specific tool calls in stateless agent scenarios - Implement OpenAI-compatible APIs that require tool call tracking - Track tool execution flows when using `run_id` is not sufficient ## Solution The fix adds `tool_call_id` propagation through the callback chain: 1. Pass `tool_call_id` to callbacks: Updated `BaseTool.run()` and `BaseTool.arun()` to pass `tool_call_id` to both `on_tool_start` and `on_tool_error` callbacks 2. Store in event stream handler: Modified `_AstreamEventsCallbackHandler` to store `tool_call_id` in run info during `on_tool_start` 3. Include in error events: Updated `on_tool_error` handler to extract and include `tool_call_id` in the event data ## Changes - `libs/core/langchain_core/tools/base.py`: - Pass `tool_call_id` to `on_tool_start` in both sync and async methods - Pass `tool_call_id` to `on_tool_error` when errors occur - `libs/core/langchain_core/tracers/event_stream.py`: - Store `tool_call_id` in run info during `on_tool_start` - Extract `tool_call_id` from kwargs or run info in `on_tool_error` - Include `tool_call_id` in the `on_tool_error` event data ## Testing The fix was verified by: 1. Direct tool invocation: Confirmed `tool_call_id` appears in `on_tool_error` event data when calling tools directly 2. Agent integration: Tested with `create_agent` to ensure `tool_call_id` is present in error events during agent execution ```python # Example verification async for event in agent.astream_events( {"messages": "Please demonstrate a tool error"}, version="v2", ): if event["event"] == "on_tool_error": assert "tool_call_id" in event["data"] # ✓ Now passes print(event["data"]["tool_call_id"]) ``` ## Backward Compatibility - ✅ Fully backward compatible: `tool_call_id` is optional (can be `None`) - ✅ No breaking changes: All changes are additive - ✅ Existing code continues to work without modification ## Related Issues Fixes #33597 --------- Co-authored-by: Mason Daugherty <github@mdrxy.com>	2026-01-10 02:35:13 -05:00
Bhavesh Sharma	e261924030	fix(core): improve error message for missing title in JSON schema functions (#34683) Changes: Fixed the issue where a generic and misleading error message was displayed when a JSON schema was missing the top-level `title` key. [Fix: Improve error message for missing title in JSON schema functions](https://github.com/Bhavesh007Sharma/langchain/tree/fix-json-schema-title-error) File modified: `libs/core/langchain_core/utils/function_calling.py`. I updated the `convert_to_openai_function` validation logic to specifically check for dict inputs that look like schemas (`type` or `properties` keys present) but are missing the `title` key. # Before (Generic Error) raise ValueError( f"Unsupported function\n\n{function}\n\nFunctions must be passed in" " as Dict, pydantic.BaseModel, or Callable. If they're a dict they must" " either be in OpenAI function format or valid JSON schema with top-level" " 'title' and 'description' keys." ) # After (Specific Error) if isinstance(function, dict) and ("type" in function or "properties" in function): msg = ( "Unsupported function\n\nTo use a JSON schema as a function, " "it must have a top-level 'title' key to be used as the function name." ) raise ValueError(msg) Verification results: I created a reproduction script, `reproduce_issue.py`, to confirm the behavior. Before the fix, the script would have raised the generic "Unsupported function" error claiming `description` was also required. After the fix, the script confirms that the new, specific error message is raised when `title` is missing. (Note: Verification was performed by inspecting the code logic and running a lightweight reproduction script locally, as full suite verification had environment dependency issues.) --------- Co-authored-by: Mason Daugherty <github@mdrxy.com>	2026-01-09 23:10:09 -05:00
Krud-x	d22cfaf7c6	fix(core): make `yield_keys` prefix keyword-only to match `BaseStore` (#34659) This PR fixes a signature mismatch between `BaseStore` and its concrete implementations by making the `prefix` parameter keyword-only in `yield_keys` and `ayield_keys`. This aligns the implementations with the `BaseStore` interface contract, prevents Liskov Substitution Principle violations, and ensures consistent method signatures across store backends. Fixes #32637. Breaking changes: None. This change only enforces the existing abstract interface and does not modify runtime behavior. Testing: Verified that existing test suites pass after the signature fix. Parts of this contribution were assisted by generative AI for code navigation and drafting. All final design decisions and changes were reviewed and validated manually. --------- Co-authored-by: Khagesh-Anayasmi <khagesh.desai@anayasmi.in> Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 23:07:47 -05:00
Mason Daugherty	3bd8c0c4a3	fix(standard-tests): add type ignore (#34696) Regression introduced in `8e3c6b109f`. The commit changed the return annotation of `with_structured_output` from `typing.Dict | BaseModel` to `builtins.dict[str, Any] | BaseModel`. Since `BaseModel` refers to `pydantic.BaseModel` (v2), but the test `test_structured_output_pydantic_2_v1` uses `pydantic.v1.BaseModel`, mypy's `warn_unreachable` setting flags the `isinstance` checks as unreachable (since a class can't be both a `dict` and a different `BaseModel` type). Switching to `builtins.dict[str, Any]` made the type more precise, which exposed this type incompatibility that was always latent but hidden by the looser `typing.Dict` annotation.	2026-01-09 23:07:05 -05:00
Christophe Bornet	a7b943bbe3	fix(langchain): activate `test_return_direct_spec` tests, fix types (#34565) Co-authored-by: Mason Daugherty <mason@langchain.dev> Co-authored-by: Mason Daugherty <github@mdrxy.com>	2026-01-09 22:52:12 -05:00
Christophe Bornet	5fbf270c9d	chore(langchain): fix types in `test_todo`, `test_tool_retry` (#34503) Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 22:50:20 -05:00
Christophe Bornet	e73b027686	chore(langchain): fix types in `test_shell_tool` (#34502) Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 22:46:56 -05:00
Christophe Bornet	ecd19ff71f	chore(langchain): activate `mypy warn_return_any` rule (#34549) Co-authored-by: Mason Daugherty <github@mdrxy.com> Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 22:46:25 -05:00
Christophe Bornet	cb0d227d8a	chore(langchain): fix types in `test_tool_selection` and `test_tool_emulator` (#34499) Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 22:37:54 -05:00
Christophe Bornet	b688e36e38	chore(langchain): fix types in `test_shell_execution_policies` (#34498) Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 22:31:53 -05:00
Christophe Bornet	606ef38e74	chore(langchain): improve `ignore_missing_imports` config (#34551) Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 22:18:45 -05:00
Christophe Bornet	36e590ca5f	test(langchain): complete and activate `test_responses` tests (#34560) Co-authored-by: Mason Daugherty <github@mdrxy.com> Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 22:17:03 -05:00
Christophe Bornet	fc417aaf17	fix(langchain): activate mypy warn-unreachable (#34553) Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 22:11:16 -05:00
Christophe Bornet	5dc8ba3c99	chore(langchain): fix types in `test_injected_runtime_create_agent`, `test_create_agent_tool_validation` (#34568)	2026-01-09 21:50:18 -05:00
Christophe Bornet	f1ab8c5c80	chore(langchain): fix types in `test_response_format` and `test_state_schema` (#34571)	2026-01-09 21:49:16 -05:00
Christophe Bornet	bfe0a26547	chore(langchain): remove generic from `FakeToolCallingModel` (#34572) * Making `FakeToolCallingModel` generic on its `structured_response` doesn't help anywhere in typing. * There are more than 120 references to `FakeToolCallingModel` in the code where you get `error: Need type annotation for "model" [var-annotated]` because mypy can't resolve the generic type (we don't see them at the moment because they are in files temporarily excluded from mypy checking). We would need to explicitly type them as `FakeToolCallingModel[Any]`. Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 21:48:33 -05:00
Christophe Bornet	bb5bd1181f	chore(langchain): fix types in `test_context_editing`, `test_agent_name`, `test_response_format_integration` (#34574) Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 21:47:46 -05:00
Mason Daugherty	9093c6effe	chore(core): bump lock (#34695)	2026-01-09 21:42:41 -05:00
Christophe Bornet	8cb7dbd37b	chore(core): improve types for `RunnableLambda` (#34539) Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 21:42:27 -05:00
Christophe Bornet	2a2a4067ca	chore(core): improve types for `StreamingRunnable` (#34540)	2026-01-09 21:34:50 -05:00
Christophe Bornet	5e9765d811	chore(langchain): fix types in `test_overrides` (#34635) Co-authored-by: Mason Daugherty <github@mdrxy.com> Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 18:31:13 -05:00
Mason Daugherty	703736a1e3	feat(langchain): add `state` to `_ModelRequestOverrides` (#34692) It appears that `override()`'s docstring in `langgraph` already shows `state=new_state` as a valid usage pattern. This works since `dataclasses.replace()` accepts any field, but the `TypedDicts` weren't updated to match, which caused mypy to flag legitimate usage as an error.	2026-01-09 18:28:24 -05:00
Christophe Bornet	61fd703e5f	chore(langchain): fix types in `test_tools` (#34592) Co-authored-by: Mason Daugherty <mason@langchain.dev> Co-authored-by: Mason Daugherty <github@mdrxy.com>	2026-01-09 18:05:28 -05:00
Christophe Bornet	4e40c2766a	chore(langchain): fix types in `test_summarization` (#34656) Co-authored-by: Mason Daugherty <mason@langchain.dev> Co-authored-by: Mason Daugherty <github@mdrxy.com>	2026-01-09 17:54:42 -05:00
Christophe Bornet	9ce73a73f8	test(langchain): activate `test_responses_spec` tests (#34564) Description by @mdrxy: - Enable `test_responses_spec.py` integration tests that were previously skipped at module level - Widen `ToolStrategy.schema` type annotation from `type[SchemaT]` to `type[SchemaT] | dict[str, Any]` to match actual supported usage (JSON schema dicts were already handled at runtime) - Fix type annotations and linting issues in test file (modernize to `dict`/`list`, add return types, prefix unused `_request` param) - Improve generic typing in `load_spec` utility with bounded `TypeVar` Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 17:44:33 -05:00
Christophe Bornet	b4cd67ac15	style(langchain): fix some ruff preview rules (#34663) Co-authored-by: Mason Daugherty <github@mdrxy.com> Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 17:41:05 -05:00
Christophe Bornet	8e3c6b109f	style(core): fix some noqa escapes (#34675) Co-authored-by: Mason Daugherty <github@mdrxy.com> Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 17:36:08 -05:00
Christophe Bornet	fd69425439	style(text-splitters): fix some ruff preview rules (#34665) Co-authored-by: Mason Daugherty <mason@langchain.dev> Co-authored-by: Mason Daugherty <github@mdrxy.com>	2026-01-09 17:28:18 -05:00
Christophe Bornet	e6dde3267a	chore(langchain): fix types in `test_framework` (#34567) Co-authored-by: Mason Daugherty <mason@langchain.dev> Co-authored-by: Mason Daugherty <github@mdrxy.com>	2026-01-09 17:24:38 -05:00
Christophe Bornet	23c4c506d3	chore(langchain): fix types in `memory_assert`, `conftest`, `conftest_checkpointer` and `conftest_store` (#34636) Co-authored-by: Mason Daugherty <mason@langchain.dev> Co-authored-by: Mason Daugherty <github@mdrxy.com>	2026-01-09 17:18:05 -05:00
Christophe Bornet	d1404e63bb	chore(langchain): fix types in `test_system_message` (#34634) Co-authored-by: Mason Daugherty <github@mdrxy.com> Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-09 17:17:57 -05:00
Mason Daugherty	18c25e9f10	chore: ban relative imports on all packages (#34691)	2026-01-09 17:02:24 -05:00
Christophe Bornet	8e824d9ec4	style: bump ruff version to 0.14.11 (#34674) With ruff 0.14.11+, we can remove `PLW1510` from `unfixable` (see https://github.com/astral-sh/ruff/issues/17091)	2026-01-09 16:30:24 -05:00
Sydney Runkle	fbe9babb34	fix: remove relative imports (#34680) Standardizing on absolute imports rather than relative imports across the codebase.	2026-01-09 13:00:51 -05:00
Mason Daugherty	2e8744559d	fix(langchain,langchain-classic): more descriptive error msg when dep is not installed (#34679)	2026-01-09 12:41:55 -05:00
ccurme	19edaa8acb	chore(openai): delete outdated test (#34682)	2026-01-09 12:37:44 -05:00
Guofang.Tang	384158daec	fix(langchain): infer provider from mixed-case prefixes (#34672) Fix provider inference for mixed-case model prefixes and add matching unit coverage.	2026-01-09 11:07:14 -05:00
Sydney Runkle	c080296bed	release: langchain-core 1.2.7 (#34678)	2026-01-09 16:02:38 +00:00
Sydney Runkle	323c76504a	fix: add test confirming we don't inject args based on `args_schema` alone (#34677) Pending exclusion from function signature.	2026-01-09 11:00:13 -05:00
Sydney Runkle	ed2aa9f747	fix: don't trace injected args only found in signature (#34670) Covers the case when they're not included in the `args_schema`. This was predicted by @eyurtsev's comment here: https://github.com/langchain-ai/langchain/pull/33729/files#r2475538173 Pairs with this PR in mcp adapters: https://github.com/langchain-ai/langchain-mcp-adapters/pull/407	2026-01-09 09:58:34 -05:00
Mason Daugherty	76da99e022	release(langchain): 1.2.3 (#34668)	2026-01-08 15:24:32 -05:00
Aman Gupta	2847814c70	feat(core): add more file extensions to ignore in HTML link extraction (#34552) ## Description This PR enhances the HTML link extraction utility in `libs/core/langchain_core/utils/html.py` by expanding the `SUFFIXES_TO_IGNORE` list to include additional common binary file extensions: - `.webp` - `.pdf` - `.docx` - `.xlsx` - `.pptx` - `.pptm` These file types are non-HTML, non-crawlable resources. Ignoring them prevents `find_all_links` and `extract_sub_links` from mistakenly treating such binary assets as navigable links. This improves link filtering, reduces unnecessary crawling, and aligns behavior with typical web scraping expectations. ## Summary of Changes - Updated `libs/core/langchain_core/utils/html.py`: Added `.webp`, `.pdf`, `.docx`, `.xlsx`, `.pptx`, `.pptm` to `SUFFIXES_TO_IGNORE`. ## Related Issues N/A ## Verification - `ruff check libs/core/langchain_core/utils/html.py`: Passed - `mypy libs/core/langchain_core/utils/html.py`: Passed - `pytest libs/core/tests/unit_tests/utils/test_html.py`: Passed (11 tests) --------- Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-01-08 14:40:22 -05:00
ccurme	d383f00489	refactor(langchain): engage summarization based on reported `usage_metadata` (#34632)	2026-01-08 11:12:00 -05:00
Aman Gupta	50c5bb5607	refactor(core): improve docstrings for HTML link extraction utilities (#34550) ## Description This PR updates and clarifies the docstrings for `find_all_links` and `extract_sub_links` in `libs/core/langchain_core/utils/html.py`. The previous return-value descriptions were vague (e.g., "all links", "sub links"). They have now been revised to clearly describe the behavior and output of each function: - `find_all_links` → “A list of all links found in the HTML.” - `extract_sub_links` → “A list of absolute paths to sub links.” These improvements make the utilities more understandable and developer-friendly without altering functionality. ## Verification - `ruff check libs/core/langchain_core/utils/html.py`: Passed - `pytest libs/core/tests/unit_tests/utils/test_html.py`: Passed ## Checklists - PR title follows the required format: `TYPE(SCOPE): DESCRIPTION` - Changes are limited to the `langchain-core` package - `make format`, `make lint`, and `make test` pass	2026-01-08 10:21:17 -05:00
Mason Daugherty	2b6911d9af	fix(langchain): keep tool call / `AIMessage` pairings when summarizing (#34609) Fixes #34282. Before: When using agents with tools (like file reading, web search, etc.), the conversation looks like this: ``` [User] "Read these 10 files and summarize them" [AI] "I'll read all 10 files" + [tool_call: read_file x 10] [Tool] "Contents of file1.txt..." [Tool] "Contents of file2.txt..." [Tool] "Contents of file3.txt..." ... (7 more tool responses) ``` When the conversation gets too long, `SummarizationMiddleware` kicks in to compress older messages. The problem was: If you asked to keep the last 6 messages, you'd get: ``` [Summary] "Here's what happened before..." [Tool] "Contents of file5.txt..." [Tool] "Contents of file6.txt..." [Tool] "Contents of file7.txt..." [Tool] "Contents of file8.txt..." [Tool] "Contents of file9.txt..." [Tool] "Contents of file10.txt..." ``` The AI's original request to read the files (`[AI]` message with `tool_calls`) was summarized away, but the tool responses remained. This caused the error: ``` Error code: 400 - "No tool call found for function call output with call_id..." ``` Many APIs require that every tool response has a matching tool request. Without the AI message, the tool responses are "orphaned." ## The fix Now when the cutoff lands on tool messages, we move backward to include the AI message that requested those tools: Same scenario, keeping last 6 messages: ``` [Summary] "Here's what happened before..." [AI] "I'll read all 10 files" + [tool_call: read_file x 10] [Tool] "Contents of file1.txt..." [Tool] "Contents of file2.txt..." ... (all 10 tool responses) ``` The AI message is preserved along with its tool responses, keeping them paired together. ## Practical examples ### Example 1: Parallel tool calls Scenario: Agent reads 10 files in parallel, summarization triggers (see above) ### Example 2: Mixed conversation Scenario: User asks question, AI uses tools, user says thanks ``` [User] "What's the weather?" [AI] "Let me check" + [tool_call: get_weather] [Tool] "72F and sunny" [AI] "It's 72F and sunny!" [User] "Thanks!" ``` Keeping last 2 messages: Before (bug): only `[User] "Thanks!"` kept; lost the weather info. After (fix): `[AI] + [Tool] + [AI] + [User]` all kept; tool pair preserved with response. ### Example 3: Multiple tool sequences ``` [User] "Search for X" [AI] [tool_call: search] [Tool] "Results for X" [User] "Now search for Y" [AI] [tool_call: search] [Tool] "Results for Y" [User] "Great!" ``` Keeping last 3 messages: If cutoff lands on `[Tool] "Results for Y"`, we now include `[AI] [tool_call: search]` to keep the pair together.	2026-01-08 10:07:56 -05:00
Guofang.Tang	f805ea9601	test(langchain): cover chat model provider inference (#34657) Add unit coverage for chat model provider inference across common model name prefixes. This improves regression protection without touching runtime --------- Co-authored-by: Mason Daugherty <github@mdrxy.com>	2026-01-08 09:59:12 -05:00
Stephan Günther	0276cc0290	fix(langchain): fix copy-paste error on azure_openai embedding provider map (#34655) Fixes a bug introduced with commit `85f1ba2` (released in `langchain == 1.2.1`). Whenever the index embedding of the langgraph-server is configured with the `azure_openai` provider, the wrong class is initialized (and initialization fails if the now-unexpected credentials in the environment variable `OPENAI_API_KEY` are not provided). Example configuration file `langgraph.json` that will reproduce the issue: (see https://docs.langchain.com/langsmith/cli#adding-semantic-search-to-the-store) ```json { "dependencies": ["."], "graphs": { "chat": "src/agents/chat/graph.py:graph" }, "store": { "index": { "embed": "azure_openai:text-embedding-3-small", "dims": 1536 } }, "python_version": "3.13", "image_distro": "wolfi" } ```	2026-01-08 09:54:53 -05:00
Eugene Yurtsev	ceca38d3fe	fix(langchain): add test to verify version (#34644) Verify version in langchain to avoid accidental drift.	2026-01-07 22:36:10 +00:00
Eugene Yurtsev	5554a36ad5	release(langchain): release 1.2.2 (#34643) Release langchain 1.2.2	2026-01-07 17:27:58 -05:00
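
The cutoff adjustment described in the `2b6911d9af` entry (move the cutoff backward when it lands on a tool message so the requesting AI message is kept) could be sketched roughly as follows. This is an illustration only, not the code from `SummarizationMiddleware`: the `Msg` class, its fields, and `safe_cutoff` are hypothetical names invented here, and no LangChain APIs are used.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Msg:
    """Hypothetical stand-in for a chat message (not a LangChain class)."""

    role: str  # "user", "ai", or "tool"
    text: str
    tool_call_ids: list[str] = field(default_factory=list)  # set on "ai" messages
    tool_call_id: str | None = None  # set on "tool" messages


def safe_cutoff(messages: list[Msg], keep_last: int) -> int:
    """Return the index of the first message to keep (hypothetical helper)."""
    cutoff = max(len(messages) - keep_last, 0)
    if cutoff >= len(messages) or messages[cutoff].role != "tool":
        return cutoff
    # The cutoff landed on a tool response: walk backward to the AI message
    # that issued the matching tool call, so the pair is not split.
    wanted = messages[cutoff].tool_call_id
    for i in range(cutoff - 1, -1, -1):
        if messages[i].role == "ai" and wanted in messages[i].tool_call_ids:
            return i
    return cutoff  # no matching AI message found; leave the cutoff unchanged


# Scenario from the commit message: one AI message requesting 10 file reads.
call_ids = [f"call_{n}" for n in range(10)]
history = [
    Msg("user", "Read these 10 files and summarize them"),
    Msg("ai", "I'll read all 10 files", tool_call_ids=call_ids),
    *[Msg("tool", f"Contents of file{n + 1}.txt...", tool_call_id=cid)
      for n, cid in enumerate(call_ids)],
]

start = safe_cutoff(history, keep_last=6)
kept = history[start:]
```

With `keep_last=6` the naive cutoff would fall in the middle of the tool responses; the sketch instead starts the kept window at the AI message, so all ten tool responses stay paired with their request.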


8592 Commits
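
As a rough illustration of the keyword-only `prefix` change described in the `d22cfaf7c6` entry, the sketch below shows how a bare `*` in a signature forces `prefix` to be passed by name in both the abstract method and its implementation. `KeyStore` and `DictStore` are hypothetical classes written for this example; they are not the actual `BaseStore` or any LangChain store.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from collections.abc import Iterator


class KeyStore(ABC):
    """Hypothetical abstract store with a keyword-only `prefix` parameter."""

    @abstractmethod
    def yield_keys(self, *, prefix: str | None = None) -> Iterator[str]:
        ...


class DictStore(KeyStore):
    """Hypothetical in-memory implementation matching the abstract signature."""

    def __init__(self, data: dict[str, str]) -> None:
        self._data = dict(data)

    def yield_keys(self, *, prefix: str | None = None) -> Iterator[str]:
        # The bare `*` makes `prefix` keyword-only, matching the base class.
        for key in self._data:
            if prefix is None or key.startswith(prefix):
                yield key


store = DictStore({"user:1": "a", "user:2": "b", "doc:1": "c"})
user_keys = list(store.yield_keys(prefix="user:"))

# Passing the prefix positionally is rejected when the method is called.
try:
    store.yield_keys("user:")
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Keeping the subclass signature identical to the abstract one means callers written against the base interface behave the same way with every backend.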