8594 Commits

Author SHA1 Message Date
Mason Daugherty
1dc2600cd4 docs(langchain): clarify model ID usage for reliable behavior (#34718)
Clarify the preference for using exact model IDs from provider
documentation over aliases to ensure reliable behavior in the face of
upstream backend changes.
2026-01-12 15:10:59 -05:00
David Fernandez
6bcc4a1af1 docs: Fix TODO in Ollama compatibility docstring (#34713)
Replaces a leftover TODO in
`libs/partners/ollama/langchain_ollama/_compat.py` with a proper return
value description.
2026-01-12 12:52:25 -05:00
ccurme
725d204b95 fix(langchain): tag messages generated from summarization (#34693) 2026-01-12 09:26:09 -05:00
Shreyansh Singh Gautam
2ef23882d2 fix(core): add tool_call_id to on_tool_error event data (#33731)
# Add `tool_call_id` to `on_tool_error` event data

## Summary

This PR addresses issue #33597 by adding `tool_call_id` to the
`on_tool_error` callback event data. This enables users to link tool
errors to specific tool calls in stateless agent implementations, which
is essential for building OpenAI-compatible APIs and tracking tool
execution flows.

## Problem

When streaming events using `astream_events` with `version="v2"`, the
`on_tool_error` event only included the error and input data, but lacked
the `tool_call_id`. This made it difficult to:

- Link errors to specific tool calls in stateless agent scenarios
- Implement OpenAI-compatible APIs that require tool call tracking
- Track tool execution flows when using `run_id` is not sufficient

## Solution

The fix adds `tool_call_id` propagation through the callback chain:

1. **Pass `tool_call_id` to callbacks**: Updated `BaseTool.run()` and
`BaseTool.arun()` to pass `tool_call_id` to both `on_tool_start` and
`on_tool_error` callbacks
2. **Store in event stream handler**: Modified
`_AstreamEventsCallbackHandler` to store `tool_call_id` in run info
during `on_tool_start`
3. **Include in error events**: Updated `on_tool_error` handler to
extract and include `tool_call_id` in the event data

## Changes

- **`libs/core/langchain_core/tools/base.py`**:
  - Pass `tool_call_id` to `on_tool_start` in both sync and async methods
  - Pass `tool_call_id` to `on_tool_error` when errors occur

- **`libs/core/langchain_core/tracers/event_stream.py`**:
  - Store `tool_call_id` in run info during `on_tool_start`
  - Extract `tool_call_id` from kwargs or run info in `on_tool_error`
  - Include `tool_call_id` in the `on_tool_error` event data
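
The propagation described above can be sketched roughly as follows. This is a minimal illustration, not the actual `_AstreamEventsCallbackHandler`: the class `ToolErrorEventRecorder`, its attributes, and its simplified method signatures are hypothetical stand-ins chosen only to show the store-on-start, look-up-on-error pattern.

```python
from typing import Any, Optional


class ToolErrorEventRecorder:
    """Hypothetical stand-in for an event-stream callback handler."""

    def __init__(self) -> None:
        # Per-run bookkeeping, keyed by run ID.
        self.run_info: dict[str, dict[str, Any]] = {}
        # Events that would be emitted to the consumer.
        self.events: list[dict[str, Any]] = []

    def on_tool_start(
        self, run_id: str, inputs: Any, *, tool_call_id: Optional[str] = None
    ) -> None:
        # Remember the tool call ID so a later error can be linked back to it.
        self.run_info[run_id] = {"inputs": inputs, "tool_call_id": tool_call_id}

    def on_tool_error(self, run_id: str, error: BaseException, **kwargs: Any) -> None:
        info = self.run_info.pop(run_id, {})
        # Prefer an ID passed directly with the error; fall back to the stored one.
        tool_call_id = kwargs.get("tool_call_id") or info.get("tool_call_id")
        self.events.append(
            {
                "event": "on_tool_error",
                "data": {
                    "error": error,
                    "input": info.get("inputs"),
                    "tool_call_id": tool_call_id,  # may be None
                },
            }
        )
```

Because the ID is stored at start time, the error event can carry it even when the error callback itself is invoked without one.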

## Testing

The fix was verified by:

1. Direct tool invocation: Confirmed `tool_call_id` appears in
`on_tool_error` event data when calling tools directly
2. Agent integration: Tested with `create_agent` to ensure
`tool_call_id` is present in error events during agent execution

```python
# Example verification
async for event in agent.astream_events(
    {"messages": "Please demonstrate a tool error"},
    version="v2",
):
    if event["event"] == "on_tool_error":
        assert "tool_call_id" in event["data"]  # ✓ Now passes
        print(event["data"]["tool_call_id"])
```

## Backward Compatibility

- Fully backward compatible: `tool_call_id` is optional (can be
  `None`)
- No breaking changes: All changes are additive
- Existing code continues to work without modification

## Related Issues

Fixes #33597

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-01-10 02:35:13 -05:00
Bhavesh Sharma
e261924030 fix(core): improve error message for missing title in JSON schema functions (#34683)
Changes Created

I have fixed the issue where a generic and misleading error message was
displayed when a JSON schema was missing the top-level `title` key.

[Fix: Improve error message for missing title in JSON schema
functions](https://github.com/Bhavesh007Sharma/langchain/tree/fix-json-schema-title-error)

File Modified: `libs/core/langchain_core/utils/function_calling.py`

I updated the `convert_to_openai_function` validation logic to
specifically check for `dict` inputs that look like schemas (`type` or
`properties` keys present) but are missing the `title` key.

# Before (Generic Error)
raise ValueError(
    f"Unsupported function\n\n{function}\n\nFunctions must be passed in"
    " as Dict, pydantic.BaseModel, or Callable. If they're a dict they must"
    " either be in OpenAI function format or valid JSON schema with top-level"
    " 'title' and 'description' keys."
)
# After (Specific Error)
if isinstance(function, dict) and ("type" in function or "properties" in function):
    msg = (
        "Unsupported function\n\nTo use a JSON schema as a function, "
        "it must have a top-level 'title' key to be used as the function name."
    )
    raise ValueError(msg)
Verification Results

Automated Tests

I created a reproduction script, `reproduce_issue.py`, to confirm the
behavior.

Before Fix: The script would have raised the generic "Unsupported
function" error claiming `description` was also required.

After Fix: The script now confirms that the new, specific error message
is raised when `title` is missing.

(Note: Verification was performed by inspecting the code logic and
running a lightweight reproduction script locally, as full suite
verification had environment dependency issues.)

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-01-09 23:10:09 -05:00
Krud-x
d22cfaf7c6 fix(core): make yield_keys prefix keyword-only to match BaseStore (#34659)
This PR fixes a signature mismatch between BaseStore and its concrete
implementations by making the `prefix` parameter keyword-only in
`yield_keys` and `ayield_keys`.

This aligns the implementations with the BaseStore interface contract,
prevents Liskov Substitution Principle violations, and ensures
consistent
method signatures across store backends.
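
As a rough illustration of the signature alignment described above, the sketch below uses hypothetical classes (`KeyStore`, `InMemoryKeyStore`) rather than the real `BaseStore` hierarchy. It assumes only that the abstract method declares `prefix` after a bare `*`, which makes it keyword-only.

```python
from abc import ABC, abstractmethod
from collections.abc import Iterator
from typing import Optional


class KeyStore(ABC):
    """Hypothetical abstract store; `prefix` is keyword-only."""

    @abstractmethod
    def yield_keys(self, *, prefix: Optional[str] = None) -> Iterator[str]:
        """Yield stored keys, optionally restricted to a prefix."""


class InMemoryKeyStore(KeyStore):
    """Concrete store whose signature matches the abstract one exactly."""

    def __init__(self, data: dict[str, str]) -> None:
        self._data = dict(data)

    def yield_keys(self, *, prefix: Optional[str] = None) -> Iterator[str]:
        for key in self._data:
            if prefix is None or key.startswith(prefix):
                yield key
```

With matching signatures, a caller written against the abstract class, for example `store.yield_keys(prefix="user:")`, behaves the same on every backend, while a positional call such as `store.yield_keys("user:")` raises `TypeError`.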

Fixes #32637

Breaking changes
None. This change only enforces the existing abstract interface and does
not modify runtime behavior.

Testing
- Verified that existing test suites pass after the signature fix.

Parts of this contribution were assisted by generative AI for
code navigation and drafting. All final design decisions and changes
were
reviewed and validated manually.

---------

Co-authored-by: Khagesh-Anayasmi <khagesh.desai@anayasmi.in>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 23:07:47 -05:00
Mason Daugherty
3bd8c0c4a3 fix(standard-tests): add type ignore (#34696)
Regression introduced in 8e3c6b109f

The commit changed the return annotation of `with_structured_output`
from `typing.Dict | BaseModel` to `builtins.dict[str, Any] | BaseModel`.
Since `BaseModel` refers to `pydantic.BaseModel` (v2), but the test
`test_structured_output_pydantic_2_v1` uses `pydantic.v1.BaseModel`,
mypy's `warn_unreachable` setting flags the `isinstance` checks as
unreachable (since a class can't be both a `dict` and a different
`BaseModel` type).

Switching to `builtins.dict[str, Any]` made the type more precise, which
exposed this type incompatibility that was always latent but hidden by
the looser `typing.Dict` annotation.
2026-01-09 23:07:05 -05:00
Christophe Bornet
a7b943bbe3 fix(langchain): activate test_return_direct_spec tests, fix types (#34565)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-01-09 22:52:12 -05:00
Christophe Bornet
5fbf270c9d chore(langchain): fix types in test_todo, test_tool_retry (#34503)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 22:50:20 -05:00
Christophe Bornet
e73b027686 chore(langchain): fix types in test_shell_tool (#34502)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 22:46:56 -05:00
Christophe Bornet
ecd19ff71f chore(langchain): activate mypy warn_return_any rule (#34549)
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 22:46:25 -05:00
Christophe Bornet
cb0d227d8a chore(langchain): fix types in test_tool_selection and test_tool_emulator (#34499)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 22:37:54 -05:00
Christophe Bornet
b688e36e38 chore(langchain): fix types in test_shell_execution_policies (#34498)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 22:31:53 -05:00
Christophe Bornet
606ef38e74 chore(langchain): improve ignore_missing_imports config (#34551)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 22:18:45 -05:00
Christophe Bornet
36e590ca5f test(langchain): complete and activate test_responses tests (#34560)
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 22:17:03 -05:00
Christophe Bornet
fc417aaf17 fix(langchain): activate mypy warn-unreachable (#34553)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 22:11:16 -05:00
Christophe Bornet
5dc8ba3c99 chore(langchain): fix types in test_injected_runtime_create_agent, test_create_agent_tool_validation (#34568) 2026-01-09 21:50:18 -05:00
Christophe Bornet
f1ab8c5c80 chore(langchain): fix types in test_response_format and test_state_schema (#34571) 2026-01-09 21:49:16 -05:00
Christophe Bornet
bfe0a26547 chore(langchain): remove generic from FakeToolCallingModel (#34572)
* Making `FakeToolCallingModel` generic on its `structured_response`
doesn't help anywhere in typing.
* There are more than 120 references to `FakeToolCallingModel` in the
code where you get `error: Need type annotation for "model"
[var-annotated]` because mypy can't resolve the generic type (we don't
see them at the moment because they are in files temporarily excluded
from mypy checking). We would need to explicitly type them as
`FakeToolCallingModel[Any]`.

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 21:48:33 -05:00
Christophe Bornet
bb5bd1181f chore(langchain): fix types in test_context_editing, test_agent_name, test_response_format_integration (#34574)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 21:47:46 -05:00
Mason Daugherty
9093c6effe chore(core): bump lock (#34695) 2026-01-09 21:42:41 -05:00
Christophe Bornet
8cb7dbd37b chore(core): improve types for RunnableLambda (#34539)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 21:42:27 -05:00
Christophe Bornet
2a2a4067ca chore(core): improve types for StreamingRunnable (#34540) 2026-01-09 21:34:50 -05:00
Christophe Bornet
5e9765d811 chore(langchain): fix types in test_overrides (#34635)
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 18:31:13 -05:00
Mason Daugherty
703736a1e3 feat(langchain): add state to _ModelRequestOverrides (#34692)
The `override()` docstring in `langgraph` already appears to show
`state=new_state` as a valid usage pattern.

Works since `dataclasses.replace()` accepts any field, but the
`TypedDicts` weren't updated to match. Caused mypy to flag legitimate
usage as an error.
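
A minimal sketch of the mismatch, under the assumption that the keyword arguments of `override()` are typed through a `TypedDict`. The names `Request` and `RequestOverrides` are hypothetical, not the actual langchain classes. At runtime `dataclasses.replace()` accepts any dataclass field, so only the static type needs the extra key.

```python
from dataclasses import dataclass, field, replace
from typing import TYPE_CHECKING, Any, TypedDict

if TYPE_CHECKING:
    # Only needed by the type checker; assumed to be available there.
    from typing_extensions import Unpack


class RequestOverrides(TypedDict, total=False):
    """Hypothetical set of keyword arguments accepted by `override()`."""

    model: str
    # Without this key, a type checker would reject `override(state=...)`
    # even though the call works at runtime.
    state: dict[str, Any]


@dataclass
class Request:
    """Hypothetical request object."""

    model: str
    state: dict[str, Any] = field(default_factory=dict)

    def override(self, **overrides: "Unpack[RequestOverrides]") -> "Request":
        # replace() copies the instance, swapping in any given field values.
        return replace(self, **overrides)
```

Calling `Request(model="a").override(state={"k": 1})` returns a new instance and leaves the original unchanged.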
2026-01-09 18:28:24 -05:00
Christophe Bornet
61fd703e5f chore(langchain): fix types in test_tools (#34592)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-01-09 18:05:28 -05:00
Christophe Bornet
4e40c2766a chore(langchain): fix types in test_summarization (#34656)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-01-09 17:54:42 -05:00
Christophe Bornet
9ce73a73f8 test(langchain): activate test_responses_spec tests (#34564)
description by @mdrxy

- Enable `test_responses_spec.py` integration tests that were previously
skipped at module level
- Widen `ToolStrategy.schema` type annotation from `type[SchemaT]` to
`type[SchemaT] | dict[str, Any]` to match actual supported usage (JSON
schema dicts were already handled at runtime)
- Fix type annotations and linting issues in test file (modernize to
`dict`/`list`, add return types, prefix unused `_request` param)
- Improve generic typing in `load_spec` utility with bounded `TypeVar`

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 17:44:33 -05:00
Christophe Bornet
b4cd67ac15 style(langchain): fix some ruff preview rules (#34663)
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 17:41:05 -05:00
Christophe Bornet
8e3c6b109f style(core): fix some noqa escapes (#34675)
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 17:36:08 -05:00
Christophe Bornet
fd69425439 style(text-splitters): fix some ruff preview rules (#34665)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-01-09 17:28:18 -05:00
Christophe Bornet
e6dde3267a chore(langchain): fix types in test_framework (#34567)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-01-09 17:24:38 -05:00
Christophe Bornet
23c4c506d3 chore(langchain): fix types in memory_assert, conftest, conftest_checkpointer and conftest_store (#34636)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-01-09 17:18:05 -05:00
Christophe Bornet
d1404e63bb chore(langchain): fix types in test_system_message (#34634)
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 17:17:57 -05:00
Mason Daugherty
18c25e9f10 chore: ban relative imports on all packages (#34691) 2026-01-09 17:02:24 -05:00
Christophe Bornet
8e824d9ec4 style: bump ruff version to 0.14.11 (#34674)
With ruff 0.14.11+, we can remove `PLW1510` from `unfixable` (see
https://github.com/astral-sh/ruff/issues/17091)
2026-01-09 16:30:24 -05:00
Sydney Runkle
fbe9babb34 fix: remove relative imports (#34680)
standardizing on absolute imports rather than relative across the
codebase
2026-01-09 13:00:51 -05:00
Mason Daugherty
2e8744559d fix(langchain,langchain-classic): more descriptive error msg when dep is not installed (#34679) 2026-01-09 12:41:55 -05:00
ccurme
19edaa8acb chore(openai): delete outdated test (#34682) 2026-01-09 12:37:44 -05:00
Guofang.Tang
384158daec fix(langchain): infer provider from mixed-case prefixes (#34672)
Fix provider inference for mixed-case model prefixes and add matching
unit coverage.
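
One way to make prefix-based inference case-insensitive is to normalize the model name before matching. The sketch below is illustrative only: `infer_provider` and the prefix table are hypothetical and do not reproduce the actual langchain lookup.

```python
from typing import Optional

# Hypothetical prefix table, for illustration only.
_PREFIX_TO_PROVIDER = {
    "gpt-": "openai",
    "claude": "anthropic",
    "gemini": "google_genai",
}


def infer_provider(model_name: str) -> Optional[str]:
    """Return a provider for a model name, ignoring letter case."""
    normalized = model_name.lower()
    for prefix, provider in _PREFIX_TO_PROVIDER.items():
        if normalized.startswith(prefix):
            return provider
    return None
```

Under these assumptions, `infer_provider("GPT-4o")` and `infer_provider("gpt-4o")` resolve to the same provider.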
2026-01-09 11:07:14 -05:00
Sydney Runkle
c080296bed release: langchain-core 1.2.7 (#34678) 2026-01-09 16:02:38 +00:00
Sydney Runkle
323c76504a fix: add test confirming we don't inject args based on args_schema alone (#34677)
pending exclusion from function signature
2026-01-09 11:00:13 -05:00
Sydney Runkle
ed2aa9f747 fix: don't trace injected args only found in signature (#34670)
for the case when they're not included in the `args_schema`

this was predicted by @eyurtsev's comment here:
https://github.com/langchain-ai/langchain/pull/33729/files#r2475538173

pairing w/ this PR in mcp adapters:
https://github.com/langchain-ai/langchain-mcp-adapters/pull/407
2026-01-09 09:58:34 -05:00
Mason Daugherty
76da99e022 release(langchain): 1.2.3 (#34668) 2026-01-08 15:24:32 -05:00
Aman Gupta
2847814c70 feat(core): add more file extensions to ignore in HTML link extraction (#34552)
# feat(core): add more file extensions to ignore in HTML link extraction

## Description
This PR enhances the HTML link extraction utility in  
`libs/core/langchain_core/utils/html.py` by expanding the
`SUFFIXES_TO_IGNORE` list to include additional common binary file
extensions:

- `.webp`
- `.pdf`
- `.docx`
- `.xlsx`
- `.pptx`
- `.pptm`

These file types are non-HTML, non-crawlable resources. Ignoring them
prevents `find_all_links` and `extract_sub_links` from mistakenly
treating such binary assets as navigable links. This improves link
filtering, reduces unnecessary crawling, and aligns behavior with
typical web scraping expectations.

## Summary of Changes
- **Updated** `libs/core/langchain_core/utils/html.py`: Added `.webp`,
`.pdf`, `.docx`, `.xlsx`, `.pptx`, `.pptm` to `SUFFIXES_TO_IGNORE`.
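
The effect of the added suffixes can be sketched with a small stand-alone filter. The helper `drop_binary_links` and the tuple `NEW_SUFFIXES` are hypothetical; the real utilities in `html.py` may match suffixes differently.

```python
from urllib.parse import urlparse

# The extensions added in this PR; the existing entries are omitted here.
NEW_SUFFIXES = (".webp", ".pdf", ".docx", ".xlsx", ".pptx", ".pptm")


def drop_binary_links(links: list[str]) -> list[str]:
    """Keep only links whose URL path does not end in an ignored suffix."""
    kept = []
    for link in links:
        # Compare against the path only, so query strings do not hide the suffix.
        path = urlparse(link).path.lower()
        if not path.endswith(NEW_SUFFIXES):
            kept.append(link)
    return kept
```

In this sketch, a link such as `https://example.com/report.PDF?dl=1` is dropped, while `https://example.com/about` is kept.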

## Related Issues
N/A

## Verification
- `ruff check libs/core/langchain_core/utils/html.py`: **Passed**  
- `mypy libs/core/langchain_core/utils/html.py`: **Passed**  
- `pytest libs/core/tests/unit_tests/utils/test_html.py`: **Passed** (11
tests)

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-08 14:40:22 -05:00
ccurme
d383f00489 refactor(langchain): engage summarization based on reported usage_metadata (#34632) 2026-01-08 11:12:00 -05:00
Aman Gupta
50c5bb5607 refactor(core): improve docstrings for HTML link extraction utilities (#34550)
# refactor(core): improve docstrings for HTML link extraction utilities

## Description
This PR updates and clarifies the docstrings for `find_all_links` and
`extract_sub_links` in
`libs/core/langchain_core/utils/html.py`.

The previous return-value descriptions were vague (e.g., "all links",
"sub links"). They have now been revised to clearly describe the
behavior and output of each function:

- **find_all_links** → “A list of all links found in the HTML.”
- **extract_sub_links** → “A list of absolute paths to sub links.”

These improvements make the utilities more understandable and
developer-friendly without altering functionality.

## Verification
- `ruff check libs/core/langchain_core/utils/html.py`: **Passed**  
- `pytest libs/core/tests/unit_tests/utils/test_html.py`: **Passed**

## Checklists
- PR title follows the required format: `TYPE(SCOPE): DESCRIPTION`  
- Changes are limited to the `langchain-core` package  
- `make format`, `make lint`, and `make test` pass
2026-01-08 10:21:17 -05:00
Mason Daugherty
2b6911d9af fix(langchain): keep tool call / AIMessage pairings when summarizing (#34609)
Fixes #34282

**Before:** When using agents with tools (like file reading, web search,
etc.), the conversation looks like this:

```
[User]     "Read these 10 files and summarize them"
[AI]       "I'll read all 10 files" + [tool_call: read_file x 10]
[Tool]     "Contents of file1.txt..."
[Tool]     "Contents of file2.txt..."
[Tool]     "Contents of file3.txt..."
... (7 more tool responses)
```

When the conversation gets too long, `SummarizationMiddleware` kicks in
to compress older messages. The problem was:

If you asked to keep the last 6 messages, you'd get:

```
[Summary]  "Here's what happened before..."
[Tool]     "Contents of file5.txt..."
[Tool]     "Contents of file6.txt..."
[Tool]     "Contents of file7.txt..."
[Tool]     "Contents of file8.txt..."
[Tool]     "Contents of file9.txt..."
[Tool]     "Contents of file10.txt..."
```

The AI's original request to read the files (`[AI]` message with
`tool_calls`) was summarized away, but the tool responses remained. This
caused the error:

```
Error code: 400 - "No tool call found for function call output with call_id..."
```

Many APIs require that every tool response has a matching tool request.
Without the AI message, the tool responses are "orphaned."

## The fix

Now when the cutoff lands on tool messages, we **move backward** to
include the AI message that requested those tools:

Same scenario, keeping last 6 messages:

```
[Summary]  "Here's what happened before..."
[AI]       "I'll read all 10 files" + [tool_call: read_file x 10]
[Tool]     "Contents of file1.txt..."
[Tool]     "Contents of file2.txt..."
... (all 10 tool responses)
```

The AI message is preserved along with its tool responses, keeping them
paired together.
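
The backward move can be sketched as below. This is a simplified illustration, not the `SummarizationMiddleware` code: `Msg` and `adjust_cutoff` are hypothetical, and messages are reduced to a role and some text.

```python
from dataclasses import dataclass


@dataclass
class Msg:
    """Hypothetical simplified message."""

    role: str  # "user", "ai", or "tool"
    text: str


def adjust_cutoff(messages: list[Msg], cutoff: int) -> int:
    """Move the cutoff backward while it points at a tool message.

    Messages at index >= cutoff are kept; earlier ones are summarized.
    """
    while 0 < cutoff < len(messages) and messages[cutoff].role == "tool":
        cutoff -= 1
    return cutoff


# The scenario above: one request, one AI message with 10 tool calls,
# then 10 tool responses.
history = [Msg("user", "Read these 10 files"), Msg("ai", "I'll read all 10 files")]
history += [Msg("tool", f"Contents of file{i}.txt...") for i in range(1, 11)]

naive_cutoff = len(history) - 6  # keep the last 6 messages
cutoff = adjust_cutoff(history, naive_cutoff)
kept = history[cutoff:]
```

Here the naive cutoff lands on a tool response, so the adjusted cutoff walks back to the AI message, and `kept` holds that message followed by all 10 tool responses.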

## Practical examples

### Example 1: Parallel tool calls

**Scenario:** Agent reads 10 files in parallel, summarization triggers
(see above)

### Example 2: Mixed conversation

**Scenario:** User asks question, AI uses tools, user says thanks

```
[User]     "What's the weather?"
[AI]       "Let me check" + [tool_call: get_weather]
[Tool]     "72F and sunny"
[AI]       "It's 72F and sunny!"
[User]     "Thanks!"
```

Keeping last 2 messages:

| Before (Bug) | After (Fix) |
|--------------|-------------|
| Only `[User] "Thanks!"` kept | `[AI] + [Tool] + [AI] + [User]` all kept |
| Lost the weather info | Tool pair preserved with response |

### Example 3: Multiple tool sequences

```
[User]     "Search for X"
[AI]       [tool_call: search]
[Tool]     "Results for X"
[User]     "Now search for Y"
[AI]       [tool_call: search]
[Tool]     "Results for Y"
[User]     "Great!"
```

**Keeping last 3 messages:** If cutoff lands on `[Tool] "Results for
Y"`, we now include `[AI] [tool_call: search]` to keep the pair
together.
2026-01-08 10:07:56 -05:00
Guofang.Tang
f805ea9601 test(langchain): cover chat model provider inference (#34657)
Add unit coverage for chat model provider inference across common model
name prefixes. This improves regression protection without touching
runtime

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-01-08 09:59:12 -05:00
Stephan Günther
0276cc0290 fix(langchain): fix copy-paste error on azure_openai embedding provider map (#34655)
Fixes a bug introduced with commit 85f1ba2 (released in `langchain ==
1.2.1`).

Whenever the index embedding of the langgraph-server is configured with
the `azure_openai` provider, the wrong class is initialized (and
initialization fails if the now-unexpected `OPENAI_API_KEY` environment
variable is not set).

Example configuration file `langgraph.json` that will reproduce the
issue:
(see
https://docs.langchain.com/langsmith/cli#adding-semantic-search-to-the-store)

```json
{
  "dependencies": ["."],
  "graphs": {
    "chat": "src/agents/chat/graph.py:graph"
  },
  "store": {
    "index": {
      "embed": "azure_openai:text-embedding-3-small",
      "dims": 1536
    }
  },
  "python_version": "3.13",
  "image_distro": "wolfi"
}
```
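To illustrate how an `embed` value such as `azure_openai:text-embedding-3-small` might be resolved, here is a sketch with a hypothetical provider map and helper (`EMBEDDING_PROVIDERS`, `resolve_embedding_class`). The class names are plain strings used for illustration, not the actual langchain lookup table. A copy-paste error of the kind described would correspond to the `azure_openai` entry pointing at the OpenAI class.

```python
# Hypothetical provider map; values are class names as plain strings.
EMBEDDING_PROVIDERS = {
    "openai": "OpenAIEmbeddings",
    "azure_openai": "AzureOpenAIEmbeddings",
}


def resolve_embedding_class(spec: str) -> tuple[str, str]:
    """Split a 'provider:model' string and look up the provider's class name."""
    provider, sep, model = spec.partition(":")
    if not sep or not model:
        raise ValueError(f"Expected 'provider:model', got {spec!r}")
    try:
        class_name = EMBEDDING_PROVIDERS[provider]
    except KeyError:
        raise ValueError(f"Unknown embedding provider {provider!r}") from None
    return class_name, model
```

A unit test asserting that each provider key maps to its own class would catch this kind of mix-up.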
2026-01-08 09:54:53 -05:00