Commit Graph

8284 Commits

Author SHA1 Message Date
Aarav Dugar
4c86e8ba39 chore(groq): document vision support (#34620) 2026-01-07 11:37:05 -05:00
Manas karthik
048de6dfb6 test(text-splitters): add edge case tests for CharacterTextSplitter (#34628) 2026-01-07 11:06:44 -05:00
Mason Daugherty
557eddfd51 refactor(core): add warning for fallback GPT-2 tokenizer usage (#34621) 2026-01-06 19:11:10 -05:00
Mason Daugherty
aa9c63b96a release(langchain): 1.2.1 (#34622) 2026-01-06 19:10:49 -05:00
Mason Daugherty
8aeff95341 fix(core,langchain): use get_buffer_string for message summarization (#34607)
Fixes #34517

Supersedes #34557, #34570

Fixes token inflation in `SummarizationMiddleware` that caused context
window overflow during summarization.

**Root cause:** When formatting messages for the summary prompt,
`str(messages)` was implicitly called, which includes all Pydantic
metadata fields (`usage_metadata`, `response_metadata`,
`additional_kwargs`, etc.). This caused the stringified representation
to use ~2.5x more tokens than `count_tokens_approximately` estimates.

**Problem:**
- Summarization triggers at 85% of context window based on
`count_tokens_approximately`
- But `str(messages)` in the prompt uses 2.5x more tokens
- Results in `ContextLengthExceeded`

**Fix:** Use `get_buffer_string()` to format messages, which produces
compact output:

```
Human: What's the weather?
AI: Let me check...[tool_calls]
Tool: 72°F and sunny
```

Instead of verbose Pydantic repr:

```python
[HumanMessage(content="What's the weather?", additional_kwargs={}, response_metadata={}), ...]
```
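
The size gap described above can be illustrated with a minimal, stdlib-only sketch. `Msg` and `compact_buffer` are hypothetical stand-ins, not the `langchain-core` message classes or `get_buffer_string` itself. The sketch only shows why a full repr that carries metadata fields comes out much longer than a `Role: content` rendering.

```python
from dataclasses import dataclass, field


@dataclass
class Msg:
    """Hypothetical stand-in for a chat message carrying metadata fields."""

    role: str
    content: str
    additional_kwargs: dict = field(default_factory=dict)
    response_metadata: dict = field(default_factory=dict)
    usage_metadata: dict = field(default_factory=dict)


def compact_buffer(messages: list[Msg]) -> str:
    """Render messages as 'Role: content' lines, dropping all metadata."""
    return "\n".join(f"{m.role}: {m.content}" for m in messages)


messages = [
    Msg("Human", "What's the weather?"),
    Msg("AI", "Let me check...", usage_metadata={"input_tokens": 12, "output_tokens": 5}),
    Msg("Tool", "72°F and sunny"),
]

verbose = str(messages)  # dataclass repr: every field of every message
compact = compact_buffer(messages)  # content only
print(len(verbose), len(compact))
```

A token estimate computed from message content tracks the compact form far better than the repr, which is the mismatch the commit describes.
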
2026-01-06 19:05:03 -05:00
Christophe Bornet
0438f8c277 chore(langchain): fix types in test_model_fallback (#34615) 2026-01-06 13:07:18 -05:00
Christophe Bornet
7f4f130479 chore(langchain): fix types in test_pii (#34617) 2026-01-06 13:06:25 -05:00
ccurme
6537939f53 chore(langchain): add admonition around redaction_rules (#34618) 2026-01-06 13:01:09 -05:00
Ademola Balogun
a2529cd805 fix(langchain): correct typo 'langchain experiment' to 'langchain_experimental' in error messages (#34608)
Fixed typo in ImportError messages where "langchain experiment" should
be "langchain_experimental" for consistency with the actual package
name.

This helps improve clarity for users who encounter these error messages
when trying to use deprecated tools that have moved to the
langchain_experimental package.

Related issues: #13858, #13859

Co-authored-by: Ademola <ademicho@gmail>
2026-01-05 18:10:06 -05:00
ccurme
c1f1641018 fix(anthropic): fix version (#34606) 2026-01-05 16:03:20 -05:00
ccurme
225e0fa8c9 release(anthropic): 1.3.1 (#34605) 2026-01-05 15:55:15 -05:00
Loganaden Velvindron
f021e899dc fix(anthropic): CVE-2025-68664 (#34563) 2026-01-05 15:51:25 -05:00
lwtaiyty
578cef9622 fix(anthropic): skip cache_control for code_execution blocks (#34579) 2026-01-05 15:40:59 -05:00
Christophe Bornet
7979fd3d9f chore(langchain): fix types in test_composition (#34580) 2026-01-05 14:49:34 -05:00
Christophe Bornet
3b65985551 chore(langchain): fix types in test_decorators (#34583) 2026-01-05 14:47:10 -05:00
Christophe Bornet
c4babed5c6 chore(langchain): fix types in test_wrap_tool_call (#34600) 2026-01-05 14:38:31 -05:00
Christophe Bornet
5ae53fdfb3 chore(langchain): fix types in test_model_call_limit_types (#34601) 2026-01-05 14:37:03 -05:00
Christophe Bornet
901690ceec chore(langchain): fix types in test_file_search and test_human_in_the_loop (#34602) 2026-01-05 14:34:35 -05:00
ゆり
be2c7f1aa8 test(core): add tests for formatting utils and merge functions (#34511)
## Summary
Add comprehensive test coverage for previously untested utilities in
`langchain-core`.

## Changes

### New file: `test_formatting.py` (18 tests)

Tests for `StrictFormatter` class:
- `test_vformat_with_keyword_args` - basic functionality
- `test_vformat_with_multiple_keyword_args` - multiple placeholders
- `test_vformat_with_empty_string` - edge case
- `test_vformat_with_no_placeholders` - literal strings
- `test_vformat_raises_on_positional_args` - error handling
- `test_vformat_raises_on_multiple_positional_args` - error handling
- `test_vformat_with_special_characters` - newlines, tabs
- `test_vformat_with_unicode` - emoji, CJK characters
- `test_vformat_with_format_spec` - format specifications
- `test_vformat_with_nested_braces` - escaped braces

Tests for `validate_input_variables`:
- `test_validate_input_variables_success` - valid input
- `test_validate_input_variables_with_extra_variables` - extra vars
allowed
- `test_validate_input_variables_with_missing_variable` - KeyError
- `test_validate_input_variables_empty_format` - edge case
- `test_validate_input_variables_no_placeholders` - edge case

Tests for `formatter` singleton:
- `test_formatter_is_strict_formatter` - type check
- `test_formatter_format_works` - functionality
- `test_formatter_rejects_positional_args` - error handling
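
The behavior these tests exercise (keyword arguments accepted, positional arguments rejected) can be sketched with the standard library's `string.Formatter`. `StrictFormatterSketch` is a hypothetical name used for illustration and is not claimed to match the `langchain-core` implementation line for line.

```python
from collections.abc import Mapping, Sequence
from string import Formatter
from typing import Any


class StrictFormatterSketch(Formatter):
    """Formatter that only accepts keyword arguments (illustrative sketch)."""

    def vformat(
        self, format_string: str, args: Sequence[Any], kwargs: Mapping[str, Any]
    ) -> str:
        # Reject positional arguments up front so "{}" or "{0}" cannot be filled.
        if len(args) > 0:
            raise ValueError("Positional arguments are not allowed; use keywords.")
        return super().vformat(format_string, args, kwargs)


fmt = StrictFormatterSketch()
greeting = fmt.format("Hello, {name}!", name="World")
escaped = fmt.format("{{literal}} {x}", x=1)  # doubled braces stay literal
```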

### Extended `test_utils.py` (14 new tests)

Tests for `merge_lists`:
- Parametrized tests covering None handling, simple merge, empty lists,
index-based merging
- `test_merge_lists_multiple_others` - merging 3+ lists
- `test_merge_lists_all_none` - all None inputs

Tests for `merge_obj`:
- Parametrized tests for None, strings, dicts, lists, equal values
- `test_merge_obj_type_mismatch` - TypeError on type mismatch
- `test_merge_obj_unmergeable_values` - ValueError on different values
- `test_merge_obj_tuple_raises` - ValueError for tuples

## Test plan
- [x] Tests follow existing patterns in the codebase
- [x] All tests are unit tests (no network calls)
- [x] Tests cover happy paths and error conditions
- [x] Tests verify no mutation of input data

## AI Disclosure
This contribution was developed with AI assistance (Claude Code).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: yurekami <yurekami@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-05 14:20:11 -05:00
ccurme
b5c5ba0a5f release(xai): 1.2.1 (#34604) 2026-01-05 13:55:38 -05:00
ccurme
944b43dd25 fix(xai): count reasoning tokens in output total (#34603) 2026-01-05 13:25:30 -05:00
aroun-coumar
730a3676f8 fix(core): strip message IDs from cache keys using model_copy (#33915)
**Description:**  

*Closes [#33883](https://github.com/langchain-ai/langchain/issues/33883)*

Chat model cache keys are generated by serializing messages via
`dumps(messages)`. The optional `BaseMessage.id` field (a UUID used
solely for tracing/threading) is included in this serialization, causing
functionally identical messages to produce different cache keys. This
results in repeated API calls, cache bloat, and degraded performance in
production workloads (e.g., agents, RAG chains, long conversations).

This change normalizes messages **only for cache key generation** by
stripping the nonsemantic `id` field using Pydantic V2’s
`model_copy(update={"id": None})`. The normalization is applied in both
synchronous and asynchronous cache paths (`_generate_with_cache` /
`_agenerate_with_cache`) immediately before `dumps()`.
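
Why a per-message UUID defeats caching can be shown with a minimal stdlib sketch. `Msg` and `cache_key` are hypothetical stand-ins: the real code serializes with `dumps()` and copies with Pydantic's `model_copy`, whereas this sketch assumes `json` and `dataclasses.replace` purely for illustration.

```python
import hashlib
import json
import uuid
from dataclasses import asdict, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Msg:
    """Hypothetical message: `id` is tracing-only and carries no meaning."""

    role: str
    content: str
    id: Optional[str] = None


def cache_key(messages: list[Msg], strip_ids: bool) -> str:
    """Hash the serialized messages, optionally blanking the `id` field first."""
    if strip_ids:
        # Copy rather than mutate, so callers still see the original ids.
        messages = [replace(m, id=None) for m in messages]
    payload = json.dumps([asdict(m) for m in messages], sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Two functionally identical prompts that differ only in their random ids.
first = [Msg("human", "hi", id=str(uuid.uuid4()))]
second = [Msg("human", "hi", id=str(uuid.uuid4()))]
```

With `strip_ids=False` the two prompts hash to different keys and the cache never hits; with `strip_ids=True` they share one key while the original messages keep their ids.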

```python
normalized_messages = [
    msg.model_copy(update={"id": None})
    if getattr(msg, "id", None) is not None
    else msg
    for msg in messages
]
prompt = dumps(normalized_messages)
```

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-01-05 10:37:10 -05:00
Julia (Juli) Huang
cd5b36456a fix(text-splitters): HTMLSemanticPreservingSplitter nested preserved … (#34587)
Summary
Fixes an issue where HTMLSemanticPreservingSplitter failed to preserve
elements nested inside non-container tags. With these changes, preserved
elements are now correctly detected and handled at any nesting depth.

Root Cause
`_process_element()` only recursed into a small set of hard-coded
container tags (`html`, `body`, `div`, `main`). For other tags, the
subtree was flattened into text, preventing nested preserved elements
(inside `<p>`, `<section>`, `<article>`, etc.) from being detected.


Fix
- Updated traversal logic in `_process_element` (`html.py`) to recursively
process child elements for any tag that contains nested elements
- Avoided duplicate text extraction
- Preserved correct placeholder ordering
- Treated leaf nodes as text only
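
The traversal change can be sketched on a toy element tree. `Node`, `PRESERVED`, and `find_preserved` are hypothetical names; the real splitter works on parsed HTML and also handles text and placeholders, which this sketch leaves out.

```python
from dataclasses import dataclass, field

PRESERVED = {"table", "code"}  # assumed set of tags to keep intact


@dataclass
class Node:
    """Hypothetical minimal element: a tag, its own text, and child elements."""

    tag: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def find_preserved(node: Node) -> list[str]:
    """Return preserved tags in document order, at any nesting depth."""
    if node.tag in PRESERVED:
        return [node.tag]  # keep the whole subtree; do not descend further
    found: list[str] = []
    for child in node.children:  # recurse for every tag, not a fixed allow-list
        found.extend(find_preserved(child))
    return found


doc = Node(
    "body",
    children=[
        Node("section", children=[Node("table", text="a|b")]),
        Node("p", text="see ", children=[Node("code", text="x = 1")]),
    ],
)
```

Because recursion no longer depends on the parent being `html`, `body`, `div`, or `main`, the table inside `<section>` and the code inside `<p>` are both found.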

Tests
Adds regression tests covering preserved elements nested inside
non-container tags, including:
- table inside section
- nested divs
- code inside paragraph

All existing tests pass (make lint, format, test, etc).

Breaking changes
None.

Fixes
Fixes #31569

Disclaimer
GitHub Copilot was used to assist with test case design in
test_text_splitters.py and documentation comments; all code logic was
manually implemented and reviewed.

---------

Co-authored-by: julih <julih@julihs-MacBook-Pro.local>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-05 10:28:27 -05:00
Mohan Kumar S
13cfdf1676 fix(core): exclude injected args from tool schema (#34582) 2026-01-05 09:59:59 -05:00
Andre Roelofs
c25f3847d0 refactor(core): select chunk_id via ranking and remove extra allocation (#34588) 2026-01-05 09:13:05 -05:00
Christophe Bornet
7ca0efde04 chore(langchain): fix types in test_diagram and test_sync_async_wrappers (#34591) 2026-01-05 09:05:24 -05:00
Christophe Bornet
e5d4acf681 style(langchain): add ruff rule PLC0415 (#34559) 2026-01-04 01:26:04 -05:00
ccurme
659eab2607 release(core): 1.2.6 (#34586) 2026-01-02 16:20:20 -05:00
Angus Jelinek
458a186540 chore(core): Update LangChainTracer to use Pydantic v2 methods (#34541) 2026-01-02 16:02:13 -05:00
ccurme
a7aad60989 fix(xai): ensure citations are streamed just once (#34556) 2025-12-31 18:01:41 -05:00
ccurme
9da28bac86 release(xai): 1.2.0 (#34555) 2025-12-31 16:37:21 -05:00
ccurme
0b91774263 fix(xai): stream usage metadata by default (#34531) 2025-12-31 16:30:52 -05:00
weiii668
5517ef37fb docs(core): add docstrings to internal helper functions (#34525)
Co-authored-by: weiii668 <your-email@example.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-12-30 21:58:00 -06:00
Mason Daugherty
2bbe4216e0 docs(core): refresh content.py docstrings (#34546)
minor formatting improvements and increased disambiguation between `id`
and `file_id` for `FileContentBlock` in response to
https://github.com/langchain-ai/langchain-google/pull/1477
2025-12-30 20:44:47 -06:00
Mason Daugherty
721bf15430 fix(langchain): resolve race condition in ShellSession.execute() (#34535)
Addresses a flaky test

When executing `exit 1` as a startup command, the shell process
terminates immediately. The code then tries to write a marker command
(`printf '...'`) to stdin, but the pipe is already broken because the
shell has exited, causing `BrokenPipeError`.
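
The failure mode can be reproduced with a short sketch. It assumes a child process that exits immediately (standing in for a shell running `exit 1`) and a hypothetical helper `send_marker`. It shows one plausible way to guard the write and is not necessarily how the commit fixes it.

```python
import subprocess
import sys


def send_marker(proc: subprocess.Popen, marker: bytes) -> bool:
    """Try to write a marker to the child's stdin.

    Returns False instead of raising when the child has already exited.
    """
    if proc.stdin is None:
        return False
    try:
        proc.stdin.write(marker)
    except OSError:  # BrokenPipeError is a subclass of OSError
        return False
    return True


# Child that exits at once, like a shell whose startup command is `exit 1`.
proc = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.exit(1)"],
    stdin=subprocess.PIPE,
    bufsize=0,  # unbuffered, so the write itself surfaces the broken pipe
)
proc.wait()  # make the ordering deterministic for this demonstration
delivered = send_marker(proc, b"printf 'done'\n")
proc.stdin.close()
```

In the real race the child may or may not have exited by the time of the write, which is why the test was flaky rather than consistently failing.
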
2025-12-29 18:16:08 -06:00
Mason Daugherty
dcfd9c0e04 fix(infra): use langchain_v1 for dev container deps (#34534) 2025-12-29 18:10:40 -06:00
Christophe Bornet
e03d6b80d5 chore(deps): bump mypy to v1.19 and ruff to v0.14 (#34521)
* Set mypy to >=1.19.1,<1.20
* Set ruff to >=0.14.10,<0.15
2025-12-29 18:07:55 -06:00
Christophe Bornet
ea25f5ebdd chore(text-splitters): bump dependency locks for python 3.14 (#34522)
* Support sentence-transformers optional dep on python 3.14
* Bump some dep locks to use pre-built wheels instead of building them
(murmurhash, cymem, preshed, thinc, srsly, blis)
* Still not possible to use spacy: even though there are wheels
available, spacy depends on Pydantic v1 which doesn't work on Python
3.14.
* Speeds up installation and CI.

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-12-29 17:55:34 -06:00
Christophe Bornet
04c0c1bdc3 chore(langchain-classic): bump markupsafe lock for python 3.14 (#34523)
Bump lock of MarkupSafe to 3.0.3 which has Python 3.14 pre-built wheels.
Speeds up installation and CI.

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-12-29 17:55:26 -06:00
Efe Çelik
c1f5d0963d fix: typo: saved the world 'wether' -> 'whether' (#34524)
Changed "wether" to "whether" in test comments.
2025-12-29 17:28:09 -06:00
Mason Daugherty
e81f00fb29 docs(standard-tests): remove autodoc comment (#34532) 2025-12-29 17:25:52 -06:00
Christophe Bornet
03ae39747b refactor(core): fix some missing generic types (#31658)
See
https://mypy.readthedocs.io/en/stable/config_file.html#confval-disallow_any_generics

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-12-27 16:53:08 -06:00
Sarah Clark
10de0a5364 fix(langchain-classic): pass default to config.getoption (#34034)
Fixes #34033

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-12-27 16:36:51 -06:00
Mason Daugherty
30ac1da0de release(standard-tests): 1.1.2 (#34507) 2025-12-27 03:01:56 -06:00
Dragos Bobolea
6d447f89d9 fix(fireworks): bind_tools(strict: bool) and reasoning_content (#34343)
Extract strict from kwargs and pass it to convert_to_openai_tool when
converting tools. This ensures that when strict is provided, it's
properly used during tool conversion and removed from kwargs before
calling the parent bind method.

Also extract reasoning_content from API responses and store it in
additional_kwargs for AIMessage objects.
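
The `strict` handling described above follows a common pattern: pop the option from `kwargs`, use it during conversion, and forward only what remains. In this sketch `convert_tool` and `bind_tools_sketch` are hypothetical stand-ins for `convert_to_openai_tool` and the Fireworks `bind_tools` method, and the dictionary shapes are assumptions.

```python
from typing import Any, Optional


def convert_tool(tool: dict[str, Any], strict: Optional[bool] = None) -> dict[str, Any]:
    """Hypothetical converter standing in for `convert_to_openai_tool`."""
    function = dict(tool)
    if strict is not None:
        function["strict"] = strict
    return {"type": "function", "function": function}


def bind_tools_sketch(tools: list[dict[str, Any]], **kwargs: Any) -> dict[str, Any]:
    """Pop `strict` so it reaches the converter but not the parent bind call."""
    strict = kwargs.pop("strict", None)
    converted = [convert_tool(t, strict=strict) for t in tools]
    # Whatever is left in kwargs is what the parent `bind` would receive.
    return {"tools": converted, **kwargs}


bound = bind_tools_sketch([{"name": "get_weather"}], strict=True, tool_choice="auto")
```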

Fixes https://github.com/langchain-ai/langchain/issues/34341 and
https://github.com/langchain-ai/langchain/issues/34342

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-12-27 02:42:06 -06:00
Christophe Bornet
5ef9f6e036 style(core): add ruff RUF012 rule (#34492)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-12-27 02:36:28 -06:00
Connor Hyatt
e3939ade5a fix(core): support (message class, template) tuples in ChatPromptTemplate.from_messages (#33989)
### Description

`ChatPromptTemplate.from_messages` supports multiple tuple formats for
defining message templates. One documented format is `(message class,
template)`, which allows users to specify the message type using the
class directly:

```python
ChatPromptTemplate.from_messages([
    (SystemMessage, "You are a helpful assistant named {name}."),
    (HumanMessage, "{input}"),
])
```

However, this syntax was broken. Passing a tuple like `(HumanMessage,
"{input}")` would raise a Pydantic validation error because the
conversion logic in `_convert_to_message_template` didn't handle
`BaseMessage` subclasses—it only recognized string-based role
identifiers like `"human"` or `"system"`.

This PR adds the missing branch to detect when the first element of a
tuple is a message class (by checking for the `type` class attribute)
and routes it through `_create_template_from_message_type`, which
already knows how to create the appropriate `MessagePromptTemplate` for
each message type.
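
The added branch amounts to a dispatch on the first tuple element. The sketch below uses hypothetical classes (`BaseMsg`, `HumanMsg`, `SystemMsg`) and a hypothetical helper `role_of`. It only mirrors the idea of reading the `type` class attribute and does not reproduce `_convert_to_message_template`.

```python
class BaseMsg:
    """Hypothetical base class; subclasses expose their role as `type`."""

    type: str = "base"


class HumanMsg(BaseMsg):
    type = "human"


class SystemMsg(BaseMsg):
    type = "system"


def role_of(entry: tuple[object, str]) -> tuple[str, str]:
    """Normalize a (role-or-class, template) tuple to (role string, template)."""
    head, template = entry
    if isinstance(head, str):
        return head, template  # existing path: "human", "system", ...
    if isinstance(head, type) and isinstance(getattr(head, "type", None), str):
        return head.type, template  # new path: a message class
    raise TypeError(f"Unsupported message specifier: {head!r}")


normalized = [
    role_of(entry)
    for entry in [(SystemMsg, "You are {name}."), ("human", "{input}")]
]
```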

### Changes

- Updated `_convert_to_message_template` to properly support `(message
class, template)` tuples

### Testing

Added 16 comprehensive unit tests covering:

- Basic usage with `HumanMessage`, `AIMessage`, and `SystemMessage`
classes
- Integration with `invoke()` method
- Mixed syntax (message class tuples alongside string tuples)
- Multiple template variables
- Edge cases: empty templates, static text (no variables)
- Correct extraction of `input_variables`
- Partial variables support
- Combination with `MessagesPlaceholder`
- Mustache template format
- Template operations: `append()`, `extend()`, concatenation, and
slicing
- Special characters and unicode in templates

### Issue

Fixes #33791

### Dependencies

None

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-12-27 02:20:33 -06:00
Miguel Athie
b0e4ef3158 test(core): add regression test for list-index $ref resolution (#34097)
This PR adds a regression test covering the JSON Schema `$ref` pattern
found in MCP-style schemas, where a `$ref` points into a list-based
structure such as:

`#/properties/body/anyOf/1/properties/Message/properties/bccRecipients/items`

This pattern historically failed due to incorrect handling of numeric
list components in `_retrieve_ref`. The underlying bug has since been
fixed, and this test ensures coverage so we don't regress on list-index
`$ref` resolution.

The new test (`test_dereference_refs_list_index_items_ref_mcp_like`)
verifies:

- correct traversal into `anyOf[1]`
- proper dereferencing of `items.$ref`
- no errors thrown
- `ccRecipients.items` is identical to the resolved schema of
`bccRecipients.items`
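
Resolving such a reference means treating a numeric path component as a list index whenever the current node is a list. The helper below, `resolve_pointer`, is a hypothetical sketch and not `_retrieve_ref`. It skips JSON Pointer escape handling (`~0`, `~1`), and the schema is a trimmed example built for illustration.

```python
from typing import Any


def resolve_pointer(schema: Any, ref: str) -> Any:
    """Follow a local '#/a/b/1/c' reference through dicts and lists."""
    if not ref.startswith("#/"):
        raise ValueError(f"Only local references are supported: {ref!r}")
    node = schema
    for part in ref[2:].split("/"):
        if isinstance(node, list):
            node = node[int(part)]  # numeric component indexes into the list
        else:
            node = node[part]
    return node


email_items = {"type": "string", "format": "email"}
schema = {
    "properties": {
        "body": {
            "anyOf": [
                {"type": "string"},
                {
                    "properties": {
                        "Message": {
                            "properties": {"bccRecipients": {"items": email_items}}
                        }
                    }
                },
            ]
        }
    }
}
ref = "#/properties/body/anyOf/1/properties/Message/properties/bccRecipients/items"
resolved = resolve_pointer(schema, ref)
```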

No code changes are included, just the one test. This PR adds coverage
to preserve the expected behavior and documents support for this
real-world MCP schema pattern.

Related to #32012.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-12-27 02:18:51 -06:00
gjeltep
ca7790f895 fix(core): fix callback manager merge mixing handlers (#32028) (#33617)
## Description
Fixed `BaseCallbackManager.merge()` method to correctly preserve the
distinction between `handlers` and `inheritable_handlers` during merge
operations.

Previously, the merge method was using `add_handler()` which incorrectly
added handlers to both lists when `inherit=True`, causing
cross-contamination between regular and inheritable handlers.

The fix directly passes the combined handler lists to the constructor
instead of using `add_handler()`, ensuring proper separation is
maintained.
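
The difference between the two merge strategies can be sketched with a toy manager. `ManagerSketch` and both merge methods are hypothetical. They model the behavior described above (an `add_handler` that writes to both lists when `inherit=True`) and do not copy `BaseCallbackManager`.

```python
class ManagerSketch:
    """Hypothetical manager with separate regular and inheritable handlers."""

    def __init__(self, handlers=None, inheritable_handlers=None):
        self.handlers = list(handlers or [])
        self.inheritable_handlers = list(inheritable_handlers or [])

    def add_handler(self, handler, inherit=True):
        if handler not in self.handlers:
            self.handlers.append(handler)
        if inherit and handler not in self.inheritable_handlers:
            self.inheritable_handlers.append(handler)

    def merge_via_add_handler(self, other):
        """Old-style merge: every handler ends up in both lists."""
        merged = ManagerSketch()
        for h in self.handlers + other.handlers:
            merged.add_handler(h)  # inherit defaults to True
        for h in self.inheritable_handlers + other.inheritable_handlers:
            merged.add_handler(h, inherit=True)
        return merged

    def merge_via_constructor(self, other):
        """Fixed merge: combine each list separately and pass both through."""
        return ManagerSketch(
            handlers=self.handlers + other.handlers,
            inheritable_handlers=self.inheritable_handlers + other.inheritable_handlers,
        )


a = ManagerSketch(handlers=["h1"], inheritable_handlers=["i1"])
b = ManagerSketch(handlers=["h2"])
mixed = a.merge_via_add_handler(b)
separated = a.merge_via_constructor(b)
```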

## Issue
Fixes #32028

## Dependencies
None

## Testing
- Modified existing test `test_merge_preserves_handler_distinction()` to
verify handlers remain properly separated after merge

## Checklist
- [x] **Breaking Changes**: No breaking changes - only fixes incorrect
behavior
- [x] **Type Hints**: All functions have complete type annotations
- [x] **Tests**: Fix is fully tested with existing unit test
- [x] **Security**: No security implications
- [x] **Documentation**: No documentation changes needed - bug fix only
- [x] **Code Quality**: Passes lint and format checks
- [x] **Commit Message**: Follows Conventional Commits format

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-12-27 02:01:59 -06:00
Christophe Bornet
5884fb9523 style(text-splitters,standard-tests,cli): add ruff TC and RUF012 rules (#34495)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-12-27 01:41:33 -06:00