Commit Graph

210 Commits

Author SHA1 Message Date
Christophe Bornet
b97c629f9a style: bump ruff version to 0.15 (#35042) 2026-02-12 19:34:02 -05:00
ccurme
e8e47b083e feat(core): count tokens from tool schemas in count_tokens_approximately (#35098) 2026-02-09 14:10:44 -05:00
ccurme
032d01dd0f fix(core): adjust cap when scaling approximate token counts (#35017) 2026-02-04 19:02:48 -05:00
ccurme
5981ee142c fix(core): apply cap when scaling approximate token counts (#35005) 2026-02-03 21:20:48 -05:00
ccurme
09654f4382 feat(core): allow scaling by reported usage when counting tokens approximately (#34996) 2026-02-03 15:19:18 -05:00
Mason Daugherty
3aca3fbebe docs(core): add examples for pretty_repr, pretty_print (#34968) 2026-02-01 16:37:03 -08:00
Mason Daugherty
12c34a4139 docs(core): use proper admonition for get_buffer_string (#34967) 2026-02-01 16:36:44 -08:00
Mason Daugherty
638c33f65d fix(core): replace Iterable with Iterator for block iteration (#34934) 2026-01-30 12:08:22 -08:00
Mason Daugherty
f8d5a5069f chore(core): nits (#34897) 2026-01-26 18:05:37 -08:00
cc
585b691c1d feat(core): add multimodal support to count_tokens_approximately (#34883) 2026-01-26 15:04:25 -08:00
Weichen Zhao
d899681040 feat(core): add XML format option for get_buffer_string (#34802)
## Summary

Add XML format option for `get_buffer_string()` to provide unambiguous
message serialization. This fixes role prefix ambiguity when message
content contains strings like "Human:" or "AI:".

  Fixes #34786

  ## Changes

- Add `format="xml"` parameter with proper XML escaping using
`quoteattr()` for attributes
- Add explicit validation for format parameter (raises `ValueError` for
invalid values)
  - Add comprehensive tests for XML format edge cases

<img width="1952" height="706" alt="image"
src="https://github.com/user-attachments/assets/1cd6f887-9365-43cf-a532-72d7addd8bad"
/>
<img width="2786" height="776" alt="image"
src="https://github.com/user-attachments/assets/a07b0db0-519c-46d7-b34b-b404237d812b"
/>

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-22 13:33:08 -05:00
Mason Daugherty
0b99ca4fcd docs(core): enhance docstrings for ToolCall and ToolCallChunk (#34719) 2026-01-12 15:50:28 -05:00
Christophe Bornet
8e3c6b109f style(core): fix some noqa escapes (#34675)
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-09 17:36:08 -05:00
Chris Papademetrious
0c7b7e045d feat(core): support custom message separator in get_buffer_string() (#34569) 2026-01-07 11:46:17 -05:00
Mason Daugherty
8aeff95341 fix(core,langchain): use get_buffer_string for message summarization (#34607)
Fixes #34517

Supersedes #34557, #34570

Fixes token inflation in `SummarizationMiddleware` that caused context
window overflow during summarization.

**Root cause:** When formatting messages for the summary prompt,
`str(messages)` was implicitly called, which includes all Pydantic
metadata fields (`usage_metadata`, `response_metadata`,
`additional_kwargs`, etc.). This caused the stringified representation
to use ~2.5x more tokens than `count_tokens_approximately` estimates.

**Problem:**
- Summarization triggers at 85% of context window based on
`count_tokens_approximately`
- But `str(messages)` in the prompt uses 2.5x more tokens
- Results in `ContextLengthExceeded`

**Fix:** Use `get_buffer_string()` to format messages, which produces
compact output:

```
Human: What's the weather?
AI: Let me check...[tool_calls]
Tool: 72°F and sunny
```

Instead of verbose Pydantic repr:

```python
[HumanMessage(content='What's the weather?', additional_kwargs={}, response_metadata={}), ...]
```
2026-01-06 19:05:03 -05:00
Andre Roelofs
c25f3847d0 refactor(core): select chunk_id via ranking and remove extra allocation (#34588) 2026-01-05 09:13:05 -05:00
Mason Daugherty
2bbe4216e0 docs(core): refresh content.py docstrings (#34546)
minor formatting improvements and increased disambiguation between `id`
and `file_id` for `FileContentBlock` in response to
https://github.com/langchain-ai/langchain-google/pull/1477
2025-12-30 20:44:47 -06:00
Christophe Bornet
03ae39747b refactor(core): fix some missing generic types (#31658)
See
https://mypy.readthedocs.io/en/stable/config_file.html#confval-disallow_any_generics

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-12-27 16:53:08 -06:00
Christophe Bornet
5ef9f6e036 style(core): add ruff RUF012 rule (#34492)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-12-27 02:36:28 -06:00
Christophe Bornet
a92c032ff6 style(core): fix mypy no-any-return violations (#34204)
* FIxed where possible
* Used `cast` when not possible to fix

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-12-26 21:35:27 -06:00
Mason Daugherty
78b2d51edc docs(core): image url docstring enhancement (#34488) 2025-12-25 23:10:48 -06:00
rari404
d84eef667a fix(core): use tool_calls instead of deprecated function_call in get_buffer_string (#34355)
## Summary

Fixes #33970

`get_buffer_string` was only checking for the deprecated `function_call`
field in `additional_kwargs`, which modern LLM providers no longer
return. This fix updates the function to check for the modern
`tool_calls` field first, falling back to `function_call` for legacy
compatibility.

## Changes

- Check `AIMessage.tool_calls` first (modern standard)
- Fall back to `additional_kwargs["function_call"]` (legacy support)
- Added 3 unit tests covering tool_calls, empty content, and precedence
behavior

## Testing

```python
# Before fix: tool_calls info was lost
msg = AIMessage(content="Hi", tool_calls=[{"name": "search", ...}])
get_buffer_string([msg])  # "AI: Hi" (no tool info)

# After fix: tool_calls are included
get_buffer_string([msg])  # "AI: Hi[{\"name\": \"search\", ...}]"
```

- All existing `get_buffer_string` tests pass
- Legacy `function_call` behavior preserved

---

> [!NOTE]
> This PR was developed with AI agent assistance (Factory/Droid).

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-12-19 22:37:56 -06:00
James
4fbeffcfee feat(core): add 'approximate' alias in place of count_tokens_approximately (#33045)
### Description: 
earlier we have to use like below:
```python
from langchain_core.messages import trim_messages
from langchain_core.messages.utils import count_tokens_approximately

trim_messages(..., token_counter=count_tokens_approximately)
```
Now can be used as like this also
```python
from langchain_core.messages import trim_messages

trim_messages(..., token_counter="approximate")
```
- [x] **Added tests**
- [x] **Lint and test**: Run this as I made change in langchain/core, uv
run --group test pytest tests/unit_tests/messages/test_utils.py -v
<img width="1006" height="66" alt="image"
src="https://github.com/user-attachments/assets/c6938c29-a781-4e7f-871b-8e888ee764b7"
/>

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-12-19 19:25:29 -06:00
Christophe Bornet
72f1d79022 chore(core): fix some ruff preview rules (#34425)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-12-19 14:33:42 -06:00
Christophe Bornet
8bca31f8c4 chore(core): fix some docstrings (#34426) 2025-12-19 13:08:10 -05:00
ccurme
6519a5675b fix(core): allow unknown blocks in convert_to_openai_messages (#34420) 2025-12-18 14:22:53 -05:00
Christophe Bornet
914730cf8d chore(core): fix some types related to ToolCallChunk (#34283) 2025-12-12 13:15:57 -05:00
Mason Daugherty
7542278997 feat(core,anthropic): extras on BaseTool (#34120) 2025-12-10 09:37:14 -05:00
Christophe Bornet
a64aee310c chore(core): improve typing of messages utils functions (#34225)
With this we get the correct types for `_runnable_support` annotated
functions.
* return list[BaseMessage] when messages is not None
* return Runnable when messages is None
* typing of function args
2025-12-08 09:59:43 -05:00
Mason Daugherty
9875ffbabc feat(core): support google maps grounding in genai block translator (#34244)
https://github.com/langchain-ai/langchain-google/pull/1330

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-12-08 09:44:43 -05:00
Mason Daugherty
3ace4e3680 docs(core,groq,openai): nits for ref docs (#34243) 2025-12-07 19:45:38 -05:00
Mason Daugherty
0a6d01e61d docs(anthropic,core,langchain): updates (#34106) 2025-11-25 17:58:09 -05:00
Mason Daugherty
2a863727f9 fix(infra,core): nits (#34079)
* Add missing `nits` to allowed PR linting scopes
* Ensure `MAJOR.MINOR.PATCH` consistency in admonitions
* Ensure valid spacing in admonitions
2025-11-23 20:00:07 -05:00
Mason Daugherty
47b79c30c0 chore(docs): fix a few refs syntax errors (#34044)
missing whitespace for some admonitions
2025-11-22 00:58:21 -05:00
Christophe Bornet
2bfbc29ccc chore(core): fix some ruff TC rules (#33929)
fix some ruff TC rules but still don't enforce them as Pydantic model
fields use type annotations at runtime.
2025-11-12 14:07:19 -05:00
Lê Nam Khánh
2fe0369049 docs: fix typos in some files (#33867) 2025-11-07 09:04:29 -05:00
Mason Daugherty
d40e340479 chore: attribute package change versions (#33854)
Needed to disambiguate for within inherited docs
2025-11-06 16:57:30 -05:00
Christophe Bornet
915c446c48 chore(core): add ruff rule PLR2004 (#33706)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-11-04 13:33:37 -05:00
Mason Daugherty
123e29dc26 style: more refs fixes (#33730) 2025-10-29 16:34:46 -04:00
Mason Daugherty
a2a9a02ecb style(core): more cleanup all around (#33711) 2025-10-28 22:58:19 -04:00
Mason Daugherty
e5e1d6c705 style: more refs work (#33707) 2025-10-28 14:43:28 -04:00
ccurme
3286a98b27 fix(core): translate Google GenAI text blocks to v1 (#33699) 2025-10-28 09:53:01 -04:00
Mason Daugherty
62769a0dac feat(langchain): export UsageMetadata (#33692)
as well as `InputTokenDetails`, and `OutputTokenDetails` from
`langchain_core.messages`
2025-10-27 19:47:41 -04:00
Mason Daugherty
f94108b4bc fix: links (#33691)
* X-ref to new docs
* Formatting updates
2025-10-27 19:04:29 -04:00
Mason Daugherty
e731ba1e47 style: more refs work (#33616) 2025-10-20 18:40:19 -04:00
Mason Daugherty
1d2273597a docs: more fixes for refs (#33554) 2025-10-16 22:54:16 -04:00
Mason Daugherty
c9018f81ec docs(anthropic): update extended thinking docs and fix urls (#33525)
new urls

extended thinking isn't just 3.7 anymore
2025-10-16 11:18:47 -04:00
ccurme
eb19e12527 feat(core): support vertexai standard content (#33521) 2025-10-16 10:08:58 -04:00
Mason Daugherty
707e96c541 style: more sweeping refs work (#33513) 2025-10-15 23:33:39 -04:00
Mason Daugherty
26e0a00c4c style: more work for refs (#33508)
Largely:
- Remove explicit `"Default is x"` since new refs show default inferred
from sig
- Inline code (useful for eventual parsing)
- Fix code block rendering (indentations)
2025-10-15 18:46:55 -04:00