69 Commits

Author SHA1 Message Date
ccurme
e8e47b083e feat(core): count tokens from tool schemas in count_tokens_approximately (#35098) 2026-02-09 14:10:44 -05:00
ccurme
032d01dd0f fix(core): adjust cap when scaling approximate token counts (#35017) 2026-02-04 19:02:48 -05:00
ccurme
5981ee142c fix(core): apply cap when scaling approximate token counts (#35005) 2026-02-03 21:20:48 -05:00
ccurme
09654f4382 feat(core): allow scaling by reported usage when counting tokens approximately (#34996) 2026-02-03 15:19:18 -05:00
Lewis Whitehill
0c9d392d41 test(core): add tests for approximate token counting with multimodal messages (#34898) 2026-01-30 12:35:16 -08:00
cc
585b691c1d feat(core): add multimodal support to count_tokens_approximately (#34883) 2026-01-26 15:04:25 -08:00
Weichen Zhao
d899681040 feat(core): add XML format option for get_buffer_string (#34802)
## Summary

Add XML format option for `get_buffer_string()` to provide unambiguous
message serialization. This fixes role prefix ambiguity when message
content contains strings like "Human:" or "AI:".

  Fixes #34786

  ## Changes

- Add `format="xml"` parameter with proper XML escaping using
`quoteattr()` for attributes
- Add explicit validation for format parameter (raises `ValueError` for
invalid values)
  - Add comprehensive tests for XML format edge cases
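
The escaping approach described above can be sketched with the standard library alone. This is an illustration, not the output format of `get_buffer_string`: the helper name `to_xml_buffer` and the `<message role=...>` layout are hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr


def to_xml_buffer(messages: list[tuple[str, str]]) -> str:
    """Serialize (role, content) pairs as XML-like lines (hypothetical layout)."""
    lines = []
    for role, content in messages:
        # quoteattr() escapes the value and wraps it in quotes for attribute use;
        # escape() handles &, <, and > in element text.
        lines.append(f"<message role={quoteattr(role)}>{escape(content)}</message>")
    return "\n".join(lines)


# Content that starts with "AI:" can no longer be mistaken for a role prefix.
print(to_xml_buffer([("human", "AI: is 1 < 2?")]))
# <message role="human">AI: is 1 &lt; 2?</message>
```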

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2026-01-22 13:33:08 -05:00
Christophe Bornet
03ae39747b refactor(core): fix some missing generic types (#31658)
See
https://mypy.readthedocs.io/en/stable/config_file.html#confval-disallow_any_generics

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-12-27 16:53:08 -06:00
Christophe Bornet
a92c032ff6 style(core): fix mypy no-any-return violations (#34204)
* Fixed where possible
* Used `cast` where a fix was not possible

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-12-26 21:35:27 -06:00
ccurme
5ec0fa69de fix(core): serialization patch (#34455)
- `allowed_objects` kwarg in `load`
- escape lc-ser formatted dicts on `dump`
- fix for jinja2

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-12-22 17:33:31 -06:00
rari404
d84eef667a fix(core): use tool_calls instead of deprecated function_call in get_buffer_string (#34355)
## Summary

Fixes #33970

`get_buffer_string` was only checking for the deprecated `function_call`
field in `additional_kwargs`, which modern LLM providers no longer
return. This fix updates the function to check for the modern
`tool_calls` field first, falling back to `function_call` for legacy
compatibility.

## Changes

- Check `AIMessage.tool_calls` first (modern standard)
- Fall back to `additional_kwargs["function_call"]` (legacy support)
- Added 3 unit tests covering tool_calls, empty content, and precedence
behavior

## Testing

```python
# Before fix: tool_calls info was lost
msg = AIMessage(content="Hi", tool_calls=[{"name": "search", ...}])
get_buffer_string([msg])  # "AI: Hi" (no tool info)

# After fix: tool_calls are included
get_buffer_string([msg])  # "AI: Hi[{\"name\": \"search\", ...}]"
```

- All existing `get_buffer_string` tests pass
- Legacy `function_call` behavior preserved
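
The precedence rule can be sketched as follows. The helper `tool_info_suffix` is hypothetical, it works on plain lists and dicts rather than `AIMessage`, and serializing with `json.dumps` is an assumption based on the example output above.

```python
import json
from typing import Any


def tool_info_suffix(
    tool_calls: list[dict[str, Any]], additional_kwargs: dict[str, Any]
) -> str:
    """Return the text appended after the message content (hypothetical helper)."""
    if tool_calls:
        # The modern field takes precedence when it is non-empty.
        return json.dumps(tool_calls)
    if "function_call" in additional_kwargs:
        # Legacy fallback for messages that only carry `function_call`.
        return json.dumps(additional_kwargs["function_call"])
    return ""
```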

---

> [!NOTE]
> This PR was developed with AI agent assistance (Factory/Droid).

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-12-19 22:37:56 -06:00
James
4fbeffcfee feat(core): add 'approximate' alias in place of count_tokens_approximately (#33045)
### Description:
Previously, the approximate counter had to be imported and passed explicitly:
```python
from langchain_core.messages import trim_messages
from langchain_core.messages.utils import count_tokens_approximately

trim_messages(..., token_counter=count_tokens_approximately)
```
Now the `"approximate"` string alias can be passed instead:
```python
from langchain_core.messages import trim_messages

trim_messages(..., token_counter="approximate")
```
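
One way such an alias could be resolved is a lookup table consulted before the callable is used. This is a sketch under assumptions: `resolve_counter`, `_ALIASES`, and the stand-in `count_words` are hypothetical names, not part of `langchain_core`.

```python
from typing import Callable, Union


def count_words(text: str) -> int:
    """Stand-in counter used only for this illustration."""
    return len(text.split())


# Hypothetical alias table mapping string names to counter functions.
_ALIASES: dict[str, Callable[[str], int]] = {"approximate": count_words}


def resolve_counter(token_counter: Union[str, Callable[[str], int]]) -> Callable[[str], int]:
    """Return a callable counter, accepting either a callable or a known alias."""
    if callable(token_counter):
        return token_counter
    try:
        return _ALIASES[token_counter]
    except KeyError:
        raise ValueError(f"Unknown token_counter alias: {token_counter!r}") from None
```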
- [x] **Added tests**
- [x] **Lint and test**: because the change is in langchain/core, ran
`uv run --group test pytest tests/unit_tests/messages/test_utils.py -v`

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-12-19 19:25:29 -06:00
Christophe Bornet
72f1d79022 chore(core): fix some ruff preview rules (#34425)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-12-19 14:33:42 -06:00
ccurme
6519a5675b fix(core): allow unknown blocks in convert_to_openai_messages (#34420) 2025-12-18 14:22:53 -05:00
Mason Daugherty
7542278997 feat(core,anthropic): extras on BaseTool (#34120) 2025-12-10 09:37:14 -05:00
Christophe Bornet
a64aee310c chore(core): improve typing of messages utils functions (#34225)
With this we get the correct types for `_runnable_support` annotated
functions.
* return list[BaseMessage] when messages is not None
* return Runnable when messages is None
* typing of function args
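
The dual return type can be expressed with `typing.overload`. The sketch below uses a hypothetical function `keep_nonempty` and a plain callable as a stand-in for `Runnable`; it does not reproduce the `_runnable_support` decorator.

```python
from typing import Callable, Optional, Union, overload


@overload
def keep_nonempty(messages: None = None) -> Callable[[list[str]], list[str]]: ...
@overload
def keep_nonempty(messages: list[str]) -> list[str]: ...


def keep_nonempty(
    messages: Optional[list[str]] = None,
) -> Union[list[str], Callable[[list[str]], list[str]]]:
    """Filter out empty strings now, or return a deferred filter."""

    def _apply(msgs: list[str]) -> list[str]:
        return [m for m in msgs if m]

    if messages is None:
        # No input yet: hand back something that can be applied later.
        return _apply
    return _apply(messages)
```

With the overloads in place, a type checker infers `list[str]` for `keep_nonempty(["a", ""])` and a callable for `keep_nonempty()`.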
2025-12-08 09:59:43 -05:00
Mason Daugherty
9875ffbabc feat(core): support google maps grounding in genai block translator (#34244)
https://github.com/langchain-ai/langchain-google/pull/1330

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-12-08 09:44:43 -05:00
Abbas Syed
462762f75b test(core): add comprehensive tests for groq block translator (#33906) 2025-11-10 15:45:36 -05:00
Mason Daugherty
62769a0dac feat(langchain): export UsageMetadata (#33692)
as well as `InputTokenDetails` and `OutputTokenDetails` from
`langchain_core.messages`
2025-10-27 19:47:41 -04:00
ccurme
eb19e12527 feat(core): support vertexai standard content (#33521) 2025-10-16 10:08:58 -04:00
ccurme
4da5a8081f fix(core): propagate extras when aggregating tool calls in v1 content (#33494) 2025-10-15 10:38:16 -04:00
ccurme
961f965f0c feat(groq): support built-in tools in message content (#33459) 2025-10-13 15:06:01 -04:00
Mason Daugherty
d8a680ee57 style: address Sphinx double-backtick snippet syntax (#33389) 2025-10-09 13:35:51 -04:00
Mason Daugherty
6ea03ab46c style(core): drop python 39 linting target for 3.10 (#33286) 2025-10-05 23:22:34 -04:00
SN
99361e623a feat(core): add optional include_id param to convert_to_openai_messages function (#33242) 2025-10-03 08:22:43 -04:00
Mason Daugherty
eaa6dcce9e release: v1.0.0 (#32567)
Co-authored-by: Mohammad Mohtashim <45242107+keenborder786@users.noreply.github.com>
Co-authored-by: Caspar Broekhuizen <caspar@langchain.dev>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Sadra Barikbin <sadraqazvin1@yahoo.com>
Co-authored-by: Vadym Barda <vadim.barda@gmail.com>
2025-10-02 10:49:42 -04:00
Mohammad Mohtashim
33a6def762 fix(core): Support of 'reasoning' type in 'convert_to_openai_messages' (#33050) 2025-09-29 09:17:05 -04:00
niceg
0d6f915442 fix: LLM mimicking Unicode responses due to forced Unicode conversion of non-ASCII characters. (#32222)
- **Description:** This PR fixes an issue where the LLM would mimic
Unicode responses due to forced Unicode conversion of non-ASCII
characters in tool calls. The fix involves disabling the `ensure_ascii`
flag in `json.dumps()` when converting tool calls to OpenAI format.
- **Issue:** Fixes the following behavior.
input:
```json
{'role': 'assistant', 'tool_calls': [{'type': 'function', 'id': 'call_nv9trcehdpihr21zj9po19vq', 'function': {'name': 'create_customer', 'arguments': '{"customer_name": "你好啊集团"}'}}]}
```
output:
```json
{'role': 'assistant', 'tool_calls': [{'type': 'function', 'id': 'call_nv9trcehdpihr21zj9po19vq', 'function': {'name': 'create_customer', 'arguments': '{"customer_name": "\\u4f60\\u597d\\u554a\\u96c6\\u56e2"}'}}]}
```
The LLM then mimics the escaped form and emits `\uXXXX` sequences itself.
Each escape sequence is longer than the character it encodes, so responses
grow longer and take more time to generate.
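
The effect of the flag can be reproduced with `json.dumps` directly. The snippet only demonstrates the standard-library behavior and does not call `convert_to_openai_messages`.

```python
import json

args = {"customer_name": "你好啊集团"}

escaped = json.dumps(args)  # default: ensure_ascii=True
preserved = json.dumps(args, ensure_ascii=False)

print(escaped)    # {"customer_name": "\u4f60\u597d\u554a\u96c6\u56e2"}
print(preserved)  # {"customer_name": "你好啊集团"}

# Both strings decode to the same object; only the serialized form differs.
assert json.loads(escaped) == json.loads(preserved) == args
```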

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-07-24 17:01:31 -04:00
Christophe Bornet
03e8327e01 core: Ruff preview fixes (#31877)
Auto-fixes from `uv run ruff check --fix --unsafe-fixes --preview`

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-07-07 13:02:40 -04:00
Mikhail
6105a5841b core: fix get_buffer_string output for structured message content (#31600) 2025-06-20 23:21:50 +00:00
Christophe Bornet
c982573f1e core: Add ruff rules A (builtins shadowing) (#29312)
See https://docs.astral.sh/ruff/rules/#flake8-builtins-a
* Renamed vars where possible
* Added `noqa` where backward compatibility was needed
* Added `@override` when applicable
2025-05-16 15:19:37 -04:00
Jacob Lee
66d1ed6099 fix(core): Permit OpenAI style blocks to be passed into convert_to_openai_messages (#31140)
Should effectively be a noop, just shouldn't throw

CC @madams0013

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-05-07 10:57:37 -04:00
Jacob Lee
6b0b317cb5 feat(core): Autogenerate filenames for when converting file content blocks to OpenAI format (#30984)
CC @ccurme

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-04-24 13:36:31 +00:00
ccurme
4bc70766b5 core, openai: support standard multi-modal blocks in convert_to_openai_messages (#30968) 2025-04-23 11:20:44 -04:00
Christophe Bornet
a4ca1fe0ed core: Remove some noqa (#30855) 2025-04-15 13:08:40 -04:00
ccurme
9cfe6bcacd multiple: multi-modal content blocks (#30746)
Introduces standard content block format for images, audio, and files.

## Examples

Image from url:
```
{
    "type": "image",
    "source_type": "url",
    "url": "https://path.to.image.png",
}
```


Image, in-line data:
```
{
    "type": "image",
    "source_type": "base64",
    "data": "<base64 string>",
    "mime_type": "image/png",
}
```


PDF, in-line data:
```
{
    "type": "file",
    "source_type": "base64",
    "data": "<base64 string>",
    "mime_type": "application/pdf",
}
```


File from ID:
```
{
    "type": "file",
    "source_type": "id",
    "id": "file-abc123",
}
```


Plain-text file:
```
{
    "type": "file",
    "source_type": "text",
    "text": "foo bar",
}
```
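
A small checker for the shapes shown above might look like this. It is a sketch: the helper `missing_keys` is hypothetical, and the required keys per `source_type` are inferred only from the five examples, so the real format may allow or require other fields.

```python
# Keys each source_type appears to need, inferred from the examples above.
REQUIRED_KEYS: dict[str, set[str]] = {
    "url": {"url"},
    "base64": {"data", "mime_type"},
    "id": {"id"},
    "text": {"text"},
}


def missing_keys(block: dict) -> set[str]:
    """Return the keys a content block lacks, per the inferred table."""
    extra = REQUIRED_KEYS.get(block.get("source_type"), set())
    required = {"type", "source_type"} | extra
    return required - set(block)
```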
2025-04-15 09:48:06 -04:00
Christophe Bornet
98f0016fc2 core: Add ruff rules ARG (#30732)
See https://docs.astral.sh/ruff/rules/#flake8-unused-arguments-arg
2025-04-09 14:39:36 -04:00
Christophe Bornet
8a33402016 core: Add ruff rules PT (pytest) (#29381)
See https://docs.astral.sh/ruff/rules/#flake8-pytest-style-pt
2025-04-01 13:31:07 -04:00
Adrián Panella
b75573e858 core: add tool_call exclusion in filter_message (#30289)
Extend functionality to allow filtering pairs of tool calls (AI message +
tool message).
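
The idea of dropping a tool call together with its result can be sketched on plain dicts. The function `drop_tool_call_pairs` and the dict layout are hypothetical and do not reflect the actual `filter_messages` signature.

```python
def drop_tool_call_pairs(messages: list[dict], exclude_ids: set[str]) -> list[dict]:
    """Remove excluded tool calls and their matching tool results."""
    kept = []
    for msg in messages:
        if msg["role"] == "tool" and msg.get("tool_call_id") in exclude_ids:
            continue  # drop the tool result for an excluded call
        if msg["role"] == "ai" and msg.get("tool_calls"):
            remaining = [tc for tc in msg["tool_calls"] if tc["id"] not in exclude_ids]
            if not remaining and not msg.get("content"):
                continue  # the AI message carried nothing but excluded calls
            msg = {**msg, "tool_calls": remaining}
        kept.append(msg)
    return kept
```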

---------

Co-authored-by: vbarda <vadym@langchain.dev>
2025-03-21 23:05:29 +00:00
Vadym Barda
07823cd41c core[patch]: optimize trim_messages (#30327)
Refactored w/ Claude

Up to 20x speedup! (with theoretical max improvement of `O(n / log n)`)
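
The `O(n / log n)` figure is consistent with replacing a linear scan of counter calls with a binary search, though the commit message does not say which technique was used. A minimal sketch under that assumption, with the hypothetical helper `max_prefix_within_budget` and a counter that never decreases as the prefix grows:

```python
from typing import Callable, Sequence


def max_prefix_within_budget(
    messages: Sequence[str],
    max_tokens: int,
    count_tokens: Callable[[Sequence[str]], int],
) -> int:
    """Return the largest k with count_tokens(messages[:k]) <= max_tokens."""
    lo, hi = 0, len(messages)
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if count_tokens(messages[:mid]) <= max_tokens:
            lo = mid  # the first `mid` messages fit; try a longer prefix
        else:
            hi = mid - 1  # too long; shrink the search range
    return lo
```

This calls the counter about `log2(n)` times instead of up to `n` times.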
2025-03-21 17:08:26 -04:00
Vadym Barda
4852ab8d0a core[patch]: more tests for trim_messages (#30421) 2025-03-21 16:19:52 +00:00
Vadym Barda
37190881d3 core[patch]: add util for approximate token counting (#30373) 2025-03-19 17:48:38 +00:00
ccurme
52b0570bec core, openai, standard-tests: improve OpenAI compatibility with Anthropic content blocks (#30128)
- Support thinking blocks in core's `convert_to_openai_messages` (pass
through instead of error)
- Ignore thinking blocks in ChatOpenAI (instead of error)
- Support Anthropic-style image blocks in ChatOpenAI

---

Standard integration tests include a `supports_anthropic_inputs`
property which is currently enabled only for tests on `ChatAnthropic`.
This test enforces compatibility with message histories of the form:
```
- system message
- human message
- AI message with tool calls specified only through `tool_use` content blocks
- human message containing `tool_result` and an additional `text` block
```
It additionally checks support for Anthropic-style image inputs if
`supports_image_inputs` is enabled.

Here we change this test, such that if you enable
`supports_anthropic_inputs`:
- You support AI messages with text and `tool_use` content blocks
- You support Anthropic-style image inputs (if `supports_image_inputs`
is enabled)
- You support thinking content blocks.

That is, we add a test case for thinking content blocks, but we also
remove the requirement of handling tool results within HumanMessages
(motivated by existing agent abstractions, which should all return
ToolMessage). We move that requirement to a ChatAnthropic-specific test.
2025-03-06 09:53:14 -05:00
Isaac Francisco
2bb2c9bfe8 change behavior for converting a string to openai messages (#29446) 2025-01-27 18:18:54 -08:00
Christophe Bornet
e4a78dfc2a core: Bump ruff version to 0.9 (#29201)
Also run some preview autofix and formatting

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2025-01-22 00:20:09 +00:00
Bagatur
4a531437bb core[patch], openai[patch]: Handle OpenAI developer msg (#28794)
- Convert OpenAI `developer` messages to `SystemMessage`
- Store `additional_kwargs={"__openai_role__": "developer"}` so that the
correct role can be reconstructed if needed
- Update `ChatOpenAI` to read the stored role
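
The round trip can be sketched with plain dicts. The helpers `from_openai_role` and `to_openai_role` are hypothetical; only the `__openai_role__` key comes from the description above.

```python
def from_openai_role(role: str, content: str) -> dict:
    """Map an OpenAI role to an internal message dict (illustrative layout)."""
    if role == "developer":
        # Stored as a system message, remembering the original role.
        return {
            "type": "system",
            "content": content,
            "additional_kwargs": {"__openai_role__": "developer"},
        }
    return {"type": role, "content": content, "additional_kwargs": {}}


def to_openai_role(message: dict) -> str:
    """Recover the role to send back, preferring the stored original."""
    return message["additional_kwargs"].get("__openai_role__", message["type"])
```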

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-12-18 21:54:07 +00:00
William FH
ecee41ab72 fix: Handle response metadata in merge_messages_runs (#28453) 2024-12-02 13:56:23 -08:00
ccurme
1538ee17f9 anthropic[major]: support python 3.13 (#27916)
Last week Anthropic released version 0.39.0 of its python sdk, which
enabled support for Python 3.13. This release deleted a legacy
`client.count_tokens` method, which we currently access during init of
the `Anthropic` LLM. Anthropic has replaced this functionality with the
[client.beta.messages.count_tokens()
API](https://github.com/anthropics/anthropic-sdk-python/pull/726).

To enable support for `anthropic >= 0.39.0` and Python 3.13, here we
drop support for the legacy token counting method, and add support for
the new method via `ChatAnthropic.get_num_tokens_from_messages`.

To fully support the token counting API, we update the signature of
`get_num_tokens_from_messages` to accept tools everywhere.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-11-12 14:31:07 -05:00
William FH
5a2cfb49e0 Support message trimming on single messages (#27729)
Permit trimming message lists of length 1
2024-10-30 04:27:52 +00:00
Bagatur
a4392b070d core[patch]: add convert_to_openai_messages util (#27263)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-16 17:10:10 +00:00