Commit Graph

14 Commits

Author SHA1 Message Date
Nick Hollon
138727c008 perf(core): memoize BaseTool.tool_call_schema subset model and cache model_json_schema (#38073) 2026-06-17 17:17:14 -04:00
Mason Daugherty
f89f4c5afe fix(core): support content block tokens in callbacks (#34739)
Supersedes #34727
Closes #30703

Related:
* langchain-ai/langchain-google#1460
* langchain-ai/langchain-google#1501

Fixing this at the `langchain-core` callback layer instead of
normalizing inside individual provider integrations, so structured
streaming content is preserved consistently.

---

Models are increasingly streaming structured content blocks instead of
plain text tokens. For example, Gemini 3 can stream text as
content-block lists, and Anthropic/tool-use flows can also produce
non-text message content. Today those values already reach
`on_llm_new_token`, but the callback API still advertises `token: str`,
which makes custom callbacks, tracers, and streaming helpers assume
every streamed value is text.

User story: as a LangChain user building a streaming callback for chat
models with tool calls, reasoning/thinking blocks, or provider-specific
structured content, I need `on_llm_new_token` to accept the same content
shape that chat model chunks can actually emit, so my callback can
observe the stream without providers flattening or dropping non-text
data.

Fixing this in `langchain-core` makes the existing runtime behavior
explicit at the shared callback boundary. Normalizing content blocks
inside each provider would duplicate logic, produce inconsistent
behavior across integrations, and in some cases lose required provider
metadata such as Gemini thought signatures.

## Changes

- Update the callback contract so streamed tokens can be either plain
text or structured content blocks
- Carry structured streamed content through tracing and event/log
streaming paths without forcing provider data into text too early
- Keep built-in text-oriented streaming callbacks working by converting
structured tokens only at the display/queue boundary
- Drop the now-incorrect `cast("str", ...)` on streamed content in
`BaseChatModel` so the producer side matches the widened callback
signature instead of asserting a string it doesn't always have (no
runtime change — `cast` is erased)
- Align Anthropic and Mistral content typing with the structured content
shapes already used by chat model messages
- Update callback tests to reflect that not every streamed value is text

## Compatibility

No runtime behavior change: no producer emits anything it wasn't already
emitting, and widening a parameter type is safe for existing callers and
handlers that pass or receive `str`. The one caveat is downstream code
that subclasses a callback handler or tracer and overrides
`on_llm_new_token` with a `token: str` annotation — under strict type
checking that override is now narrower than the base and will be flagged
as incompatible with the supertype. Such code still runs unchanged; the
fix is to widen the annotation to match.
2026-06-10 16:59:08 -04:00
Christophe Bornet
e03d6b80d5 chore(deps): bump mypy to v1.19 and ruff to v1.14 (#34521)
* Set mypy to >=1.19.1,<1.20
* Set ruff to >=0.14.10,<0.15
2025-12-29 18:07:55 -06:00
William FH
1867521d1a feat: Use uuid7 for run ids (#34172)
Co-authored-by: Sydney Runkle <54324534+sydney-runkle@users.noreply.github.com>
Co-authored-by: Sydney Runkle <sydneymarierunkle@gmail.com>
2025-12-03 10:09:10 -08:00
Mason Daugherty
6ea03ab46c style(core): drop python 39 linting target for 3.10 (#33286) 2025-10-05 23:22:34 -04:00
Christophe Bornet
4134b36db8 core: make ruff rule PLW1510 unfixable (#31868)
See
https://github.com/astral-sh/ruff/discussions/17087#discussioncomment-12675815

Tha autofix is misleading: it chooses to add `check=False` to keep the
runtime behavior but in reality it hides the fact that most probably the
user would prefer `check=True`.
2025-07-07 10:28:30 -04:00
Sydney Runkle
59f2c9e737 Tinkering with CodSpeed (#30824)
Fix CI to trigger benchmarks on `run-codspeed-benchmarks` label addition

Reduce scope of async benchmark to save time on CI

Waiting to merge this PR until we figure out how to use walltime on
local runners.
2025-04-15 08:49:09 -04:00
Christophe Bornet
42944f3499 core: Improve mypy config (#30737)
* Cleanup mypy config
* Add mypy `strict` rules except `disallow_any_generics`,
`warn_return_any` and `strict_equality` (TODO)
* Add mypy `strict_byte` rule
* Add mypy support for PEP702 `@deprecated` decorator
* Bump mypy version to 1.15

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-04-11 16:35:13 -04:00
William FH
2803a48661 core[patch]: Share executor for async callbacks run in sync context (#30779)
To avoid having to create ephemeral threads, grab the thread lock, etc.
2025-04-11 10:34:43 -07:00
Christophe Bornet
89f28a24d3 core[lint]: Fix typing in test_async_callbacks (#30788) 2025-04-11 07:26:38 -04:00
Christophe Bornet
dc19d42d37 core: Specify code when ignoring type issue (ruff PGH003) (#30675)
See https://docs.astral.sh/ruff/rules/blanket-type-ignore/
2025-04-10 22:23:52 -04:00
William FH
70532a65f8 Async callback benchmark (#30777) 2025-04-10 15:47:19 -07:00
Sydney Runkle
cd6a83117c Adding more import time benchmarks for langchain-core (#30770)
Plus minor typo fix in `ChatPromptTemplate` case id.
2025-04-10 11:50:12 -04:00
Sydney Runkle
78ec7d886d [performance]: Adding benchmarks for common langchain-core imports (#30747)
The first in a sequence of PRs focusing on improving performance in
core. We're starting with reducing import times for common structures,
hence the benchmarks here.

The benchmark looks a little bit complicated - we have to use a process
so that we don't suffer from Python's import caching system. I tried
doing manual modification of `sys.modules` between runs, but that's
pretty tricky / hacky to get right, hence the subprocess approach.

Motivated by extremely slow baseline for common imports (we're talking
2-5 seconds):

<img width="633" alt="Screenshot 2025-04-09 at 12 48 12 PM"
src="https://github.com/user-attachments/assets/994616fe-1798-404d-bcbe-48ad0eb8a9a0"
/>

Also added a `make benchmark` command to make local runs easy :).
Currently using walltimes so that we can track total time despite using
a manual proces.
2025-04-09 13:00:15 -04:00