PR #35788 added 7 new fields to the `langchain-profiles` CLI output
(`name`, `status`, `release_date`, `last_updated`, `open_weights`,
`attachment`, `temperature`) but didn't update `ModelProfile` in
`langchain-core`. Partner packages like `langchain-aws` that set
`extra="forbid"` on their Pydantic models hit `extra_forbidden`
validation errors when Pydantic encountered undeclared TypedDict keys at
construction time. This adds the missing fields, makes `ModelProfile`
forward-compatible, provides a base-class hook so partners can stop
duplicating model-profile validator boilerplate, migrates all in-repo
partners to the new hook, and adds runtime + CI-time warnings for schema
drift.
## Changes
### `langchain-core`
- Add `__pydantic_config__ = ConfigDict(extra="allow")` to
`ModelProfile` so unknown profile keys pass Pydantic validation even on
models with `extra="forbid"` — forward-compatibility for when the CLI
schema evolves ahead of core
- Declare the 7 missing fields on `ModelProfile`: `name`, `status`,
`release_date`, `last_updated`, `open_weights` (metadata) and
`attachment`, `temperature` (capabilities)
- Add `_warn_unknown_profile_keys()` in `model_profile.py` — emits a
`UserWarning` when a profile dict contains keys not in `ModelProfile`,
suggesting a core upgrade. Wrapped in a bare `except` so introspection
failures never crash model construction
- Add `BaseChatModel._resolve_model_profile()` hook that returns `None`
by default. Partners can override this single method instead of
redefining the full `_set_model_profile` validator — the base validator
calls it automatically
- Add `BaseChatModel._check_profile_keys` as a separate
`model_validator` that calls `_warn_unknown_profile_keys`. Uses a
distinct method name so partner overrides of `_set_model_profile` don't
inadvertently suppress the check
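To illustrate the hook, here is a minimal sketch of how a partner package might override `_resolve_model_profile()` instead of redefining the full validator. The class name, profile data, and import paths below are assumptions for illustration, not code from this PR:

```python
from typing import Optional

from langchain_core.language_models import ModelProfile  # assumed import path
from langchain_core.language_models.chat_models import BaseChatModel


class ChatExample(BaseChatModel):
    """Hypothetical partner chat model; only the override below is the new pattern."""

    model: str = "example-model-1"

    def _resolve_model_profile(self) -> Optional[ModelProfile]:
        # The base `_set_model_profile` validator calls this hook and applies the
        # returned profile; returning None keeps the default (no profile set).
        profiles: dict[str, ModelProfile] = {
            "example-model-1": {"max_input_tokens": 128_000},
        }
        return profiles.get(self.model)
```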
### `langchain-profiles` CLI
- Add `_warn_undeclared_profile_keys()` to the CLI (`cli.py`), called
after merging augmentations in `refresh()` — warns at profile-generation
time (not just runtime) when emitted keys aren't declared in
`ModelProfile`. Gracefully skips if `langchain-core` isn't installed
- Add guard test
`test_model_data_to_profile_keys_subset_of_model_profile` in
model-profiles — feeds a fully-populated model dict to
`_model_data_to_profile()` and asserts every emitted key exists in
`ModelProfile.__annotations__`. CI fails before any release if someone
adds a CLI field without updating the TypedDict
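A sketch of what that guard test could look like; the import paths and the fixture dict are assumptions, while `_model_data_to_profile` and `ModelProfile.__annotations__` are the names used above:

```python
from langchain_core.language_models import ModelProfile  # assumed import path
from langchain_profiles.cli import _model_data_to_profile  # assumed import path

# Stand-in for a fixture with every field the CLI knows how to emit.
FULLY_POPULATED_MODEL_DATA: dict = {}


def test_model_data_to_profile_keys_subset_of_model_profile() -> None:
    profile = _model_data_to_profile(FULLY_POPULATED_MODEL_DATA)
    unknown = set(profile) - set(ModelProfile.__annotations__)
    assert not unknown, f"CLI emits keys not declared on ModelProfile: {sorted(unknown)}"
```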
### Partner packages
- Migrate all 10 in-repo partners to the `_resolve_model_profile()`
hook, replacing duplicated `@model_validator` / `_set_model_profile`
overrides: anthropic, deepseek, fireworks, groq, huggingface, mistralai,
openai (base + azure), openrouter, perplexity, xai
- Anthropic retains custom logic (context-1m beta → `max_input_tokens`
override); all others reduce to a one-liner
- Add `pr_lint.yml` scope for the new `model-profiles` package
Closes #29530
---
Remove a stale BlockBuster allowlist entry in `conftest.py` referencing
`aconfig_with_context` — the function and its containing module
(`langchain_core/beta/runnables/context.py`) were deleted in `fded6c6b1`
(Sep 2025, #32850). Spotted by @antonio-mello-ai in #29530.
Fixes missing `run.metadata.usage_metadata` population in
`LangChainTracer` for real LLM/chat traces following #34414
- Fix extraction to read usage from serialized tracer message shape:
`outputs.generations[*][*].message.kwargs.usage_metadata`
- Remove handling of the non-serialized direct message shape
(`message.usage_metadata`) from the extractor, matching the real tracer
output path
- Clarify tracer docstrings around chat callback naming
(`on_chat_model_start` + shared `on_llm_end`) to reduce ambiguity
## Why
#34414 introduced duplicating usage data into `run.metadata.usage_metadata`,
but the extractor read `message.usage_metadata`.
In the real tracer flow, messages are serialized with `dumpd(...)` during
run completion, so usage metadata lives under
`message.kwargs.usage_metadata`. Because of this mismatch, the duplication
never triggered in real traces.
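For reference, a sketch (not the tracer's actual code) of walking the serialized shape described above:

```python
from typing import Any


def collect_usage_metadata(outputs: dict[str, Any]) -> list[dict[str, Any]]:
    """Collect usage metadata from dumpd-serialized chat generations."""
    collected = []
    for generation_list in outputs.get("generations", []):
        for generation in generation_list:
            message = generation.get("message") or {}
            # After `dumpd(...)`, message fields live under "kwargs".
            usage = (message.get("kwargs") or {}).get("usage_metadata")
            if usage:
                collected.append(usage)
    return collected
```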
It looks scary, but I promise it is not.
Improves documentation consistency across core: primarily updates
docstrings and comments for better formatting, readability, and accuracy,
and adds minor clarifications and formatting improvements to user-facing
documentation.
## Summary
Add XML format option for `get_buffer_string()` to provide unambiguous
message serialization. This fixes role prefix ambiguity when message
content contains strings like "Human:" or "AI:".
Fixes #34786
## Changes
- Add `format="xml"` parameter with proper XML escaping using
`quoteattr()` for attributes
- Add explicit validation for format parameter (raises `ValueError` for
invalid values)
- Add comprehensive tests for XML format edge cases
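A usage sketch, assuming the parameter is spelled `format="xml"` as described; the exact XML output is not shown since it depends on the implementation:

```python
from langchain_core.messages import AIMessage, HumanMessage, get_buffer_string

messages = [
    HumanMessage(content="The transcript contains the literal text 'AI: hello'"),
    AIMessage(content="Understood."),
]

# Prefix-based rendering is ambiguous when content itself contains "Human:" /
# "AI:"; the XML format escapes content so roles stay unambiguous.
print(get_buffer_string(messages, format="xml"))
```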
<img width="1952" height="706" alt="image"
src="https://github.com/user-attachments/assets/1cd6f887-9365-43cf-a532-72d7addd8bad"
/>
<img width="2786" height="776" alt="image"
src="https://github.com/user-attachments/assets/a07b0db0-519c-46d7-b34b-b404237d812b"
/>
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Updates `comma_list` in `libs/core/langchain_core/utils/strings.py` to
accept `Iterable[Any]` instead of `list[Any]`, making the utility more
flexible.
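For example (a small sketch, assuming the implementation still joins stringified items), a generator is now accepted as well as a list:

```python
from langchain_core.utils.strings import comma_list

print(comma_list(["apples", "oranges"]))      # "apples, oranges"
print(comma_list(str(n) for n in range(3)))   # "0, 1, 2"
```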
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
# Add `tool_call_id` to `on_tool_error` event data
## Summary
This PR addresses issue #33597 by adding `tool_call_id` to the
`on_tool_error` callback event data. This enables users to link tool
errors to specific tool calls in stateless agent implementations, which
is essential for building OpenAI-compatible APIs and tracking tool
execution flows.
## Problem
When streaming events using `astream_events` with `version="v2"`, the
`on_tool_error` event only included the error and input data, but lacked
the `tool_call_id`. This made it difficult to:
- Link errors to specific tool calls in stateless agent scenarios
- Implement OpenAI-compatible APIs that require tool call tracking
- Track tool execution flows when `run_id` alone is not sufficient
## Solution
The fix adds `tool_call_id` propagation through the callback chain:
1. **Pass `tool_call_id` to callbacks**: Updated `BaseTool.run()` and
`BaseTool.arun()` to pass `tool_call_id` to both `on_tool_start` and
`on_tool_error` callbacks
2. **Store in event stream handler**: Modified
`_AstreamEventsCallbackHandler` to store `tool_call_id` in run info
during `on_tool_start`
3. **Include in error events**: Updated `on_tool_error` handler to
extract and include `tool_call_id` in the event data
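To illustrate what this propagation enables, here is a hedged sketch of a custom handler that links errors to tool calls; the handler and its storage are illustrative, and only the presence of `tool_call_id` in the callback kwargs is what this PR adds:

```python
from typing import Any
from uuid import UUID

from langchain_core.callbacks import BaseCallbackHandler


class ToolErrorLinker(BaseCallbackHandler):
    """Remember which tool_call_id each tool run belongs to and report failures."""

    def __init__(self) -> None:
        self._tool_call_ids: dict[UUID, Any] = {}

    def on_tool_start(self, serialized, input_str, *, run_id: UUID, **kwargs: Any) -> None:
        # With this PR, `tool_call_id` is forwarded here when it is known.
        self._tool_call_ids[run_id] = kwargs.get("tool_call_id")

    def on_tool_error(self, error: BaseException, *, run_id: UUID, **kwargs: Any) -> None:
        tool_call_id = kwargs.get("tool_call_id") or self._tool_call_ids.get(run_id)
        print(f"Tool call {tool_call_id!r} failed: {error}")
```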
## Changes
- **`libs/core/langchain_core/tools/base.py`**:
- Pass `tool_call_id` to `on_tool_start` in both sync and async methods
- Pass `tool_call_id` to `on_tool_error` when errors occur
- **`libs/core/langchain_core/tracers/event_stream.py`**:
- Store `tool_call_id` in run info during `on_tool_start`
- Extract `tool_call_id` from kwargs or run info in `on_tool_error`
- Include `tool_call_id` in the `on_tool_error` event data
## Testing
The fix was verified by:
1. Direct tool invocation: Confirmed `tool_call_id` appears in
`on_tool_error` event data when calling tools directly
2. Agent integration: Tested with `create_agent` to ensure
`tool_call_id` is present in error events during agent execution
```python
# Example verification (agent built via `create_agent` with a tool that raises)
async for event in agent.astream_events(
    {"messages": "Please demonstrate a tool error"},
    version="v2",
):
    if event["event"] == "on_tool_error":
        assert "tool_call_id" in event["data"]  # ✓ Now passes
        print(event["data"]["tool_call_id"])
```
## Backward Compatibility
- ✅ Fully backward compatible: `tool_call_id` is optional (can be
`None`)
- ✅ No breaking changes: All changes are additive
- ✅ Existing code continues to work without modification
## Related Issues
Fixes #33597
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
## Changes
I have fixed the issue where a generic and misleading error message was
displayed when a JSON schema was missing the top-level `title` key.
[Fix: Improve error message for missing title in JSON schema
functions](https://github.com/Bhavesh007Sharma/langchain/tree/fix-json-schema-title-error)
**File modified:** `libs/core/langchain_core/utils/function_calling.py`

I updated the `convert_to_openai_function` validation logic to specifically
check for `dict` inputs that look like schemas (`type` or `properties` keys
present) but are missing the `title` key.
```python
# Before (Generic Error)
raise ValueError(
    f"Unsupported function\n\n{function}\n\nFunctions must be passed in"
    " as Dict, pydantic.BaseModel, or Callable. If they're a dict they must"
    " either be in OpenAI function format or valid JSON schema with top-level"
    " 'title' and 'description' keys."
)

# After (Specific Error)
if isinstance(function, dict) and ("type" in function or "properties" in function):
    msg = (
        "Unsupported function\n\nTo use a JSON schema as a function, "
        "it must have a top-level 'title' key to be used as the function name."
    )
    raise ValueError(msg)
```
### Verification Results

**Automated tests:** I created a reproduction script, `reproduce_issue.py`,
to confirm the behavior.
- Before the fix: the script would have raised the generic "Unsupported
function" error, claiming `description` was also required.
- After the fix: the script now confirms that the new, specific error
message is raised when `title` is missing.

(Note: verification was performed by inspecting the code logic and running
a lightweight reproduction script locally, as full-suite verification had
environment dependency issues.)
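A hedged sketch of what such a reproduction could look like; the original `reproduce_issue.py` is not part of the diff, and the schema below is illustrative:

```python
import pytest

from langchain_core.utils.function_calling import convert_to_openai_function

# Looks like a JSON schema (has "type"/"properties") but lacks a "title" key.
schema_missing_title = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
}

with pytest.raises(ValueError, match="top-level 'title' key"):
    convert_to_openai_function(schema_missing_title)
```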
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
**Description:**
*Closes [#33883](https://github.com/langchain-ai/langchain/issues/33883)*
Chat model cache keys are generated by serializing messages via
`dumps(messages)`. The optional `BaseMessage.id` field (a UUID used
solely for tracing/threading) is included in this serialization, causing
functionally identical messages to produce different cache keys. This
results in repeated API calls, cache bloat, and degraded performance in
production workloads (e.g., agents, RAG chains, long conversations).
This change normalizes messages **only for cache key generation** by
stripping the nonsemantic `id` field using Pydantic V2’s
`model_copy(update={"id": None})`. The normalization is applied in both
synchronous and asynchronous cache paths (`_generate_with_cache` /
`_agenerate_with_cache`) immediately before `dumps()`.
```python
normalized_messages = [
    msg.model_copy(update={"id": None})
    if getattr(msg, "id", None) is not None
    else msg
    for msg in messages
]
prompt = dumps(normalized_messages)
```
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
### Description
`ChatPromptTemplate.from_messages` supports multiple tuple formats for
defining message templates. One documented format is `(message class,
template)`, which allows users to specify the message type using the
class directly:
```python
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.prompts import ChatPromptTemplate

ChatPromptTemplate.from_messages([
    (SystemMessage, "You are a helpful assistant named {name}."),
    (HumanMessage, "{input}"),
])
```
However, this syntax was broken. Passing a tuple like `(HumanMessage,
"{input}")` would raise a Pydantic validation error because the
conversion logic in `_convert_to_message_template` didn't handle
`BaseMessage` subclasses—it only recognized string-based role
identifiers like `"human"` or `"system"`.
This PR adds the missing branch to detect when the first element of a
tuple is a message class (by checking for the `type` class attribute)
and routes it through `_create_template_from_message_type`, which
already knows how to create the appropriate `MessagePromptTemplate` for
each message type.
### Changes
- Updated `_convert_to_message_template` to properly support `(message
class, template)` tuples
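A usage example of the now-supported syntax (values are illustrative):

```python
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    (SystemMessage, "You are a helpful assistant named {name}."),
    (HumanMessage, "{input}"),
])

messages = prompt.invoke({"name": "Ada", "input": "What is LangChain?"}).to_messages()
# messages[0] is a SystemMessage and messages[1] a HumanMessage,
# with the template variables substituted.
```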
### Testing
Added 16 comprehensive unit tests covering:
- Basic usage with `HumanMessage`, `AIMessage`, and `SystemMessage`
classes
- Integration with `invoke()` method
- Mixed syntax (message class tuples alongside string tuples)
- Multiple template variables
- Edge cases: empty templates, static text (no variables)
- Correct extraction of `input_variables`
- Partial variables support
- Combination with `MessagesPlaceholder`
- Mustache template format
- Template operations: `append()`, `extend()`, concatenation, and
slicing
- Special characters and unicode in templates
### Issue
Fixes #33791
### Dependencies
None
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>