It looks scary, but I promise it is not.
Improve documentation consistency across core. Primarily update
docstrings and comments for better formatting, readability, and
accuracy, and add minor clarifications and formatting improvements to
user-facing documentation.
## Summary
Add XML format option for `get_buffer_string()` to provide unambiguous
message serialization. This fixes role prefix ambiguity when message
content contains strings like "Human:" or "AI:".
Fixes #34786
## Changes
- Add `format="xml"` parameter with proper XML escaping using
`quoteattr()` for attributes
- Add explicit validation for format parameter (raises `ValueError` for
invalid values)
- Add comprehensive tests for XML format edge cases
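To illustrate why XML escaping removes the role-prefix ambiguity, here is a minimal sketch. It is not the `get_buffer_string()` implementation: the `render_messages` helper, the `<message type=...>` layout, and the `(role, content)` tuple input are assumptions made for the example. Only `escape()` and `quoteattr()` from the standard library are real APIs.

```python
from xml.sax.saxutils import escape, quoteattr


def render_messages(messages, format="prefix"):
    """Hypothetical helper: serialize (role, content) pairs."""
    # Explicit validation of the format parameter, as described above.
    if format not in ("prefix", "xml"):
        raise ValueError(f"Unsupported format: {format!r}")

    lines = []
    for role, content in messages:
        if format == "xml":
            # quoteattr() quotes and escapes the attribute value;
            # escape() handles &, <, and > in the element body.
            lines.append(
                f"<message type={quoteattr(role)}>{escape(content)}</message>"
            )
        else:
            # Prefix format: content containing "AI:" is ambiguous here.
            lines.append(f"{role}: {content}")
    return "\n".join(lines)


xml_out = render_messages([("human", "AI: <b>hi</b> & bye")], format="xml")
```

In the prefix format the same message would render as `human: AI: <b>hi</b> & bye`, which a downstream parser could misread as two turns.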
<img width="1952" height="706" alt="image"
src="https://github.com/user-attachments/assets/1cd6f887-9365-43cf-a532-72d7addd8bad"
/>
<img width="2786" height="776" alt="image"
src="https://github.com/user-attachments/assets/a07b0db0-519c-46d7-b34b-b404237d812b"
/>
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Updates `comma_list` in `libs/core/langchain_core/utils/strings.py` to
accept `Iterable[Any]` instead of `list[Any]`, making the utility more
flexible.
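As a rough sketch of what the wider signature allows (the body below is an assumption for illustration, not a copy of the library code), a function typed with `Iterable[Any]` can take generators, tuples, and sets as well as lists:

```python
from collections.abc import Iterable
from typing import Any


def comma_list(items: Iterable[Any]) -> str:
    """Join any iterable of values into a comma-separated string."""
    return ", ".join(str(item) for item in items)


from_generator = comma_list(n * n for n in range(1, 4))
from_tuple = comma_list(("a", "b"))
```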
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
# Add `tool_call_id` to `on_tool_error` event data
## Summary
This PR addresses issue #33597 by adding `tool_call_id` to the
`on_tool_error` callback event data. This enables users to link tool
errors to specific tool calls in stateless agent implementations, which
is essential for building OpenAI-compatible APIs and tracking tool
execution flows.
## Problem
When streaming events using `astream_events` with `version="v2"`, the
`on_tool_error` event only included the error and input data, but lacked
the `tool_call_id`. This made it difficult to:
- Link errors to specific tool calls in stateless agent scenarios
- Implement OpenAI-compatible APIs that require tool call tracking
- Track tool execution flows when using `run_id` is not sufficient
## Solution
The fix adds `tool_call_id` propagation through the callback chain:
1. **Pass `tool_call_id` to callbacks**: Updated `BaseTool.run()` and
`BaseTool.arun()` to pass `tool_call_id` to both `on_tool_start` and
`on_tool_error` callbacks
2. **Store in event stream handler**: Modified
`_AstreamEventsCallbackHandler` to store `tool_call_id` in run info
during `on_tool_start`
3. **Include in error events**: Updated `on_tool_error` handler to
extract and include `tool_call_id` in the event data
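The three steps above can be sketched with a small stand-in handler. `ToolEventRecorder` and its method signatures are hypothetical and much simpler than `_AstreamEventsCallbackHandler`; the sketch only shows the store-then-look-up pattern.

```python
from typing import Any, Optional
from uuid import UUID, uuid4


class ToolEventRecorder:
    """Hypothetical stand-in for an event-stream callback handler."""

    def __init__(self) -> None:
        self.run_info: dict[UUID, dict[str, Any]] = {}
        self.events: list[dict[str, Any]] = []

    def on_tool_start(
        self, run_id: UUID, inputs: Any, tool_call_id: Optional[str] = None
    ) -> None:
        # Step 2: remember the tool_call_id for this run.
        self.run_info[run_id] = {"inputs": inputs, "tool_call_id": tool_call_id}

    def on_tool_error(self, run_id: UUID, error: BaseException, **kwargs: Any) -> None:
        info = self.run_info.pop(run_id, {})
        # Step 3: prefer the id passed by the caller, else the stored one.
        tool_call_id = kwargs.get("tool_call_id") or info.get("tool_call_id")
        self.events.append(
            {
                "event": "on_tool_error",
                "data": {
                    "error": error,
                    "input": info.get("inputs"),
                    "tool_call_id": tool_call_id,
                },
            }
        )


recorder = ToolEventRecorder()
run_id = uuid4()
recorder.on_tool_start(run_id, {"x": 1}, tool_call_id="call_1")
recorder.on_tool_error(run_id, ValueError("boom"))
```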
## Changes
- **`libs/core/langchain_core/tools/base.py`**:
- Pass `tool_call_id` to `on_tool_start` in both sync and async methods
- Pass `tool_call_id` to `on_tool_error` when errors occur
- **`libs/core/langchain_core/tracers/event_stream.py`**:
- Store `tool_call_id` in run info during `on_tool_start`
- Extract `tool_call_id` from kwargs or run info in `on_tool_error`
- Include `tool_call_id` in the `on_tool_error` event data
## Testing
The fix was verified by:
1. Direct tool invocation: Confirmed `tool_call_id` appears in
`on_tool_error` event data when calling tools directly
2. Agent integration: Tested with `create_agent` to ensure
`tool_call_id` is present in error events during agent execution
```python
# Example verification
async for event in agent.astream_events(
    {"messages": "Please demonstrate a tool error"},
    version="v2",
):
    if event["event"] == "on_tool_error":
        assert "tool_call_id" in event["data"]  # ✓ Now passes
        print(event["data"]["tool_call_id"])
```
## Backward Compatibility
- ✅ Fully backward compatible: `tool_call_id` is optional (can be
`None`)
- ✅ No breaking changes: All changes are additive
- ✅ Existing code continues to work without modification
## Related Issues
Fixes #33597
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
## Changes Created
I fixed an issue where a generic and misleading error message was
displayed when a JSON schema was missing the top-level `title` key.
[Fix: Improve error message for missing title in JSON schema
functions](https://github.com/Bhavesh007Sharma/langchain/tree/fix-json-schema-title-error)
File modified: `libs/core/langchain_core/utils/function_calling.py`
I updated the `convert_to_openai_function` validation logic to
specifically check for `dict` inputs that look like schemas (`type` or
`properties` keys present) but are missing the `title` key.
Before (generic error):

    raise ValueError(
        f"Unsupported function\n\n{function}\n\nFunctions must be passed in"
        " as Dict, pydantic.BaseModel, or Callable. If they're a dict they must"
        " either be in OpenAI function format or valid JSON schema with top-level"
        " 'title' and 'description' keys."
    )

After (specific error):

    if isinstance(function, dict) and ("type" in function or "properties" in function):
        msg = (
            "Unsupported function\n\nTo use a JSON schema as a function, "
            "it must have a top-level 'title' key to be used as the function name."
        )
        raise ValueError(msg)

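For readers who want to try the check in isolation, here is a self-contained sketch. `require_schema_title` is a hypothetical helper, not part of `convert_to_openai_function`, and the explicit `"title" not in function` test is an assumption based on the prose description of the fix.

```python
from typing import Optional


def require_schema_title(function: dict) -> Optional[str]:
    """Hypothetical helper: reject schema-like dicts that lack a title."""
    looks_like_schema = "type" in function or "properties" in function
    if looks_like_schema and "title" not in function:
        msg = (
            "Unsupported function\n\nTo use a JSON schema as a function, "
            "it must have a top-level 'title' key to be used as the function name."
        )
        raise ValueError(msg)
    return function.get("title")


try:
    require_schema_title({"type": "object", "properties": {}})
    error_text = ""
except ValueError as exc:
    error_text = str(exc)
```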
## Verification Results
### Automated Tests
I created a reproduction script, `reproduce_issue.py`, to confirm the
behavior.
- **Before fix:** The script would have raised the generic "Unsupported
function" error claiming `description` was also required.
- **After fix:** The script now confirms that the new, specific error
message is raised when `title` is missing.
(Note: Verification was performed by inspecting the code logic and
running a lightweight reproduction script locally, as full suite
verification had environment dependency issues.)
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
**Description:**
*Closes [#33883](https://github.com/langchain-ai/langchain/issues/33883)*
Chat model cache keys are generated by serializing messages via
`dumps(messages)`. The optional `BaseMessage.id` field (a UUID used
solely for tracing/threading) is included in this serialization, causing
functionally identical messages to produce different cache keys. This
results in repeated API calls, cache bloat, and degraded performance in
production workloads (e.g., agents, RAG chains, long conversations).
This change normalizes messages **only for cache key generation** by
stripping the nonsemantic `id` field using Pydantic V2’s
`model_copy(update={"id": None})`. The normalization is applied in both
synchronous and asynchronous cache paths (`_generate_with_cache` /
`_agenerate_with_cache`) immediately before `dumps()`.
```python
normalized_messages = [
    msg.model_copy(update={"id": None})
    if getattr(msg, "id", None) is not None
    else msg
    for msg in messages
]
prompt = dumps(normalized_messages)
```
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
### Description
`ChatPromptTemplate.from_messages` supports multiple tuple formats for
defining message templates. One documented format is `(message class,
template)`, which allows users to specify the message type using the
class directly:
```python
ChatPromptTemplate.from_messages([
(SystemMessage, "You are a helpful assistant named {name}."),
(HumanMessage, "{input}"),
])
```
However, this syntax was broken. Passing a tuple like `(HumanMessage,
"{input}")` would raise a Pydantic validation error because the
conversion logic in `_convert_to_message_template` didn't handle
`BaseMessage` subclasses—it only recognized string-based role
identifiers like `"human"` or `"system"`.
This PR adds the missing branch to detect when the first element of a
tuple is a message class (by checking for the `type` class attribute)
and routes it through `_create_template_from_message_type`, which
already knows how to create the appropriate `MessagePromptTemplate` for
each message type.
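The added branch can be pictured roughly as follows. This is a sketch with stand-in classes; `resolve_role` is a hypothetical name, and the real code routes through `_create_template_from_message_type` rather than returning a string.

```python
class BaseMessage:
    """Stand-in for the real message base class."""

    type: str = "base"


class HumanMessage(BaseMessage):
    type = "human"


class SystemMessage(BaseMessage):
    type = "system"


def resolve_role(head) -> str:
    """Hypothetical helper: map the first tuple element to a role string."""
    if isinstance(head, str):
        # Existing behaviour: string role identifiers such as "human".
        return head
    if isinstance(head, type) and isinstance(getattr(head, "type", None), str):
        # New branch: a message class carrying a `type` class attribute.
        return head.type
    raise ValueError(f"Unsupported message type: {head!r}")


roles = [resolve_role(h) for h in (SystemMessage, HumanMessage, "ai")]
```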
### Changes
- Updated `_convert_to_message_template` to properly support `(message
class, template)` tuples
### Testing
Added 16 comprehensive unit tests covering:
- Basic usage with `HumanMessage`, `AIMessage`, and `SystemMessage`
classes
- Integration with `invoke()` method
- Mixed syntax (message class tuples alongside string tuples)
- Multiple template variables
- Edge cases: empty templates, static text (no variables)
- Correct extraction of `input_variables`
- Partial variables support
- Combination with `MessagesPlaceholder`
- Mustache template format
- Template operations: `append()`, `extend()`, concatenation, and
slicing
- Special characters and unicode in templates
### Issue
Fixes #33791
### Dependencies
None
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
This PR adds a regression test covering the JSON Schema `$ref` pattern
found in MCP-style schemas, where a `$ref` points into a list-based
structure such as:
`#/properties/body/anyOf/1/properties/Message/properties/bccRecipients/items`
This pattern historically failed due to incorrect handling of numeric
list components in `_retrieve_ref`. The underlying bug has since been
fixed, and this test ensures coverage so we don't regress on list-index
`$ref` resolution.
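To make the list-index case concrete, here is a minimal resolver sketch. It is not the library's `_retrieve_ref`; the `retrieve_ref` function below is an illustration that ignores JSON Pointer escapes (`~0`, `~1`) and remote references.

```python
from typing import Any


def retrieve_ref(ref: str, schema: dict) -> Any:
    """Resolve a local '#/...' reference, indexing into lists by number."""
    if not ref.startswith("#/"):
        raise ValueError(f"Only local references are supported: {ref!r}")

    node: Any = schema
    for component in ref[2:].split("/"):
        if isinstance(node, list):
            # Numeric components such as the "1" in anyOf/1 index the list.
            node = node[int(component)]
        elif isinstance(node, dict):
            node = node[component]
        else:
            raise KeyError(f"Cannot descend into {component!r}")
    return node


schema = {
    "properties": {
        "body": {
            "anyOf": [
                {"type": "null"},
                {
                    "properties": {
                        "Message": {
                            "properties": {
                                "bccRecipients": {"items": {"type": "string"}}
                            }
                        }
                    }
                },
            ]
        }
    }
}
resolved = retrieve_ref(
    "#/properties/body/anyOf/1/properties/Message/properties/bccRecipients/items",
    schema,
)
```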
The new test (`test_dereference_refs_list_index_items_ref_mcp_like`)
verifies:
- correct traversal into `anyOf[1]`
- proper dereferencing of `items.$ref`
- no errors thrown
- `ccRecipients.items` is identical to the resolved schema of
`bccRecipients.items`
No code changes are included, just the one test — this PR adds coverage
to preserve the expected
behavior and documents support for this real-world MCP schema pattern.
Related to #32012.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
## Description
Fixed `BaseCallbackManager.merge()` method to correctly preserve the
distinction between `handlers` and `inheritable_handlers` during merge
operations.
Previously, the merge method was using `add_handler()` which incorrectly
added handlers to both lists when `inherit=True`, causing
cross-contamination between regular and inheritable handlers.
The fix directly passes the combined handler lists to the constructor
instead of using `add_handler()`, ensuring proper separation is
maintained.
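A simplified picture of the constructor-based merge, using a hypothetical `HandlerRegistry` class in place of `BaseCallbackManager` (the real class has more fields and options):

```python
class HandlerRegistry:
    """Hypothetical stand-in with the two handler lists discussed above."""

    def __init__(self, handlers=None, inheritable_handlers=None):
        self.handlers = list(handlers or [])
        self.inheritable_handlers = list(inheritable_handlers or [])

    def merge(self, other: "HandlerRegistry") -> "HandlerRegistry":
        # Combine each list separately and pass both to the constructor,
        # so nothing leaks from one list into the other.
        return HandlerRegistry(
            handlers=self.handlers + other.handlers,
            inheritable_handlers=self.inheritable_handlers
            + other.inheritable_handlers,
        )


merged = HandlerRegistry(["h1"], ["i1"]).merge(HandlerRegistry(["h2"], []))
```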
## Issue
Fixes #32028
## Dependencies
None
## Testing
- Modified existing test `test_merge_preserves_handler_distinction()` to
verify handlers remain properly separated after merge
## Checklist
- [x] **Breaking Changes**: No breaking changes - only fixes incorrect
behavior
- [x] **Type Hints**: All functions have complete type annotations
- [x] **Tests**: Fix is fully tested with existing unit test
- [x] **Security**: No security implications
- [x] **Documentation**: No documentation changes needed - bug fix only
- [x] **Code Quality**: Passes lint and format checks
- [x] **Commit Message**: Follows Conventional Commits format
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
* Fixed where possible
* Used `cast` when not possible to fix
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
## Problem
The `draw_mermaid_png()` function fails with HTTP 400 when using named
background colors like `white`. This is because named colors get
prefixed with `!` (e.g., `!white`) but this special character is not
URL-encoded before being added to the API URL.
As reported in #34444, the URL parameter `bgColor=!white` causes
mermaid.ink to return a 400 Bad Request error.
## Solution
URL-encode the `background_color` parameter using `urllib.parse.quote()`
before constructing the API URL. This ensures special characters like
`!` are properly encoded as `%21`.
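The encoding step can be checked in isolation. The `bgcolor_query_param` helper below is a hypothetical wrapper written for this example; only `urllib.parse.quote` and the `bgColor` parameter name come from the description above.

```python
from urllib.parse import quote


def bgcolor_query_param(background_color: str) -> str:
    """Hypothetical helper: build the bgColor query fragment."""
    # safe="" means no reserved characters are left unescaped.
    return f"bgColor={quote(str(background_color), safe='')}"


named = bgcolor_query_param("!white")
```

With `safe=""`, the `!` prefix is percent-encoded, while plain alphanumeric values pass through unchanged.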
## Changes
- Added `import urllib.parse`
- URL-encode `background_color` value with
`urllib.parse.quote(str(background_color), safe="")`
- Added 2 unit tests:
- `test_mermaid_bgcolor_url_encoding`: Verifies named colors are
properly encoded
- `test_mermaid_bgcolor_hex_not_encoded`: Verifies hex colors work
correctly
## Testing
```bash
pytest tests/unit_tests/runnables/test_graph.py::test_mermaid_bgcolor_url_encoding -v
pytest tests/unit_tests/runnables/test_graph.py::test_mermaid_bgcolor_hex_not_encoded -v
```
Both tests pass.
Fixes #34444
---
*This contribution was made with AI assistance (Claude).*
Co-authored-by: Mr-Neutr0n <mrneutron@users.noreply.github.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
## Summary
Fixes #34247
When using `Annotated[type, Field(description="...")]` syntax with the
`@tool` decorator, field descriptions were being lost during schema
generation. The `_get_annotation_description()` function only checked
for string annotations but not for Pydantic `FieldInfo` objects.
## Changes
- Extended `_get_annotation_description()` to also extract descriptions
from `FieldInfo` objects within `Annotated` types
- Added import for `pydantic.fields.FieldInfo`
- Added unit test to verify `Field(description=...)` is preserved
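The extraction logic can be sketched without Pydantic. `FieldInfoStandIn` below is a hypothetical placeholder for `pydantic.fields.FieldInfo`, and `annotation_description` is an illustrative function, not the body of `_get_annotation_description()`.

```python
from dataclasses import dataclass
from typing import Annotated, Any, Optional, get_args, get_origin


@dataclass(frozen=True)
class FieldInfoStandIn:
    """Hypothetical placeholder for pydantic.fields.FieldInfo."""

    description: Optional[str] = None


def annotation_description(annotation: Any) -> Optional[str]:
    """Return the first description found in an Annotated type's metadata."""
    if get_origin(annotation) is not Annotated:
        return None
    # get_args() yields (base_type, *metadata); skip the base type.
    for meta in get_args(annotation)[1:]:
        if isinstance(meta, str):
            return meta  # existing string-annotation style
        if isinstance(meta, FieldInfoStandIn) and meta.description:
            return meta.description  # Field(description=...) style
    return None


from_string = annotation_description(Annotated[str, "The research topic"])
from_field = annotation_description(
    Annotated[str, FieldInfoStandIn(description="The research topic")]
)
```

The explicit `isinstance` check mirrors the reasoning in the PR: matching on any object with a `description` attribute could pick up unintended metadata.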
## Why this approach
The fix is minimal and targeted - it extends the existing description
extraction logic rather than restructuring the schema generation. This
maintains backward compatibility while supporting both annotation
styles:
```python
# Both now work correctly:
topic: Annotated[str, "The research topic"] # existing
topic: Annotated[str, Field(description="...")] # now fixed
```
## Known limitation
This fix only handles `pydantic.fields.FieldInfo` (Pydantic v2). The v1
compatibility layer (`pydantic.v1.fields.FieldInfo`) is a different
class and will not have descriptions extracted. This is intentional:
- Pydantic v1 is deprecated; users should migrate to v2
- The v1 compat layer exists for legacy model migration, not new tool
definitions
- Duck-typing on `description` attribute could match unintended objects
If v1 `Field` support is needed, it can be addressed in a follow-up PR
with explicit handling.
## Testing
- Added `test_tool_field_description_preserved()` covering required and
optional params
- Verified existing `test_tool_annotated_descriptions` still passes
- Lint and type checks pass
---
> [!NOTE]
> This PR was developed with AI agent assistance (Factory/Droid).
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
## Summary
- Fixes issue where Pydantic default values from `args_schema` were not
passed to tool functions when the caller omits optional arguments
- Modified `_parse_input()` in `libs/core/langchain_core/tools/base.py`
to include fields with non-None defaults
- Added unit tests to verify default args behavior for both sync and
async tools
## Problem
When a tool has an `args_schema` with default values:
```python
from pydantic import BaseModel, Field
from langchain_core.tools import tool

class SearchArgs(BaseModel):
    query: str = Field(..., description="Search query")
    page: int = Field(default=1, description="Page number")
    size: int = Field(default=10, description="Results per page")

@tool("search", args_schema=SearchArgs)
def search_tool(query: str, page: int, size: int) -> str:
    return f"query={query}, page={page}, size={size}"

# This threw: TypeError: search_tool() missing 2 required positional arguments
search_tool.invoke({"query": "test"})
```
The defaults from `args_schema` were being discarded because
`_parse_input()` filtered validated results to only include keys from
the original input.
## Solution
Changed the filtering logic to:
1. Include all fields that were in the original input (validated)
2. Also include fields with non-None defaults from the Pydantic schema
This applies user-defined defaults (like `Field(default=1)`) while
excluding synthetic fields from `*args`/`**kwargs` which have
`default=None`.
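The two filtering rules can be expressed over plain dictionaries. `select_tool_args` and its three arguments are hypothetical names chosen for this sketch; the real `_parse_input()` works on the validated Pydantic model rather than on dicts.

```python
from typing import Any


def select_tool_args(
    original_input: dict[str, Any],
    validated: dict[str, Any],
    field_defaults: dict[str, Any],
) -> dict[str, Any]:
    """Keep caller-supplied fields plus fields with a non-None default."""
    selected: dict[str, Any] = {}
    for name, value in validated.items():
        if name in original_input:
            selected[name] = value  # rule 1: provided by the caller
        elif field_defaults.get(name) is not None:
            selected[name] = value  # rule 2: user-defined default
        # Fields whose default is None (e.g. synthetic *args/**kwargs
        # placeholders) are dropped.
    return selected


tool_args = select_tool_args(
    original_input={"query": "test"},
    validated={"query": "test", "page": 1, "size": 10, "kwargs": None},
    field_defaults={"page": 1, "size": 10, "kwargs": None},
)
```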
## Test plan
- [x] Added `test_tool_args_schema_default_values` - tests sync tool
with defaults
- [x] Added `test_tool_args_schema_default_values_async` - tests async
tool with defaults
- [x] All existing tests pass (150 passed, 4 skipped)
- [x] Lint passes
Fixes #34384
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
## Summary
Fixes #33970
`get_buffer_string` was only checking for the deprecated `function_call`
field in `additional_kwargs`, which modern LLM providers no longer
return. This fix updates the function to check for the modern
`tool_calls` field first, falling back to `function_call` for legacy
compatibility.
## Changes
- Check `AIMessage.tool_calls` first (modern standard)
- Fall back to `additional_kwargs["function_call"]` (legacy support)
- Added 3 unit tests covering tool_calls, empty content, and precedence
behavior
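The precedence rule can be sketched as a small function. `tool_call_suffix` is a hypothetical helper, and serializing with `json.dumps` is an assumption made for the example; the exact string `get_buffer_string` produces may differ.

```python
import json
from typing import Any


def tool_call_suffix(
    tool_calls: list[dict[str, Any]], additional_kwargs: dict[str, Any]
) -> str:
    """Prefer modern tool_calls; fall back to the legacy function_call."""
    if tool_calls:
        return json.dumps(tool_calls)
    if "function_call" in additional_kwargs:
        return json.dumps(additional_kwargs["function_call"])
    return ""


modern = tool_call_suffix([{"name": "search"}], {"function_call": {"name": "old"}})
legacy = tool_call_suffix([], {"function_call": {"name": "old"}})
```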
## Testing
```python
# Before fix: tool_calls info was lost
msg = AIMessage(content="Hi", tool_calls=[{"name": "search", ...}])
get_buffer_string([msg]) # "AI: Hi" (no tool info)
# After fix: tool_calls are included
get_buffer_string([msg]) # "AI: Hi[{\"name\": \"search\", ...}]"
```
- All existing `get_buffer_string` tests pass
- Legacy `function_call` behavior preserved
---
> [!NOTE]
> This PR was developed with AI agent assistance (Factory/Droid).
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Adds [PEP 702](https://peps.python.org/pep-0702/) `__deprecated__`
attribute support to the `@deprecated` decorator, enabling IDE and type
checker integration for deprecation warnings.
---
PEP 702 introduced the `__deprecated__` attribute convention, which type
checkers (Pyright, mypy) and IDEs (VS Code with Pylance, PyCharm) can
use to surface deprecations directly in the editor. This PR sets
`__deprecated__` on all objects decorated with `@deprecated`.
With this change, developers using supported IDEs will see:
- **Strikethrough text** on deprecated symbols
- **Hover messages** showing the deprecation reason and suggested
alternative
- **Diagnostic warnings** during type checking (e.g., `pyright`, `mypy`)
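As a rough illustration of the attribute convention, here is a minimal decorator sketch. It is not the library's `@deprecated` decorator, which takes more parameters; the `deprecated` function below is a simplified stand-in that sets `__deprecated__` and emits a runtime warning.

```python
import functools
import warnings


def deprecated(message: str):
    """Simplified stand-in: warn at call time and set __deprecated__."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)

        # PEP 702 convention: tools can read the message from this attribute.
        wrapper.__deprecated__ = message
        return wrapper

    return decorator


@deprecated("Use new_api instead.")
def old_api() -> int:
    return 1


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_api()
```

Note that static checkers recognize deprecations through `typing_extensions.deprecated` / `warnings.deprecated`; a custom decorator like this one only covers the runtime attribute.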
### References
- [PEP 702 – Marking deprecations using the type
system](https://peps.python.org/pep-0702/)
- [`typing.deprecated`
specification](https://typing.python.org/en/latest/spec/directives.html#deprecated)