Updates `comma_list` in `libs/core/langchain_core/utils/strings.py` to
accept `Iterable[Any]` instead of `list[Any]`, making the utility more
flexible.
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Changes Created
I have fixed the issue where a generic and misleading error message was
displayed when a JSON schema was missing the top-level
title
key.
[Fix: Improve error message for missing title in JSON schema
functions](https://github.com/Bhavesh007Sharma/langchain/tree/fix-json-schema-title-error)
File Modified:
libs/core/langchain_core/utils/function_calling.py
I updated the
convert_to_openai_function
validation logic to specifically check for
dict
inputs that look like schemas (
type
or
properties
keys present) but are missing the
title
key.
# Before (Generic Error)
raise ValueError(
f"Unsupported function\n\n{function}\n\nFunctions must be passed in"
" as Dict, pydantic.BaseModel, or Callable. If they're a dict they must"
" either be in OpenAI function format or valid JSON schema with
top-level"
" 'title' and 'description' keys."
)
# After (Specific Error)
if isinstance(function, dict) and ("type" in function or "properties" in
function):
msg = (
"Unsupported function\n\nTo use a JSON schema as a function, "
"it must have a top-level 'title' key to be used as the function name."
)
raise ValueError(msg)
Verification Results
Automated Tests
I created a reproduction script
reproduce_issue.py
to confirm the behavior.
Before Fix: The script would have raised the generic "Unsupported
function" error claiming description was also required.
After Fix: The script now confirms that the new, specific error message
is raised when
title
is missing.
(Note: Verification was performed by inspecting the code logic and
running a lightweight reproduction script locally, as full suite
verification had environment dependency issues.)
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
This PR adds a regression test covering the JSON Schema `$ref` pattern
found in
MCP-style schemas, where a `$ref` points into a list-based structure
such as:
#/properties/body/anyOf/1/properties/Message/properties/bccRecipients/items
This pattern historically failed due to incorrect handling of numeric
list
components in `_retrieve_ref`. The underlying bug has since been fixed,
and
this test ensures coverage so we don't regress on list-index `$ref`
resolution.
The new test (`test_dereference_refs_list_index_items_ref_mcp_like`)
verifies:
- correct traversal into `anyOf[1]`
- proper dereferencing of `items.$ref`
- no errors thrown
- `ccRecipients.items` is identical to the resolved schema of
`bccRecipients.items`
No code changes are included, just the one test — this PR adds coverage
to preserve the expected
behavior and documents support for this real-world MCP schema pattern.
Related to #32012.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
* Fix detection of support of context in `asyncio.create_task`
* Fix: in Python 3.14 `asyncio.get_event_loop()` raises an exception if
there's no running loop
* Bump pydantic to version 2.12
* Skips tests with pydantic v1 models as they are not supported with
Python 3.14
* Run core tests with Python 3.14 in CI.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Sydney Runkle <54324534+sydney-runkle@users.noreply.github.com>
Removed:
- `libs/core/langchain_core/chat_history.py`: `add_user_message` and
`add_ai_message` in favor of `add_messages` and `aadd_messages`
- `libs/core/langchain_core/language_models/base.py`: `predict`,
`predict_messages`, and async versions in favor of `invoke`. removed
`_all_required_field_names` since it was a wrapper on
`get_pydantic_field_names`
- `libs/core/langchain_core/language_models/chat_models.py`:
`callback_manager` param in favor of `callbacks`. `__call__` and
`call_as_llm` method in favor of `invoke`
- `libs/core/langchain_core/language_models/llms.py`: `callback_manager`
param in favor of `callbacks`. `__call__`, `predict`, `apredict`, and
`apredict_messages` methods in favor of `invoke`
- `libs/core/langchain_core/prompts/chat.py`: `from_role_strings` and
`from_strings` in favor of `from_messages`
- `libs/core/langchain_core/prompts/pipeline.py`: removed
`PipelinePromptTemplate`
- `libs/core/langchain_core/prompts/prompt.py`: `input_variables` param
on `from_file` as it wasn't used
- `libs/core/langchain_core/tools/base.py`: `callback_manager` param in
favor of `callbacks`
- `libs/core/langchain_core/tracers/context.py`: `tracing_enabled` in
favor of `tracing_enabled_v2`
- `libs/core/langchain_core/tracers/langchain_v1.py`: entire module
- `libs/core/langchain_core/utils/loading.py`: entire module,
`try_load_from_hub`
- `libs/core/langchain_core/vectorstores/in_memory.py`: `upsert` in
favor of `add_documents`
- `libs/standard-tests/langchain_tests/integration_tests/chat_models.py`
and `libs/standard-tests/langchain_tests/unit_tests/chat_models.py`:
`tool_choice_value` as models should accept `tool_choice="any"`
- `langchain` will consequently no longer expose these items if it was
previously
---------
Co-authored-by: Mohammad Mohtashim <45242107+keenborder786@users.noreply.github.com>
Co-authored-by: Caspar Broekhuizen <caspar@langchain.dev>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Sadra Barikbin <sadraqazvin1@yahoo.com>
Co-authored-by: Vadym Barda <vadim.barda@gmail.com>
## Summary
Adds test coverage for the `stringify_value` utility function to handle
complex nested data structures that weren't previously tested.
## Changes
- Added `test_stringify_value_nested_structures()` to `test_strings.py`
- Tests nested dictionaries within lists
- Tests mixed-type lists with various data types
- Verifies proper stringification of complex nested structures
## Why This Matters
- Fills a gap in test coverage for edge cases
- Ensures `stringify_value` handles complex data structures correctly
- Improves confidence in string utility functions used throughout the
codebase
- Low risk addition that strengthens existing test suite
## Testing
```bash
uv run --group test pytest libs/core/tests/unit_tests/utils/test_strings.py::test_stringify_value_nested_structures -v
```
This test addition follows the project's testing patterns and adds
meaningful coverage without introducing any breaking changes.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
**Description:** Fixes infinite recursion issue in JSON schema
dereferencing when objects contain both $ref and other properties (e.g.,
nullable, description, additionalProperties). This was causing Apollo
MCP server schemas to hang indefinitely during tool binding.
**Problem:**
- Commit fb5da8384 changed the condition from `set(obj.keys()) ==
{"$ref"}` to `"$ref" in set(obj.keys())`
- This caused objects with $ref + other properties to be treated as pure
$ref nodes
- Result: other properties were lost and infinite recursion occurred
with complex schemas
**Solution:**
- Restore pure $ref detection for objects with only $ref key
- Add proper handling for mixed $ref objects that preserves all
properties
- Merge resolved reference content with other properties
- Maintain cycle detection to prevent infinite recursion
**Impact:**
- Fixes Apollo MCP server schema integration
- Resolves tool binding infinite recursion with complex GraphQL schemas
- Preserves backward compatibility with existing functionality
- No performance impact - actually improves handling of complex schemas
**Issue:** Fixes#32511
**Dependencies:** None
**Testing:**
- Added comprehensive unit tests covering mixed $ref scenarios
- All existing tests pass (1326 passed, 0 failed)
- Tested with realistic Apollo GraphQL schemas
- Stress tested with 100 iterations of complex schemas
**Verification:**
- ✅ `make format` - All files properly formatted
- ✅ `make lint` - All linting checks pass
- ✅ `make test` - All 1326 unit tests pass
- ✅ No breaking changes - full backwards compatibility maintained
---------
Co-authored-by: Marcus <marcus@Marcus-M4-MAX.local>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
# Description
This PR fixes a bug in _recursive_set_additional_properties_false used
in function_calling.convert_to_openai_function.
Previously, schemas with "additionalProperties=True" were not correctly
overridden when strict validation was expected, which could lead to
invalid OpenAI function schemas.
The updated implementation ensures that:
- Any schema with "additionalProperties" already set will now be forced
to False under strict mode.
- Recursive traversal of properties, items, and anyOf is preserved.
- Function signature remains unchanged for backward compatibility.
# Issue
When using tool calling in OpenAI structured output strict mode
(strict=True), 400: "Invalid schema for response_format XXXXX
'additionalProperties' is required to be supplied and to be false" error
raises for the parameter that contains dict type. OpenAI requires
additionalProperties to be set to False.
Some PRs try to resolved the issue.
- PR #25169 introduced _recursive_set_additional_properties_false to
recursively set additionalProperties=False.
- PR #26287 fixed handling of empty parameter tools for OpenAI function
generation.
- PR #30971 added support for Union type arguments in strict mode of
OpenAI function calling / structured output.
Despite these improvements, since Pydantic 2.11, it will always add
`additionalProperties: True` for arbitrary dictionary schemas dict or
Any (https://pydantic.dev/articles/pydantic-v2-11-release#changes).
Schemas that already had additionalProperties=True in such cases were
not being overridden, which this PR addresses to ensure strict mode
behaves correctly in all cases.
# Dependencies
No Changes
---------
Co-authored-by: Zhong, Yu <yzhong@freewheel.com>
This PR fixes the PostgreSQL NUL byte issue that causes
`psycopg.DataError` when inserting documents containing `\x00` bytes
into PostgreSQL-based vector stores.
## Problem
PostgreSQL text fields cannot contain NUL (0x00) bytes. When documents
with such characters are processed by PGVector or langchain-postgres
implementations, they fail with:
```
(psycopg.DataError) PostgreSQL text fields cannot contain NUL (0x00) bytes
```
This commonly occurs when processing PDFs, documents from various
loaders, or text extracted by libraries like unstructured that may
contain embedded NUL bytes.
## Solution
Added `sanitize_for_postgres()` utility function to
`langchain_core.utils.strings` that removes or replaces NUL bytes from
text content.
### Key Features
- **Simple API**: `sanitize_for_postgres(text, replacement="")`
- **Configurable**: Replace NUL bytes with empty string (default) or
space for readability
- **Comprehensive**: Handles all problematic examples from the original
issue
- **Well-tested**: Complete unit tests with real-world examples
- **Backward compatible**: No breaking changes, purely additive
### Usage Example
```python
from langchain_core.utils import sanitize_for_postgres
from langchain_core.documents import Document
# Before: This would fail with DataError
problematic_content = "Getting\x00Started with embeddings"
# After: Clean the content before database insertion
clean_content = sanitize_for_postgres(problematic_content)
# Result: "GettingStarted with embeddings"
# Or preserve readability with spaces
readable_content = sanitize_for_postgres(problematic_content, " ")
# Result: "Getting Started with embeddings"
# Use in Document processing
doc = Document(page_content=clean_content, metadata={...})
```
### Integration Pattern
PostgreSQL vector store implementations should sanitize content before
insertion:
```python
def add_documents(self, documents: List[Document]) -> List[str]:
# Sanitize documents before insertion
sanitized_docs = []
for doc in documents:
sanitized_content = sanitize_for_postgres(doc.page_content, " ")
sanitized_doc = Document(
page_content=sanitized_content,
metadata=doc.metadata,
id=doc.id
)
sanitized_docs.append(sanitized_doc)
return self._insert_documents_to_db(sanitized_docs)
```
## Changes Made
- Added `sanitize_for_postgres()` function in
`langchain_core/utils/strings.py`
- Updated `langchain_core/utils/__init__.py` to export the new function
- Added comprehensive unit tests in
`tests/unit_tests/utils/test_strings.py`
- Validated against all examples from the original issue report
## Testing
All tests pass, including:
- Basic NUL byte removal and replacement
- Multiple consecutive NUL bytes
- Empty string handling
- Real examples from the GitHub issue
- Backward compatibility with existing string utilities
This utility enables PostgreSQL integrations in both langchain-community
and langchain-postgres packages to handle documents with NUL bytes
reliably.
Fixes#26033.
<!-- START COPILOT CODING AGENT TIPS -->
---
💬 Share your feedback on Copilot coding agent for the chance to win a
$200 gift card! Click
[here](https://survey.alchemer.com/s3/8343779/Copilot-Coding-agent) to
start the survey.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Fixes#32042
## Summary
Fixes a critical bug in JSON Schema reference resolution that prevented
correctly dereferencing numeric components in JSON pointer paths,
specifically for list indices in `anyOf`, `oneOf`, and `allOf` arrays.
## Changes
- Fixed `_retrieve_ref` function in
`libs/core/langchain_core/utils/json_schema.py` to properly handle
numeric components
- Added comprehensive test function `test_dereference_refs_list_index()`
in `libs/core/tests/unit_tests/utils/test_json_schema.py`
- Resolved line length formatting issues
- Improved type checking and index validation for list and dictionary
references
## Key Improvements
- Correctly handles list index references in JSON pointer paths
- Maintains backward compatibility with existing dictionary numeric key
functionality
- Adds robust error handling for out-of-bounds and invalid indices
- Passes all test cases covering various reference scenarios
## Test Coverage
- Verified fix for `#/properties/payload/anyOf/1/properties/startDate`
reference
- Tested edge cases including out-of-bounds and negative indices
- Ensured no regression in existing reference resolution functionality
Resolves the reported issue with JSON Schema reference dereferencing for
list indices.
---------
Co-authored-by: open-swe-dev[bot] <open-swe-dev@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
* Simplified Pydantic handling since Pydantic v1 is not supported
anymore.
* Replace use of deprecated v1 methods by corresponding v2 methods.
* Remove use of other deprecated methods.
* Activate mypy errors on deprecated methods use.
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
**Issue:**[
#309070](https://github.com/langchain-ai/langchain/issues/30970)
**Cause**
Arg type in python code
```
arg: Union[SubSchema1, SubSchema2]
```
is translated to `anyOf` in **json schema**
```
"anyOf" : [{sub schema 1 ...}, {sub schema 1 ...}]
```
The value of anyOf is a list sub schemas.
The bug is caused since the sub schemas inside `anyOf` list is not taken
care of.
The location where the issue happens is `convert_to_openai_function`
function -> `_recursive_set_additional_properties_false` function, that
recursively adds `"additionalProperties": false` to json schema which is
[required by OpenAI's strict function
calling](https://platform.openai.com/docs/guides/structured-outputs?api-mode=responses#additionalproperties-false-must-always-be-set-in-objects)
**Solution:**
This PR fixes this issue by iterating each sub schema inside `anyOf`
list.
A unit test is added.
**Twitter handle:** shengboma
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
This pull request includes various changes to the `langchain_core`
library, focusing on improving compatibility with different versions of
Pydantic. The primary change involves replacing checks for Pydantic
major versions with boolean flags, which simplifies the code and
improves readability.
This also solves ruff rule checks for
[RUF048](https://docs.astral.sh/ruff/rules/map-int-version-parsing/) and
[PLR2004](https://docs.astral.sh/ruff/rules/magic-value-comparison/).
Key changes include:
### Compatibility Improvements:
*
[`libs/core/langchain_core/output_parsers/json.py`](diffhunk://#diff-5add0cf7134636ae4198a1e0df49ee332ae0c9123c3a2395101e02687c717646L22-R24):
Replaced `PYDANTIC_MAJOR_VERSION` with `IS_PYDANTIC_V1` to check for
Pydantic version 1.
*
[`libs/core/langchain_core/output_parsers/pydantic.py`](diffhunk://#diff-2364b5b4aee01c462aa5dbda5dc3a877dcd20f29df173ad540dc8adf8b192361L14-R14):
Updated version checks from `PYDANTIC_MAJOR_VERSION` to `IS_PYDANTIC_V2`
in the `PydanticOutputParser` class.
[[1]](diffhunk://#diff-2364b5b4aee01c462aa5dbda5dc3a877dcd20f29df173ad540dc8adf8b192361L14-R14)
[[2]](diffhunk://#diff-2364b5b4aee01c462aa5dbda5dc3a877dcd20f29df173ad540dc8adf8b192361L27-R27)
### Utility Enhancements:
*
[`libs/core/langchain_core/utils/pydantic.py`](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896R23):
Introduced `IS_PYDANTIC_V1` and `IS_PYDANTIC_V2` flags and deprecated
the `get_pydantic_major_version` function. Updated various functions to
use these flags instead of version numbers.
[[1]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896R23)
[[2]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896R42-R78)
[[3]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L90-R89)
[[4]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L104-R101)
[[5]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L120-R122)
[[6]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L135-R132)
[[7]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L149-R151)
[[8]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L164-R161)
[[9]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L248-R250)
[[10]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L330-R335)
[[11]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L356-R357)
[[12]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L393-R390)
[[13]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L403-R400)
### Test Updates:
*
[`libs/core/tests/unit_tests/output_parsers/test_openai_tools.py`](diffhunk://#diff-694cc0318edbd6bbca34f53304934062ad59ba9f5a788252ce6c5f5452489d67L19-R22):
Updated tests to use `IS_PYDANTIC_V1` and `IS_PYDANTIC_V2` for version
checks.
[[1]](diffhunk://#diff-694cc0318edbd6bbca34f53304934062ad59ba9f5a788252ce6c5f5452489d67L19-R22)
[[2]](diffhunk://#diff-694cc0318edbd6bbca34f53304934062ad59ba9f5a788252ce6c5f5452489d67L532-R535)
[[3]](diffhunk://#diff-694cc0318edbd6bbca34f53304934062ad59ba9f5a788252ce6c5f5452489d67L567-R570)
[[4]](diffhunk://#diff-694cc0318edbd6bbca34f53304934062ad59ba9f5a788252ce6c5f5452489d67L602-R605)
*
[`libs/core/tests/unit_tests/prompts/test_chat.py`](diffhunk://#diff-3e60e744842086a4f3c4b21bc83e819c3435720eab210078e77e2430fb8c7e84R7):
Replaced version tuple checks with `PYDANTIC_VERSION` comparisons.
[[1]](diffhunk://#diff-3e60e744842086a4f3c4b21bc83e819c3435720eab210078e77e2430fb8c7e84R7)
[[2]](diffhunk://#diff-3e60e744842086a4f3c4b21bc83e819c3435720eab210078e77e2430fb8c7e84L35-R38)
[[3]](diffhunk://#diff-3e60e744842086a4f3c4b21bc83e819c3435720eab210078e77e2430fb8c7e84L924-R927)
[[4]](diffhunk://#diff-3e60e744842086a4f3c4b21bc83e819c3435720eab210078e77e2430fb8c7e84L935-R938)
*
[`libs/core/tests/unit_tests/runnables/test_graph.py`](diffhunk://#diff-99a290330ef40103d0ce02e52e21310d6fadea142bfdea13c94d23fc81c0bb5dR3):
Simplified version checks using `PYDANTIC_VERSION`.
[[1]](diffhunk://#diff-99a290330ef40103d0ce02e52e21310d6fadea142bfdea13c94d23fc81c0bb5dR3)
[[2]](diffhunk://#diff-99a290330ef40103d0ce02e52e21310d6fadea142bfdea13c94d23fc81c0bb5dL15-R18)
[[3]](diffhunk://#diff-99a290330ef40103d0ce02e52e21310d6fadea142bfdea13c94d23fc81c0bb5dL234-L239)
*
[`libs/core/tests/unit_tests/runnables/test_runnable.py`](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L18-R20):
Introduced `PYDANTIC_VERSION_AT_LEAST_29` and
`PYDANTIC_VERSION_AT_LEAST_210` for more readable version checks.
[[1]](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L18-R20)
[[2]](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L92-R99)
[[3]](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L230-R233)
[[4]](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L652-R655)
Thank you for contributing to LangChain!
- [ ] **PR title**: "core: google docstring parsing fix"
- [x] **PR message**:
- **Description:** Added a solution for invalid parsing of google
docstring such as:
Args:
net_annual_income (float): The user's net annual income (in current year
dollars).
- **Issue:** Previous code would return arg = "net_annual_income
(float)" which would cause exception in
_validate_docstring_args_against_annotations
- **Dependencies:** None
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
Co-authored-by: Erick Friis <erick@langchain.dev>
We have a test
[test_structured_few_shot_examples](ad4333ca03/libs/standard-tests/langchain_tests/integration_tests/chat_models.py (L546))
in standard integration tests that implements a version of tool-calling
few shot examples that works with ~all tested providers. The formulation
supported by ~all providers is: `human message, tool call, tool message,
AI reponse`.
Here we update
`langchain_core.utils.function_calling.tool_example_to_messages` to
support this formulation.
The `tool_example_to_messages` util is undocumented outside of our API
reference. IMO, if we are testing that this function works across all
providers, it can be helpful to feature it in our guides. The structured
few-shot examples we document at the moment require users to implement
this function and can be simplified.