## Description
`_parse_google_docstring` incorrectly parses multi-line argument
descriptions when a continuation line contains a colon. The continuation
line is treated as a new argument definition instead of being appended
to the current argument's description.
### Example
```python
def search(query: str, top_k: int = 5) -> str:
    """Search the knowledge base.

    Args:
        query: The search query to use
            for finding things: important ones
        top_k: Number of results to return
    """
```
**Before (broken):** The parser creates 3 args: `query`, `for finding
things`, `top_k`
**After (fixed):** The parser correctly creates 2 args: `query` (with
full description including "for finding things: important ones"),
`top_k`
### Root Cause
The parser used `if ":" in line` to detect new argument lines without
considering indentation. In Google-style docstrings, continuation lines
have deeper indentation than argument definition lines.
### Fix
Detect the base indentation level from the first argument line and treat
any line with deeper indentation as a continuation of the current
argument's description, regardless of whether it contains a colon.
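The indentation rule can be sketched in a few lines. This is a minimal illustration, not the actual body of `_parse_google_docstring`; the helper name `parse_args_section` is hypothetical, and it assumes it receives only the lines under `Args:`.

```python
from typing import Optional


def parse_args_section(lines: list[str]) -> dict[str, str]:
    """Parse the lines under ``Args:`` (hypothetical helper, for illustration)."""
    args: dict[str, str] = {}
    base_indent: Optional[int] = None  # indentation of the first argument line
    current: Optional[str] = None  # name of the argument being described

    for raw in lines:
        if not raw.strip():
            continue  # skip blank lines
        indent = len(raw) - len(raw.lstrip())
        if base_indent is None:
            base_indent = indent
        if indent > base_indent and current is not None:
            # Deeper indentation: a continuation line, even if it has a colon.
            args[current] += " " + raw.strip()
        elif ":" in raw:
            # Base indentation plus a colon: a new argument definition.
            name, _, description = raw.strip().partition(":")
            current = name.strip()
            args[current] = description.strip()
    return args
```

Fed the `Args:` body from the example above, this sketch yields two entries, `query` and `top_k`, with the colon-bearing continuation folded into the description of `query`.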
## Issue
Fixes #35679
## Dependencies
None.
## Testing
Added 4 unit tests in
`test_function_calling.py::TestParseGoogleDocstring`:
- `test_continuation_line_with_colon` — the core bug scenario
- `test_simple_args_still_work` — regression check for basic args
- `test_continuation_line_without_colon` — multi-line descriptions
without colons
- `test_multiple_continuation_lines_with_colons` — multiple continuation
lines each containing colons
All tests pass locally with Python 3.12.
---------
Co-authored-by: gambletan <ethanchang32@gmail.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Enables mypy's `warn_unreachable` rule for `langchain-core`, bringing it
in line with the other strict libraries in the monorepo. Previously this
rule was intentionally disabled by a code comment, because under mypy
2.x it false-flags intentional defensive runtime checks — most notably
the SSRF / IP-policy guards in `langchain_core/_security/` — as
unreachable.
This PR resolves all of those warnings without deleting or
blanket-ignoring the defensive guards, so contributors get
unreachable-code coverage going forward and accidental dead code is
caught in CI.
The bulk of the change is mechanical: a targeted `# type:
ignore[unreachable]` on each defensive `else`/error branch that mypy
considers unreachable but that we deliberately keep as a runtime guard
against unexpected input. A few changes are more substantive and worth a
closer look:
- **`coro_with_context` (`runnables/utils.py`) — behavior change on
Python < 3.11.** The pre-3.11 path is rewritten to always route through
`context.run(asyncio.create_task, coro)`, so the supplied context is
reliably propagated to the task. Previously, on 3.10 the helper returned
the bare coroutine (run in the caller's context) when
`create_task=False`, and dropped the context entirely when
`create_task=True`. The new behavior matches 3.11+. The `create_task`
parameter is now inert but retained for signature compatibility. All
callers `await` the result, so returning a `Task` rather than a
coroutine is transparent.
- **`_create_template_from_message_type` (`prompts/chat.py`) — signature
widening.** This private helper's `template` parameter now accepts
`bool` inside the list, accurately reflecting the existing `["{var}",
is_optional]` placeholder form. No public-API impact.
- **`PydanticOutputFunctionsParser`
(`output_parsers/openai_functions.py`).** The `pydantic_schema` field is
typed as `TypeBaseModel` (which covers both v1 and v2 model classes,
unlike the prior annotation), and the `args_only` parse path now
dispatches explicitly on `BaseModel` vs `BaseModelV1` rather than
duck-typing via `hasattr`. This also yields clearer errors for
unsupported / dict schemas.
- **`_security/_policy.py`.** Loop variables are renamed so mypy can
narrow their types, which lets the old `# type: ignore[assignment]`
comments be dropped. The IP-blocklist logic is unchanged.
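The context propagation described for `coro_with_context` can be demonstrated with the standard library alone. The sketch below is not the helper's code; `request_id`, `read_request_id`, and `main` are hypothetical names used only to show that a task created via `context.run(asyncio.create_task, coro)` runs in the supplied context rather than the caller's.

```python
import asyncio
import contextvars

# Hypothetical context variable, used only for this illustration.
request_id = contextvars.ContextVar("request_id", default="unset")


async def read_request_id() -> str:
    return request_id.get()


async def main() -> tuple[str, str]:
    # Build a separate context and set the variable only inside it.
    ctx = contextvars.copy_context()
    ctx.run(request_id.set, "from-supplied-context")

    # create_task copies the *current* context; calling it through
    # ctx.run makes the supplied context the current one at that moment.
    task = ctx.run(asyncio.create_task, read_request_id())
    seen_by_task = await task

    # The caller's own context is untouched.
    seen_by_caller = await read_request_id()
    return seen_by_task, seen_by_caller
```

Because the task is awaited like any other awaitable, callers that only `await` the result should not notice that they receive a `Task` instead of a bare coroutine.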
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
`BaseTool.args_schema` is documented as accepting a Pydantic v1 model,
but several code paths assumed v2 and raised when handed a v1 schema
(e.g. an `AttributeError` from calling
`model_json_schema()`/`model_fields` on a v1 model). This affected
anyone using a v1 `args_schema`, and anyone composing runnables whose
input/output schema is a v1 model.
This PR makes the tool/runnable schema-derivation code version-agnostic.
## Type contract
`TypeBaseModel` (and `PydanticBaseModel`) now include
`pydantic.v1.BaseModel`, so the type honestly reflects what tools and
runnables already accept at runtime. The public schema accessors
(`Runnable.get_input_schema`/`get_output_schema` and the
`input_schema`/`output_schema` properties) return `TypeBaseModel`.
## Version-agnostic helpers
Added to `langchain_core.utils.pydantic`, each dispatching on the
model's Pydantic version so callers don't have to:
- `model_json_schema(model)` — JSON schema for either version.
- `model_validate(model, obj)` — validation for either version.
- `get_fields(model)` — field map for either version (existing helper,
now used consistently).
Internally, direct `.model_json_schema()` / `.model_fields` calls are
replaced with these helpers (or with `get_input_jsonschema()` /
`get_output_jsonschema()`).
## Behavior change worth a close look
When deriving a schema from a v1 model (in `RunnableParallel`,
`RunnableAssign`, and `RunnableSequence` output schemas), a **required**
v1 field is now correctly carried over as required. Previously the v1
path read the field's `default` — which is `None` for a required v1
field — and silently turned required fields into optional/nullable ones;
`default_factory` fields were dropped entirely. The new
`_get_schema_field_definition` helper translates a v1 `ModelField`
faithfully (required → `...`, factory preserved) and dispatches
explicitly on the field type.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
## Description
Fixes #35046
Two minor cleanups in `langchain-core`:
1. **Fix docstring mismatch in `mustache.render()`**: The docstring
incorrectly documented `partials_path` and `partials_ext` parameters
that do not exist in the function signature. These were likely carried
over from the original
[chevron](https://github.com/noahmorrison/chevron) library but were
never part of this adapted implementation.
2. **Remove redundant logic in `Blob.from_path()`**: The expression
`mimetypes.guess_type(path)[0] if guess_type else None` had a redundant
`if guess_type` ternary since the outer condition `if mime_type is None
and guess_type:` already guarantees `guess_type` is `True` at that
point. Simplified to just `mimetypes.guess_type(path)[0]`.
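The second cleanup can be seen in isolation with a small stand-in. `resolve_mime_type` is a hypothetical function written for this note, not part of `Blob`; it only mirrors the guard described above.

```python
import mimetypes
from typing import Optional


def resolve_mime_type(
    path: str, mime_type: Optional[str] = None, guess_type: bool = True
) -> Optional[str]:
    """Hypothetical stand-in for the mimetype logic in ``Blob.from_path()``."""
    if mime_type is None and guess_type:
        # guess_type is already known to be truthy here, so an inner
        # ``... if guess_type else None`` would never take the else branch.
        return mimetypes.guess_type(path)[0]
    return mime_type
```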
## AI Disclaimer
An AI coding assistant was used to help identify and implement these
changes.
It looks scary, but I promise it is not.
Improves documentation consistency across core. Primarily updates
docstrings and comments for better formatting, readability, and
accuracy, and adds minor clarifications and formatting
improvements to user-facing documentation.
Updates `comma_list` in `libs/core/langchain_core/utils/strings.py` to
accept `Iterable[Any]` instead of `list[Any]`, making the utility more
flexible.
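A rough sketch of what the wider annotation allows, assuming the function simply joins the string form of each item (the body shown here is an assumption, not copied from `strings.py`):

```python
from collections.abc import Iterable
from typing import Any


def comma_list(items: Iterable[Any]) -> str:
    """Join items with ", " (sketch; assumes a plain str() conversion)."""
    return ", ".join(str(item) for item in items)
```

With `Iterable[Any]`, generators, tuples, and dict views type-check as arguments without first being copied into a list.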
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Changes Created
I have fixed the issue where a generic and misleading error message was
displayed when a JSON schema was missing the top-level `title` key.
[Fix: Improve error message for missing title in JSON schema
functions](https://github.com/Bhavesh007Sharma/langchain/tree/fix-json-schema-title-error)
File Modified: `libs/core/langchain_core/utils/function_calling.py`
I updated the `convert_to_openai_function` validation logic to
specifically check for `dict` inputs that look like schemas (`type` or
`properties` keys present) but are missing the `title` key.
```python
# Before (Generic Error)
raise ValueError(
    f"Unsupported function\n\n{function}\n\nFunctions must be passed in"
    " as Dict, pydantic.BaseModel, or Callable. If they're a dict they must"
    " either be in OpenAI function format or valid JSON schema with top-level"
    " 'title' and 'description' keys."
)

# After (Specific Error)
if isinstance(function, dict) and ("type" in function or "properties" in function):
    msg = (
        "Unsupported function\n\nTo use a JSON schema as a function, "
        "it must have a top-level 'title' key to be used as the function name."
    )
    raise ValueError(msg)
```
Verification Results
Automated Tests
I created a reproduction script, `reproduce_issue.py`, to confirm the
behavior.
Before Fix: The script would have raised the generic "Unsupported
function" error claiming `description` was also required.
After Fix: The script now confirms that the new, specific error message
is raised when `title` is missing.
(Note: Verification was performed by inspecting the code logic and
running a lightweight reproduction script locally, as full suite
verification had environment dependency issues.)
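The reproduction script itself is not included in this description. A standalone approximation of the check might look like the following, where `check_schema_has_title` is a hypothetical helper and the explicit `"title" not in function` test is an assumption about how the missing-title case is reached:

```python
from typing import Any


def check_schema_has_title(function: Any) -> None:
    """Raise the specific error for schema-like dicts lacking 'title' (sketch)."""
    looks_like_schema = isinstance(function, dict) and (
        "type" in function or "properties" in function
    )
    # Assumption: the specific error applies only when 'title' is absent.
    if looks_like_schema and "title" not in function:
        msg = (
            "Unsupported function\n\nTo use a JSON schema as a function, "
            "it must have a top-level 'title' key to be used as the function name."
        )
        raise ValueError(msg)
```

Passing `{"type": "object", "properties": {}}` raises the specific message, while the same dict with a `title` key passes through silently.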
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
# feat(core): add more file extensions to ignore in HTML link extraction
## Description
This PR enhances the HTML link extraction utility in
`libs/core/langchain_core/utils/html.py` by expanding the
`SUFFIXES_TO_IGNORE` list to include additional common binary file
extensions:
- `.webp`
- `.pdf`
- `.docx`
- `.xlsx`
- `.pptx`
- `.pptm`
These file types are non-HTML, non-crawlable resources. Ignoring them
prevents `find_all_links` and `extract_sub_links` from mistakenly
treating such binary assets as navigable links. This improves link
filtering, reduces unnecessary crawling, and aligns behavior with
typical web scraping expectations.
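As a simplified illustration of the effect (not the filtering code in `html.py`, whose internals are not shown here), a suffix check over the URL path is enough to drop such assets. `ADDED_SUFFIXES` and `drop_binary_links` are hypothetical names, and the tuple contains only the extensions added by this PR, not the full `SUFFIXES_TO_IGNORE` list.

```python
from urllib.parse import urlparse

# Only the extensions added in this PR; the real list has more entries.
ADDED_SUFFIXES = (".webp", ".pdf", ".docx", ".xlsx", ".pptx", ".pptm")


def drop_binary_links(links: list[str]) -> list[str]:
    """Keep links whose URL path does not end in an ignored suffix (sketch)."""
    return [
        link
        for link in links
        # Compare against the path only, so query strings do not hide a suffix.
        if not urlparse(link).path.lower().endswith(ADDED_SUFFIXES)
    ]
```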
## Summary of Changes
- **Updated** `libs/core/langchain_core/utils/html.py`: Added `.webp`,
`.pdf`, `.docx`, `.xlsx`, `.pptx`, `.pptm` to `SUFFIXES_TO_IGNORE`.
## Related Issues
N/A
## Verification
- `ruff check libs/core/langchain_core/utils/html.py`: **Passed**
- `mypy libs/core/langchain_core/utils/html.py`: **Passed**
- `pytest libs/core/tests/unit_tests/utils/test_html.py`: **Passed** (11
tests)
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
# refactor(core): improve docstrings for HTML link extraction utilities
## Description
This PR updates and clarifies the docstrings for `find_all_links` and
`extract_sub_links` in
`libs/core/langchain_core/utils/html.py`.
The previous return-value descriptions were vague (e.g., "all links",
"sub links"). They have now been revised to clearly describe the
behavior and output of each function:
- **find_all_links** → “A list of all links found in the HTML.”
- **extract_sub_links** → “A list of absolute paths to sub links.”
These improvements make the utilities more understandable and
developer-friendly without altering functionality.
## Verification
- `ruff check libs/core/langchain_core/utils/html.py`: **Passed**
- `pytest libs/core/tests/unit_tests/utils/test_html.py`: **Passed**
## Checklists
- PR title follows the required format: `TYPE(SCOPE): DESCRIPTION`
- Changes are limited to the `langchain-core` package
- `make format`, `make lint`, and `make test` pass
* Fixed where possible
* Used `cast` when not possible to fix
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Moves `_ORIGIN_MAP` dict from inside `_py_38_safe_origin()` to module
level constant. This avoids dict allocation on every function call,
reducing garbage collection pressure during frequent tool conversions.
The function is called during typed dict to pydantic model conversion
which happens during tool binding and invocation - a hot path in
LangChain.
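The pattern looks roughly like this. The mapping contents and the name `safe_origin` are placeholders chosen for illustration, not the actual entries of `_ORIGIN_MAP` or the body of `_py_38_safe_origin()`:

```python
import collections.abc
import typing

# Built once at import time instead of on every call.
# Placeholder entries; the real mapping may differ.
_ORIGIN_MAP: dict[type, typing.Any] = {
    dict: typing.Dict,
    list: typing.List,
    tuple: typing.Tuple,
    set: typing.Set,
    collections.abc.Iterable: typing.Iterable,
}


def safe_origin(origin: type) -> typing.Any:
    """Look up the shared module-level mapping (hypothetical stand-in)."""
    return _ORIGIN_MAP.get(origin, origin)
```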
**Testing:** `make lint` passes
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
* Fixed a few `TC` (flake8-type-checking) violations
* Added a few Pydantic classes to
`flake8-type-checking.runtime-evaluated-base-classes` (not as many as I
would have imagined)
* Added a few `noqa: TC` comments
* Activated the `TC` rules
Replace direct `__annotations__` access with `get_type_hints()` in
`_convert_any_typed_dicts_to_pydantic` to handle [PEP
649](https://peps.python.org/pep-0649/) deferred annotations in Python
3.14:
> [`Changed in version 3.14: Annotations are now lazily evaluated by
default`](https://docs.python.org/3/reference/compound_stmts.html#annotations)
Before:
```python
from typing import TypedDict

class MyTool(TypedDict):
    name: str

MyTool.__annotations__  # {'name': 'str'} - string, not type
issubclass('str', ...)  # TypeError: arg 1 must be a class
```
After:
```python
from typing import get_type_hints

get_type_hints(MyTool)  # {'name': <class 'str'>} - actual type
```
Fixes #34291
Largely:
- Remove explicit `"Default is x"` notes, since the new reference docs
show defaults inferred from the signature
- Inline code (useful for eventual parsing)
- Fix code block rendering (indentations)