fix(core,model-profiles): add missing ModelProfile fields, warn on schema drift (#36129)

PR #35788 added 7 new fields to the `langchain-profiles` CLI output
(`name`, `status`, `release_date`, `last_updated`, `open_weights`,
`attachment`, `temperature`) but didn't update `ModelProfile` in
`langchain-core`. Partner packages like `langchain-aws` that set
`extra="forbid"` on their Pydantic models hit `extra_forbidden`
validation errors when Pydantic encountered undeclared TypedDict keys at
construction time. This commit adds the missing fields and makes
`ModelProfile` forward-compatible. It also provides a base-class hook so
partners can stop duplicating model-profile validator boilerplate,
migrates all in-repo partners to that hook, and adds runtime and CI-time
warnings for schema drift.

## Changes

### `langchain-core`
- Add `__pydantic_config__ = ConfigDict(extra="allow")` to
`ModelProfile` so unknown profile keys pass Pydantic validation even on
models with `extra="forbid"` — forward-compatibility for when the CLI
schema evolves ahead of core
- Declare the 7 missing fields on `ModelProfile`: `name`, `status`,
`release_date`, `last_updated`, `open_weights` (metadata) and
`attachment`, `temperature` (capabilities)
- Add `_warn_unknown_profile_keys()` in `model_profile.py` — emits a
`UserWarning` when a profile dict contains keys not in `ModelProfile`,
suggesting a core upgrade. Wrapped in a bare `except` so introspection
failures never crash model construction
- Add `BaseChatModel._resolve_model_profile()` hook that returns `None`
by default. Partners can override this single method instead of
redefining the full `_set_model_profile` validator — the base validator
calls it automatically
- Add `BaseChatModel._check_profile_keys` as a separate
`model_validator` that calls `_warn_unknown_profile_keys`. Uses a
distinct method name so partner overrides of `_set_model_profile` don't
inadvertently suppress the check
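
The runtime drift check described above can be sketched with the standard library alone. This is a minimal illustration, not the code in `model_profile.py`: `ModelProfileSketch`, its field types, and `warn_unknown_profile_keys` are hypothetical stand-ins, and the sketch catches only `TypeError`/`NameError` rather than using the bare `except` mentioned above.

```python
import warnings
from typing import Any, TypedDict, get_type_hints


class ModelProfileSketch(TypedDict, total=False):
    """Hypothetical stand-in for `ModelProfile`; only a few fields shown."""

    name: str
    max_input_tokens: int
    temperature: bool


def warn_unknown_profile_keys(profile: dict[str, Any]) -> list[str]:
    """Warn about keys the TypedDict does not declare; return them sorted."""
    try:
        declared = set(get_type_hints(ModelProfileSketch))
    except (TypeError, NameError):
        # Introspection failed; skip the check instead of raising.
        return []
    unknown = sorted(set(profile) - declared)
    if unknown:
        warnings.warn(
            f"Unknown profile keys: {unknown}. "
            "Consider upgrading langchain-core.",
            UserWarning,
            stacklevel=2,
        )
    return unknown
```

Calling `warn_unknown_profile_keys({"name": "m", "new_field": 1})` emits one `UserWarning` and returns `["new_field"]`; a profile with only declared keys returns an empty list and stays silent.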

### `langchain-profiles` CLI
- Add `_warn_undeclared_profile_keys()` to the CLI (`cli.py`), called
after merging augmentations in `refresh()` — warns at profile-generation
time (not just runtime) when emitted keys aren't declared in
`ModelProfile`. Gracefully skips if `langchain-core` isn't installed
- Add guard test
`test_model_data_to_profile_keys_subset_of_model_profile` in
model-profiles — feeds a fully-populated model dict to
`_model_data_to_profile()` and asserts every emitted key exists in
`ModelProfile.__annotations__`. CI fails before any release if someone
adds a CLI field without updating the TypedDict
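
A guard test of this shape might look roughly like the sketch below. `ProfileSketch`, `model_data_to_profile_sketch`, and `undeclared_keys` are hypothetical stand-ins for `ModelProfile`, `_model_data_to_profile()`, and the assertion logic; the real converter and field list differ.

```python
from typing import Any, TypedDict


class ProfileSketch(TypedDict, total=False):
    """Hypothetical stand-in for `ModelProfile`."""

    name: str
    status: str
    open_weights: bool


def model_data_to_profile_sketch(model_data: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical converter: copy selected raw fields into a profile dict."""
    wanted = ("name", "status", "open_weights")
    return {key: model_data[key] for key in wanted if key in model_data}


def undeclared_keys(profile: dict[str, Any]) -> set[str]:
    """Return emitted keys that the TypedDict does not declare."""
    return set(profile) - set(ProfileSketch.__annotations__)


def test_profile_keys_subset_of_typed_dict() -> None:
    # Populate every field the converter knows about, plus unrelated raw data.
    fully_populated = {
        "name": "m",
        "status": "active",
        "open_weights": False,
        "id": "x",
    }
    profile = model_data_to_profile_sketch(fully_populated)
    assert undeclared_keys(profile) == set()
```

If someone adds a key to `wanted` without declaring it on the TypedDict, `undeclared_keys` becomes non-empty and the test fails, which is the drift the guard is meant to catch.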

### Partner packages
- Migrate all 10 in-repo partners to the `_resolve_model_profile()`
hook, replacing duplicated `@model_validator` / `_set_model_profile`
overrides: anthropic, deepseek, fireworks, groq, huggingface, mistralai,
openai (base + azure), openrouter, perplexity, xai
- Anthropic retains custom logic (context-1m beta → `max_input_tokens`
override); all others reduce to a one-liner
- Add `pr_lint.yml` scope for the new `model-profiles` package
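
The hook pattern behind the partner migration can be illustrated without Pydantic. Everything in this sketch is hypothetical (`BaseChatModelSketch`, `PartnerChatModelSketch`, `_PROFILES`), and it assumes the base validator only fills in a profile when none was passed explicitly; the real `BaseChatModel` runs this logic inside a `model_validator`.

```python
from typing import Any, Optional


class BaseChatModelSketch:
    """Hypothetical, Pydantic-free stand-in for `BaseChatModel`."""

    def __init__(self, model: str, profile: Optional[dict[str, Any]] = None) -> None:
        self.model = model
        self.profile = profile
        self._set_model_profile()

    def _resolve_model_profile(self) -> Optional[dict[str, Any]]:
        # Default hook: the base class knows no provider-specific profiles.
        return None

    def _set_model_profile(self) -> None:
        # Assumption: only fill in a profile when the caller passed none.
        if self.profile is None:
            self.profile = self._resolve_model_profile()


# Hypothetical profile data; real partners load generated profile tables.
_PROFILES: dict[str, dict[str, Any]] = {"demo-model": {"max_input_tokens": 8192}}


class PartnerChatModelSketch(BaseChatModelSketch):
    def _resolve_model_profile(self) -> Optional[dict[str, Any]]:
        # The "one-liner" override: look up this model's profile, if any.
        return _PROFILES.get(self.model)
```

With this shape, `PartnerChatModelSketch("demo-model").profile` resolves to the table entry, an unknown model ID resolves to `None`, and an explicitly passed profile is left untouched.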
Mason Daugherty
2026-03-23 00:44:27 -04:00
committed by GitHub
parent 5ffece5c03
commit 2f64d80cc6
30 changed files with 488 additions and 122 deletions


@@ -5,8 +5,9 @@ import json
 import re
 import sys
 import tempfile
+import warnings
 from pathlib import Path
-from typing import Any
+from typing import Any, get_type_hints

 import httpx
@@ -150,6 +151,38 @@ def _apply_overrides(
     return merged


+def _warn_undeclared_profile_keys(
+    profiles: dict[str, dict[str, Any]],
+) -> None:
+    """Warn if any profile keys are not declared in `ModelProfile`.
+
+    Args:
+        profiles: Mapping of model IDs to their profile dicts.
+    """
+    try:
+        from langchain_core.language_models.model_profile import ModelProfile
+    except ImportError:
+        # langchain-core may not be installed or importable; skip check.
+        return
+
+    try:
+        declared = set(get_type_hints(ModelProfile).keys())
+    except (TypeError, NameError):
+        # get_type_hints raises NameError on unresolvable forward refs and
+        # TypeError when annotations evaluate to non-type objects.
+        return
+
+    extra = sorted({k for p in profiles.values() for k in p} - declared)
+    if extra:
+        warnings.warn(
+            f"Profile keys not declared in langchain_core ModelProfile: {extra}. "
+            f"Add these fields to "
+            f"langchain_core.language_models.model_profile.ModelProfile and "
+            f"release langchain-core before publishing partner packages that "
+            f"use these profiles.",
+            stacklevel=2,
+        )
+
+
 def _ensure_safe_output_path(base_dir: Path, output_file: Path) -> None:
     """Ensure the resolved output path remains inside the expected directory."""
     if base_dir.exists() and base_dir.is_symlink():
@@ -300,6 +333,8 @@ def refresh(provider: str, data_dir: Path) -> None: # noqa: C901, PLR0915
     for model_id in sorted(extra_models):
         profiles[model_id] = _apply_overrides({}, provider_aug, model_augs[model_id])

+    _warn_undeclared_profile_keys(profiles)
+
     # Ensure directory exists
     try:
         data_dir.mkdir(parents=True, exist_ok=True, mode=0o755)