9 Commits

Author SHA1 Message Date
Christophe Bornet
1de100f278 chore(infra): bump mypy to 2.1 and unify type-check config across the monorepo (#36470)
Originally a narrow bump of mypy to `1.20` in four packages. Expanded to
get the whole monorepo onto a single, current mypy and a consistent
type-check configuration, so contributors no longer hit different mypy
versions and divergent behavior depending on which package they touch.

### What changed

- **Unified the mypy pin to `>=2.1.0,<2.2.0`** in every mypy-using
package (6 libs + 14 partners), replacing the previously scattered pins
(`1.10`/`1.17`/`1.18`/`1.19`/`1.20`, with assorted upper bounds).
- **Unified the `[tool.mypy]` base per tier:**
  - libs: `plugins = ["pydantic.mypy"]`, `strict = true`,
    `enable_error_code = "deprecated"`, `warn_unreachable = true`
  - partners: `disallow_untyped_defs = true`
- Normalized style (`disallow_untyped_defs = "True"` string → bool,
quote/key consistency).
- **Fixed the 20 real errors** mypy 2.1 surfaces: `redundant-cast` from
improved narrowing (`core`, `langchain-classic`), a `var-annotated` for
`_LOGGED`, a return-type widening in `langchain-groq`'s
`_convert_from_v1_to_groq` (it can legitimately return a bare `str`),
and stale `type-arg`/`unused-ignore` in `langchain-model-profiles`
tests.
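
For readers unfamiliar with two of the error codes named above, here is a minimal sketch of the shape of a `var-annotated` and a `redundant-cast` fix. It is not the code from this commit: the element type of `_LOGGED` and the helper functions `log_once` and `normalize` are assumptions made up for illustration.

```python
from __future__ import annotations

# `var-annotated`: mypy cannot infer the element type of an empty container,
# so the annotation is spelled out. The element type here is an assumption.
_LOGGED: set[str] = set()


def log_once(message: str) -> bool:
    """Return True the first time a message is seen, False afterwards."""
    if message in _LOGGED:
        return False
    _LOGGED.add(message)
    return True


def normalize(value: str | list[str]) -> str:
    """Join list input; pass a bare string through unchanged."""
    if isinstance(value, str):
        # isinstance() has already narrowed `value` to `str`. Wrapping it in
        # `cast(str, value)` here is the kind of call strict mypy reports
        # as `redundant-cast`, so the fix is simply to drop the cast.
        return value
    return ", ".join(value)
```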

### Deliberate non-uniformity (documented inline in the relevant `pyproject.toml`s)

Going fully byte-identical would surface ~196 additional errors that are
*not* real bugs, so two settings are kept package-appropriate, and one
package stays outside mypy altogether:

- **`warn_unreachable`** is enabled on every strict lib **except
`core`**, where it false-flags intentional defensive code — including
the SSRF / IP-policy guards in `_security/` — as unreachable.
- **`pydantic.mypy` plugin** is used only on `anthropic` and
`perplexity` (their code is authored against it and reports ~99/~132
errors without it). It is *not* added to the other partners, where it
only flags the public alias constructor API (e.g. `ChatGroq(model=...)`)
in tests rather than finding bugs.
- **`ollama`** is left on its `ty` type checker; it does not use mypy.

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-06-11 00:24:59 -04:00
Mason Daugherty
2f64d80cc6 fix(core,model-profiles): add missing ModelProfile fields, warn on schema drift (#36129)
PR #35788 added 7 new fields to the `langchain-profiles` CLI output
(`name`, `status`, `release_date`, `last_updated`, `open_weights`,
`attachment`, `temperature`) but didn't update `ModelProfile` in
`langchain-core`. Partner packages like `langchain-aws` that set
`extra="forbid"` on their Pydantic models hit `extra_forbidden`
validation errors when Pydantic encountered undeclared TypedDict keys at
construction time. This adds the missing fields, makes `ModelProfile`
forward-compatible, provides a base-class hook so partners can stop
duplicating model-profile validator boilerplate, migrates all in-repo
partners to the new hook, and adds runtime + CI-time warnings for schema
drift.

## Changes

### `langchain-core`
- Add `__pydantic_config__ = ConfigDict(extra="allow")` to
`ModelProfile` so unknown profile keys pass Pydantic validation even on
models with `extra="forbid"` — forward-compatibility for when the CLI
schema evolves ahead of core
- Declare the 7 missing fields on `ModelProfile`: `name`, `status`,
`release_date`, `last_updated`, `open_weights` (metadata) and
`attachment`, `temperature` (capabilities)
- Add `_warn_unknown_profile_keys()` in `model_profile.py` — emits a
`UserWarning` when a profile dict contains keys not in `ModelProfile`,
suggesting a core upgrade. Wrapped in a bare `except` so introspection
failures never crash model construction
- Add `BaseChatModel._resolve_model_profile()` hook that returns `None`
by default. Partners can override this single method instead of
redefining the full `_set_model_profile` validator — the base validator
calls it automatically
- Add `BaseChatModel._check_profile_keys` as a separate
`model_validator` that calls `_warn_unknown_profile_keys`. Uses a
distinct method name so partner overrides of `_set_model_profile` don't
inadvertently suppress the check
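
A minimal sketch of the unknown-key warning described above, using only the standard library. The `ModelProfile` here is a cut-down stand-in with assumed field types, the function name drops the leading underscore, and the sketch catches `Exception` rather than using a bare `except`; none of this is the actual `langchain-core` code.

```python
import warnings
from typing import Any, TypedDict


class ModelProfile(TypedDict, total=False):
    """Stand-in for the real TypedDict; field types are assumptions."""

    name: str
    max_input_tokens: int
    temperature: bool


def warn_unknown_profile_keys(profile: dict[str, Any]) -> None:
    """Emit a UserWarning listing keys that ModelProfile does not declare."""
    try:
        declared = set(ModelProfile.__annotations__)
        unknown = sorted(set(profile) - declared)
    except Exception:
        # Introspection problems must never break model construction.
        return
    if unknown:
        warnings.warn(
            f"Unrecognized profile keys: {unknown}. "
            "Upgrading langchain-core may add support for them.",
            UserWarning,
            stacklevel=2,
        )
```

Calling `warn_unknown_profile_keys({"name": "m", "brand_new_key": 1})` warns about `brand_new_key`, while a profile containing only declared keys stays silent.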

### `langchain-profiles` CLI
- Add `_warn_undeclared_profile_keys()` to the CLI (`cli.py`), called
after merging augmentations in `refresh()` — warns at profile-generation
time (not just runtime) when emitted keys aren't declared in
`ModelProfile`. Gracefully skips if `langchain-core` isn't installed
- Add guard test
`test_model_data_to_profile_keys_subset_of_model_profile` in
model-profiles — feeds a fully-populated model dict to
`_model_data_to_profile()` and asserts every emitted key exists in
`ModelProfile.__annotations__`. CI fails before any release if someone
adds a CLI field without updating the TypedDict
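
The guard test can be sketched roughly as follows. The `ModelProfile` fields and types, the converter body, and the `undeclared_keys` helper are hypothetical stand-ins; only the idea (every emitted key must appear in `ModelProfile.__annotations__`) comes from the description above.

```python
from typing import Any, TypedDict


class ModelProfile(TypedDict, total=False):
    """Stand-in for the real TypedDict; field types are assumptions."""

    name: str
    release_date: str
    open_weights: bool


def model_data_to_profile(model_data: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical converter: copy known fields, dropping missing ones."""
    profile = {
        "name": model_data.get("name"),
        "release_date": model_data.get("release_date"),
        "open_weights": model_data.get("open_weights"),
    }
    return {key: value for key, value in profile.items() if value is not None}


def undeclared_keys(profile: dict[str, Any]) -> set[str]:
    """Return the keys in `profile` that ModelProfile does not declare."""
    return set(profile) - set(ModelProfile.__annotations__)


def test_profile_keys_subset_of_model_profile() -> None:
    # A fully populated input exercises every field the converter can emit.
    fully_populated = {
        "name": "example-model",
        "release_date": "2025-01-01",
        "open_weights": True,
    }
    emitted = model_data_to_profile(fully_populated)
    assert undeclared_keys(emitted) == set()
```

If the converter starts emitting a key that the TypedDict lacks, `undeclared_keys` returns it and the assertion fails.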

### Partner packages
- Migrate all 10 in-repo partners to the `_resolve_model_profile()`
hook, replacing duplicated `@model_validator` / `_set_model_profile`
overrides: anthropic, deepseek, fireworks, groq, huggingface, mistralai,
openai (base + azure), openrouter, perplexity, xai
- Anthropic retains custom logic (context-1m beta → `max_input_tokens`
override); all others reduce to a one-liner
- Add `pr_lint.yml` scope for the new `model-profiles` package
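
The shape of the hook migration, sketched without Pydantic. `BaseChatModelSketch`, `ExampleChatModel`, and the `_PROFILES` table are hypothetical, and the rule that an explicitly passed profile wins over the hook is an assumption of this sketch rather than something stated above.

```python
from typing import Any, Optional

# Hypothetical profile table; real partners ship generated data instead.
_PROFILES: dict[str, dict[str, Any]] = {
    "example-model": {"max_input_tokens": 8192},
}


class BaseChatModelSketch:
    """Simplified stand-in for BaseChatModel (plain class, no validators)."""

    def __init__(
        self, model: str, profile: Optional[dict[str, Any]] = None
    ) -> None:
        self.model = model
        self.profile = profile
        self._set_model_profile()

    def _resolve_model_profile(self) -> Optional[dict[str, Any]]:
        """Hook for partners; the base class has no profile data."""
        return None

    def _set_model_profile(self) -> None:
        """Base step: call the hook only when no profile was supplied."""
        if self.profile is None:
            self.profile = self._resolve_model_profile()


class ExampleChatModel(BaseChatModelSketch):
    def _resolve_model_profile(self) -> Optional[dict[str, Any]]:
        # The partner-side override reduces to a single lookup.
        return _PROFILES.get(self.model)
```

With this layout a partner never redefines `_set_model_profile`, so any checks the base class attaches around it keep running.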
2026-03-23 00:44:27 -04:00
Mason Daugherty
5d9568b5f5 feat(model-profiles): new fields + Makefile target (#35788)
Extract additional fields from models.dev into `_model_data_to_profile`:
`name`, `status`, `release_date`, `last_updated`, `open_weights`,
`attachment`, `temperature`

Move the model profile refresh logic from an inline bash script in the
GitHub Actions workflow into a `make refresh-profiles` target in
`libs/model-profiles/Makefile`. This makes it runnable locally with a
single command and keeps the provider map in one place instead of
duplicated between CI and developer docs.
2026-03-12 13:56:25 +00:00
Mason Daugherty
70192690b1 fix(model-profiles): sort generated profiles by model ID for stable diffs (#35344)
- Sort model profiles alphabetically by model ID (the top-level
`_PROFILES` dictionary keys, e.g. `claude-3-5-haiku-20241022`,
`gpt-4o-mini`) before writing `_profiles.py`, so that regenerating
profiles only shows actual data changes in diffs — not random reordering
from the models.dev API response order
- Regenerate all 10 partner profile files with the new sorted ordering
2026-02-19 23:11:22 -05:00
Mason Daugherty
4ca586b322 feat(model-profiles): add text_inputs and text_outputs (#35084)
- Add `text_inputs` and `text_outputs` fields to `ModelProfile`
- Regenerate `_profiles.py` for all providers

## Why

models.dev data includes `'text'` as both an input and output modality,
but we didn't capture it.

models.dev broadly contains models without text input (Whisper/ASR) and
without text output (image generators, TTS).

Without this, downstream consumers can't filter on model text support
(e.g. preventing users from passing text input to an audio-only model).
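
A sketch of the downstream filtering this enables. The model IDs and the `accepts_text_input` helper are made up, the fields are assumed to be booleans, and treating a missing profile as "allowed" is a design choice of this sketch only.

```python
from typing import TypedDict


class ModelProfile(TypedDict, total=False):
    """Stand-in showing only the two new fields; bool types are assumed."""

    text_inputs: bool
    text_outputs: bool


# Hypothetical data; real profiles are generated from models.dev.
_PROFILES: dict[str, ModelProfile] = {
    "example-chat-model": {"text_inputs": True, "text_outputs": True},
    "example-asr-model": {"text_inputs": False, "text_outputs": True},
    "example-tts-model": {"text_inputs": True, "text_outputs": False},
}


def accepts_text_input(model_id: str) -> bool:
    """Return False only when a profile explicitly rules out text input."""
    return _PROFILES.get(model_id, {}).get("text_inputs", True)


def text_output_models() -> list[str]:
    """List model IDs whose profile declares text output support."""
    return sorted(
        model_id
        for model_id, profile in _PROFILES.items()
        if profile.get("text_outputs", False)
    )
```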

---

We'd also need to run this for Google and AWS, and cut releases for all
packages, for the change to propagate.
2026-02-09 14:50:09 -05:00
ccurme
33e5d01f7c feat(model-profiles): distribute data across packages (#34024) 2025-11-21 15:47:05 -05:00
ccurme
7a3827471b fix(model-profiles): fix pdf_inputs field (#33797) 2025-11-03 11:10:33 -05:00
ccurme
424214041e feat(model-profiles): support more providers (#33766) 2025-10-31 13:48:56 -04:00
ccurme
493be259c3 feat(core): mint langchain-model-profiles and add profile property to BaseChatModel (#33728) 2025-10-31 09:44:46 -04:00