Commit Graph

3 Commits

Author SHA1 Message Date
Mason Daugherty
948f6cc58c feat(core,partners): add package version tracking to tracing metadata (#35295)
Following on the heels of #35293

TODO:
- Packages outside of this repo (e.g. LiteLLM, Nvidia, Google, AWS)

---

## Summary

Surface partner package versions in `metadata.versions` on LangSmith
traces. Mirrors the JS SDK's `_addVersion()` pattern
([langchainjs#10106](https://github.com/langchain-ai/langchainjs/pull/10106)).

Each model constructor records its package version via `_add_version()`
on `BaseLanguageModel`. The version dict accumulates through the class
hierarchy — `langchain-core` is added in
`BaseLanguageModel.model_post_init`, `langchain-openai` in
`BaseChatOpenAI._set_openai_chat_version`, and each leaf partner in its
uniquely-named `model_validator`. Traces end up with:

```json
{
  "metadata": {
    "versions": {
      "langchain-core": "1.4.5",
      "langchain-openai": "1.3.0",
      "langchain-xai": "1.2.2"
    }
  }
}
```

### Changes

- `BaseLanguageModel._add_version(pkg, version)` — appends to
`self.metadata["versions"]`; accepts any `Mapping` type; emits a warning
if a non-mapping value is found and replaced
- `BaseLanguageModel.model_post_init` — adds `langchain-core` version;
calls `super()` for MRO safety
- `_merge_metadata_dicts` — one-level-deep (non-recursive) merge for
nested dict metadata keys
- `CallbackManager.add_metadata` — uses `_merge_metadata_dicts` instead
of flat `dict.update()` so nested metadata dicts (like `versions`)
coexist rather than clobber
- `merge_configs` — uses `_merge_metadata_dicts` for config merging

**Partners:**
- Each now calls `self._add_version("langchain-<pkg>", __version__)`

### Design decisions

- **Constructor-based, not `_get_ls_params`-based** — versions flow
through `self.metadata` (local metadata on traces), not through
`LangSmithParams`. This matches JS and makes child-class version
inheritance automatic (no merge/clobber issues).
- **`versions` is local (non-inheritable) metadata** — `self.metadata`
is passed to `CallbackManager.configure` as `local_metadata`
(`add_metadata(..., inherit=False)`), so `versions` is attached **once
per chat-model run** and is **not** propagated to child runs or
duplicated onto every streaming chunk. This is intentionally the
opposite of the inheritable-per-chunk metadata that #36588 was reducing
for performance — `versions` does not regress that path.
- **`add_metadata` deep-merge is a correctness fix, not just for
versions** — previously `add_metadata`/`merge_configs` did a flat
top-level `dict.update`/spread, so any nested metadata dict baked into a
config (e.g. via `.with_config({"metadata": {...}})`) would be wholly
replaced when a caller also passed `metadata`. `_merge_metadata_dicts`
merges one level deep so user-provided `config.metadata.versions` and
model-set `versions` coexist instead of clobbering. The merge runs once
per `configure` (not per chunk), so it is off the streaming hot path.
- **One level deep only** — `_merge_metadata_dicts` is deliberately
*not* a recursive deep merge; values nested more than one level are
last-writer-wins. This covers the `versions` case without the
ambiguity/cost of arbitrary-depth merging.
- **Warn on non-dict `metadata["versions"]`** — if a user sets
`metadata={"versions": "some-string"}`, `_add_version` emits a warning
and replaces the value with the version dict rather than silently
discarding user data or crashing. This is a soft breaking change for
anyone who previously stored non-dict values at this key.

### Follow-ups (tracked separately, out of scope here)

- JS `mergeConfigs` still flat-spreads nested metadata, so
`metadata.versions` can still clobber on the JS side until an equivalent
deep-merge lands.

---

Made by [Open SWE](https://openswe.vercel.app)

---------

Co-authored-by: open-swe[bot] <open-swe@users.noreply.github.com>
2026-06-11 02:23:19 -04:00
Mason Daugherty
07fa576de1 ci: avoid unnecessary dep installs in lint targets (#36046)
CI lint jobs use `uv run --all-groups` for all tools, but ruff doesn't
need dependency resolution — only mypy does. By splitting into
`UV_RUN_LINT` (ruff) and `UV_RUN_TYPE` (mypy), the CI-facing targets run
ruff with `--group lint` only, giving fast-fail feedback before mypy
triggers the full environment sync.

For packages where source code only conditionally imports heavy deps
(text-splitters, huggingface), `lint_package` also overrides
`UV_RUN_TYPE` to `--group lint --group typing`, skipping the ~3.5GB
`test_integration` download entirely. `lint_tests` keeps `--all-groups`
since test code legitimately imports those deps.

Additionally, `lint_imports.sh` was inconsistently wired — most packages
had the script but weren't calling it.

## Changes

**Makefile optimization**
- Introduce `UV_RUN_LINT` and `UV_RUN_TYPE` Make variables, both
defaulting to `uv run --all-groups`. For `lint_package` and
`lint_tests`, `UV_RUN_LINT` is overridden to `uv run --group lint` so
ruff runs instantly without syncing heavy deps
- For `text-splitters` and `huggingface`, override `UV_RUN_TYPE` on
`lint_package` to `uv run --group lint --group typing` — mypy runs
without downloading torch, CUDA, spacy, etc.

**mypy config for lean groups**
- Add `transformers` and `transformers.*` to `ignore_missing_imports` in
`text-splitters` pyproject.toml (conditional `try/except` import, same
treatment as existing `konlpy`/`nltk` entries)
- Add `torch`, `torch.*`, `langchain_community`, `langchain_community.*`
to `ignore_missing_imports` in `huggingface` pyproject.toml
- Add dual `# type: ignore[unreachable, unused-ignore]` in
`text-splitters/base.py` to handle the `PreTrainedTokenizerBase`
isinstance check that behaves differently depending on whether
transformers is installed

**lint_imports.sh consistency**
- Add `./scripts/lint_imports.sh` to the lint recipe in every package
that wasn't calling it (standard-tests, model-profiles, all 15
partners), and create the script for the two packages missing it
entirely (`model-profiles`, `openrouter`)
- Update all `lint_imports.sh` scripts to allow `from langchain.agents`
and `from langchain.tools` imports (legitimate v1 middleware
dependencies used by `langchain-anthropic` and `langchain-openai`)
2026-03-17 21:23:29 -04:00
Sydney Runkle
3814bd1ea7 partners: Add Perplexity Chat Integration (#30618)
Perplexity's importance in the space has been growing, so we think it's
time to add an official integration!

Note: following the release of `langchain-perplexity` to `pypi`, we
should be able to add `perplexity` as an extra in
`libs/langchain/pyproject.toml`, but we're blocked by a circular import
for now.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-04-03 16:09:14 +00:00