Compare commits

..

11 Commits

Author SHA1 Message Date
ccurme
9f6af21ce4 release(openai): 1.1.16 (#36927) 2026-04-21 15:04:23 -04:00
Mason Daugherty
488c6a73bb fix(openai): tolerate prompt_cache_retention drift in streaming (#36925) 2026-04-21 14:54:32 -04:00
Mason Daugherty
acc54987fa chore: update PR template (#36918) 2026-04-21 12:08:20 -04:00
Mason Daugherty
f5f715985a chore: rework PR title and description guidance (#36917)
Rework the PR and commit guidance in the agent guidelines so new
contributors (human and AI) produce descriptions and titles that age
well.
2026-04-21 12:03:54 -04:00
langchain-model-profile-bot[bot]
46df8365f2 chore(model-profiles): refresh model profile data (#36911)
Automated refresh of model profile data for all in-monorepo partner
integrations via `langchain-profiles refresh`.

🤖 Generated by the `refresh_model_profiles` workflow.

Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
2026-04-21 10:08:58 -04:00
Kanav Bansal
fb6ab993a7 docs(langchain): correct import path in ModelCallLimitMiddleware docstring (#36895)
## Summary

Updates the example in `ModelCallLimitMiddleware` docstring to use the
correct import path. The previous import referenced a non-existent
module, which could cause confusion for users following the
documentation.
2026-04-21 01:43:27 -04:00
Jacob Lee
40026a7282 feat(core): Update inheritance behavior for tracer metadata for special keys (#36900)
JS equivalent: https://github.com/langchain-ai/langchainjs/pull/10733
2026-04-20 14:58:01 -07:00
ccurme
37f0b37f1c release(openai): 1.1.15 (#36901) 2026-04-20 15:47:20 -04:00
ccurme
19b0805bc1 fix(openai): accommodate dict response items in streaming (#36899) 2026-04-20 15:44:01 -04:00
Thomas
8fec4e7cee fix(openai): infer azure chat profiles from model name (#36858) 2026-04-19 11:06:26 -04:00
langchain-model-profile-bot[bot]
02991cb4cf chore(model-profiles): refresh model profile data (#36864)
Automated refresh of model profile data for all in-monorepo partner
integrations via `langchain-profiles refresh`.

🤖 Generated by the `refresh_model_profiles` workflow.

Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
2026-04-18 15:32:37 -05:00
17 changed files with 555 additions and 232 deletions

View File

@@ -1,6 +1,8 @@
Fixes #
<!-- Replace everything above this line with a 1-2 sentence description of your change. Keep the "Fixes #xx" keyword and update the issue number. -->
---
<!-- Keep the `Fixes #xx` keyword at the very top and update the issue number — this auto-closes the issue on merge. Replace this comment with a 1-2 sentence description of your change. No `# Summary` header; the description is the summary. -->
Read the full contributing guidelines: https://docs.langchain.com/oss/python/contributing/overview

View File

@@ -79,23 +79,48 @@ uv run --group lint mypy .
- uv.lock: Locked dependencies for reproducible builds
- Makefile: Development tasks
#### Commit standards
#### PR and commit titles
Suggest PR titles that follow Conventional Commits format. Refer to .github/workflows/pr_lint for allowed types and scopes. Note that all commit/PR titles should be in lowercase with the exception of proper nouns/named entities. All PR titles should include a scope with no exceptions. For example:
Follow Conventional Commits. See `.github/workflows/pr_lint.yml` for allowed types and scopes. All titles must include a scope with no exceptions — even for the main `langchain` package.
- Start the text after `type(scope):` with a lowercase letter, unless the first word is a proper noun (e.g. `Azure`, `GitHub`, `OpenAI`) or a named entity (class, function, method, parameter, or variable name).
- Wrap named entities in backticks so they render as code. Proper nouns are left unadorned.
- Keep titles short and descriptive — save detail for the body.
Examples:
```txt
feat(langchain): add new chat completion feature
fix(core): resolve type hinting issue in vector store
chore(anthropic): update infrastructure dependencies
feat(langchain): `ls_agent_type` tag on `create_agent` calls
fix(openai): infer Azure chat profiles from model name
```
Note how `feat(langchain)` includes a scope even though it is the main package and name of the repo.
#### PR descriptions
#### Pull request guidelines
The description *is* the summary — do not add a `# Summary` header.
- Always add a disclaimer to the PR description mentioning how AI agents are involved with the contribution.
- Describe the "why" of the changes, why the proposed solution is the right one. Limit prose.
- Highlight areas of the proposed changes that require careful review.
- When the PR closes an issue, lead with the closing keyword on its own line at the very top, followed by a horizontal rule and then the body:
```txt
Closes #123
---
<rest of description>
```
Only `Closes`, `Fixes`, and `Resolves` auto-close the referenced issue on merge. `Related:` or similar labels are informational and do not close anything.
- Explain the *why*: the motivation and why this solution is the right one. Limit prose.
- Write for readers who may be unfamiliar with this area of the codebase. Avoid insider shorthand and prefer language that is friendly to public viewers — this aids interpretability.
- Do **not** cite line numbers; they go stale as soon as the file changes.
- Rarely include full file paths or filenames. Reference the affected symbol, class, or subsystem by name instead.
- Wrap class, function, method, parameter, and variable names in backticks.
- Skip dedicated "Test plan" or "Testing" sections in most cases. Mention tests only when coverage is non-obvious, risky, or otherwise notable.
- Call out areas of the change that require careful review.
- Add a brief disclaimer noting AI-agent involvement in the contribution.
## Core development principles
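The title rules above can be sketched as a small structural check. The type list below is illustrative (the `pr_lint` workflow defines the real allowed types and scopes), and proper-noun casing still needs a human eye:

```python
import re

# Illustrative subset; the pr_lint workflow defines the real allowed values.
TYPES = {"feat", "fix", "chore", "docs", "release"}

# type(scope): description -- the scope is mandatory, no exceptions.
TITLE_RE = re.compile(r"^(?P<type>[a-z]+)\((?P<scope>[a-z0-9_-]+)\): .+$")


def title_is_well_formed(title: str) -> bool:
    """Check only the mechanical parts of the title rules."""
    match = TITLE_RE.match(title)
    return bool(match and match.group("type") in TYPES)
```

A capitalized type or a missing scope fails the structural check, matching the "scope with no exceptions" rule.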

View File

@@ -79,23 +79,48 @@ uv run --group lint mypy .
- uv.lock: Locked dependencies for reproducible builds
- Makefile: Development tasks
#### Commit standards
#### PR and commit titles
Suggest PR titles that follow Conventional Commits format. Refer to .github/workflows/pr_lint for allowed types and scopes. Note that all commit/PR titles should be in lowercase with the exception of proper nouns/named entities. All PR titles should include a scope with no exceptions. For example:
Follow Conventional Commits. See `.github/workflows/pr_lint.yml` for allowed types and scopes. All titles must include a scope with no exceptions — even for the main `langchain` package.
- Start the text after `type(scope):` with a lowercase letter, unless the first word is a proper noun (e.g. `Azure`, `GitHub`, `OpenAI`) or a named entity (class, function, method, parameter, or variable name).
- Wrap named entities in backticks so they render as code. Proper nouns are left unadorned.
- Keep titles short and descriptive — save detail for the body.
Examples:
```txt
feat(langchain): add new chat completion feature
fix(core): resolve type hinting issue in vector store
chore(anthropic): update infrastructure dependencies
feat(langchain): `ls_agent_type` tag on `create_agent` calls
fix(openai): infer Azure chat profiles from model name
```
Note how `feat(langchain)` includes a scope even though it is the main package and name of the repo.
#### PR descriptions
#### Pull request guidelines
The description *is* the summary — do not add a `# Summary` header.
- Always add a disclaimer to the PR description mentioning how AI agents are involved with the contribution.
- Describe the "why" of the changes, why the proposed solution is the right one. Limit prose.
- Highlight areas of the proposed changes that require careful review.
- When the PR closes an issue, lead with the closing keyword on its own line at the very top, followed by a horizontal rule and then the body:
```txt
Closes #123
---
<rest of description>
```
Only `Closes`, `Fixes`, and `Resolves` auto-close the referenced issue on merge. `Related:` or similar labels are informational and do not close anything.
- Explain the *why*: the motivation and why this solution is the right one. Limit prose.
- Write for readers who may be unfamiliar with this area of the codebase. Avoid insider shorthand and prefer language that is friendly to public viewers — this aids interpretability.
- Do **not** cite line numbers; they go stale as soon as the file changes.
- Rarely include full file paths or filenames. Reference the affected symbol, class, or subsystem by name instead.
- Wrap class, function, method, parameter, and variable names in backticks.
- Skip dedicated "Test plan" or "Testing" sections in most cases. Mention tests only when coverage is non-obvious, risky, or otherwise notable.
- Call out areas of the change that require careful review.
- Add a brief disclaimer noting AI-agent involvement in the contribution.
## Core development principles

View File

@@ -36,6 +36,22 @@ logger = logging.getLogger(__name__)
_LOGGED = set()
_EXECUTOR: ThreadPoolExecutor | None = None
OVERRIDABLE_LANGSMITH_INHERITABLE_METADATA_KEYS: frozenset[str] = frozenset(
{"ls_agent_type"}
)
"""Allowlist of LangSmith-only tracing metadata keys that bypass the default
"first wins" merge semantics used when propagating tracer metadata to nested
runs.
Keys in this set are ALWAYS overridden by the nearest enclosing tracer config,
so nested callers (e.g. a subagent) can replace a value inherited from an
ancestor.
Keep this list very small: every key here loses the default "first wins"
protection and is always clobbered by the nearest enclosing tracer config.
Only keys that are strictly for LangSmith tracing bookkeeping should be added.
"""
def log_error_once(method: str, exception: Exception) -> None:
"""Log an error once.
@@ -176,7 +192,16 @@ class LangChainTracer(BaseTracer):
else:
merged_metadata = dict(base_metadata)
for key, value in metadata.items():
if key not in merged_metadata:
# For allowlisted LangSmith-only inheritable metadata keys
# (e.g. ``ls_agent_type``), nested callers are allowed to
# OVERRIDE the value inherited from an ancestor. For all
# other keys we keep the existing "first wins" behavior so
# that ancestor-provided tracing metadata is not accidentally
# clobbered by child runs.
if (
key not in merged_metadata
or key in OVERRIDABLE_LANGSMITH_INHERITABLE_METADATA_KEYS
):
merged_metadata[key] = value
merged_tags = sorted(set(self.tags + tags)) if tags else self.tags
@@ -448,7 +473,16 @@ def _patch_missing_metadata(self: LangChainTracer, run: Run) -> None:
metadata = run.metadata
patched = None
for k, v in self.tracing_metadata.items():
if k not in metadata:
# ``OVERRIDABLE_LANGSMITH_INHERITABLE_METADATA_KEYS`` are a small,
# LangSmith-only allowlist that bypasses the "first wins" merge
# so a nested caller (e.g. a subagent) can override a parent-set value.
if k not in metadata or k in OVERRIDABLE_LANGSMITH_INHERITABLE_METADATA_KEYS:
# Skip the copy when the value already matches (avoids cloning
# the shared dict in the common "already set" case). Use a
# ``k in metadata`` guard so a legitimate missing key whose
# tracer value happens to be ``None`` is still patched in.
if k in metadata and metadata[k] == v:
continue
if patched is None:
# Copy on first miss to avoid mutating the shared dict.
patched = {**metadata}
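Stripped of tracer plumbing, the merge semantics in this hunk reduce to the following standalone sketch (the helper name is hypothetical; the allowlist constant mirrors the diff):

```python
OVERRIDABLE_KEYS = frozenset({"ls_agent_type"})


def merge_metadata(existing: dict, incoming: dict) -> dict:
    """"First wins" merge: keys already present are kept, except allowlisted
    LangSmith-only keys, which the incoming config always overrides.
    """
    merged = dict(existing)
    for key, value in incoming.items():
        if key not in merged or key in OVERRIDABLE_KEYS:
            merged[key] = value
    return merged
```

A regular key such as `env` keeps its first value, while `ls_agent_type` is replaced by the nested caller.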

View File

@@ -807,6 +807,28 @@ class TestPatchMissingMetadata:
assert run.metadata["env"] == "staging"
assert run.metadata["extra"] == "from_tracer"
def test_allowlisted_key_overrides_existing_run_metadata(self) -> None:
"""Allowlisted LangSmith keys override existing run metadata."""
tracer = self._make_tracer(metadata={"ls_agent_type": "subagent"})
run = self._make_run(metadata={"ls_agent_type": "root", "other": "keep"})
_patch_missing_metadata(tracer, run)
assert run.metadata["ls_agent_type"] == "subagent"
assert run.metadata["other"] == "keep"
def test_allowlisted_key_noop_when_values_match(self) -> None:
"""Allowlisted keys do not clone run metadata when the value is unchanged."""
original = {"ls_agent_type": "root"}
tracer = self._make_tracer(metadata={"ls_agent_type": "root"})
run = self._make_run(metadata=original)
_patch_missing_metadata(tracer, run)
# No-op: the shared dict should not be replaced with a copy.
assert run.extra["metadata"] is original
assert run.metadata == {"ls_agent_type": "root"}
class TestTracerMetadataCloning:
"""Tests for LangChainTracer metadata cloning helpers."""
@@ -901,3 +923,29 @@ class TestTracerMetadataCloning:
if copied.tracing_metadata is not None
}
assert copied_services == {"api", "worker"}
def test_copy_with_metadata_defaults_regular_keys_first_wins(self) -> None:
"""Regular (non-allowlisted) metadata keys keep "first wins" semantics."""
tracer = self._make_tracer(metadata={"env": "staging", "service": "orig"})
copied = tracer.copy_with_metadata_defaults(
metadata={"env": "prod", "service": "new"},
)
assert copied.tracing_metadata == {"env": "staging", "service": "orig"}
def test_copy_with_metadata_defaults_allowlisted_key_overrides(self) -> None:
"""Allowlisted LangSmith keys are overridden by nested caller metadata."""
tracer = self._make_tracer(
metadata={"ls_agent_type": "root", "env": "staging"},
)
copied = tracer.copy_with_metadata_defaults(
metadata={"ls_agent_type": "subagent", "env": "prod"},
)
# Allowlisted key is overridden, non-allowlisted keeps first-wins.
assert copied.tracing_metadata == {
"ls_agent_type": "subagent",
"env": "staging",
}

View File

@@ -896,11 +896,9 @@ def create_agent(
wrap_tool_call_wrapper = None
if middleware_w_wrap_tool_call:
wrappers = [
traceable(
name=f"{m.name}.wrap_tool_call",
process_inputs=_scrub_inputs,
metadata={"ls_agent_type": "middleware"},
)(m.wrap_tool_call)
traceable(name=f"{m.name}.wrap_tool_call", process_inputs=_scrub_inputs)(
m.wrap_tool_call
)
for m in middleware_w_wrap_tool_call
]
wrap_tool_call_wrapper = _chain_tool_call_wrappers(wrappers)
@@ -919,11 +917,9 @@ def create_agent(
awrap_tool_call_wrapper = None
if middleware_w_awrap_tool_call:
async_wrappers = [
traceable(
name=f"{m.name}.awrap_tool_call",
process_inputs=_scrub_inputs,
metadata={"ls_agent_type": "middleware"},
)(m.awrap_tool_call)
traceable(name=f"{m.name}.awrap_tool_call", process_inputs=_scrub_inputs)(
m.awrap_tool_call
)
for m in middleware_w_awrap_tool_call
]
awrap_tool_call_wrapper = _chain_async_tool_call_wrappers(async_wrappers)
@@ -1009,11 +1005,9 @@ def create_agent(
wrap_model_call_handler = None
if middleware_w_wrap_model_call:
sync_handlers = [
traceable(
name=f"{m.name}.wrap_model_call",
process_inputs=_scrub_inputs,
metadata={"ls_agent_type": "middleware"},
)(m.wrap_model_call)
traceable(name=f"{m.name}.wrap_model_call", process_inputs=_scrub_inputs)(
m.wrap_model_call
)
for m in middleware_w_wrap_model_call
]
wrap_model_call_handler = _chain_model_call_handlers(sync_handlers)
@@ -1022,11 +1016,9 @@ def create_agent(
awrap_model_call_handler = None
if middleware_w_awrap_model_call:
async_handlers = [
traceable(
name=f"{m.name}.awrap_model_call",
process_inputs=_scrub_inputs,
metadata={"ls_agent_type": "middleware"},
)(m.awrap_model_call)
traceable(name=f"{m.name}.awrap_model_call", process_inputs=_scrub_inputs)(
m.awrap_model_call
)
for m in middleware_w_awrap_model_call
]
awrap_model_call_handler = _chain_async_model_call_handlers(async_handlers)
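The `_chain_*` helpers compose a list of per-middleware wrappers into a single wrapper. Their exact contract is not shown in this diff, so the following is a generic sketch of the chaining pattern, with the first wrapper outermost as an assumption:

```python
from typing import Callable

Handler = Callable[[str], str]
Wrapper = Callable[[str, Handler], str]


def chain_wrappers(wrappers: list[Wrapper]) -> Wrapper:
    """Compose wrappers so wrappers[0] runs outermost (assumed ordering)."""

    def chained(request: str, handler: Handler) -> str:
        def call_at(i: int, req: str) -> str:
            if i == len(wrappers):
                return handler(req)  # innermost: the real call
            return wrappers[i](req, lambda r: call_at(i + 1, r))

        return call_at(0, request)

    return chained
```

Each wrapper receives the request plus a continuation, so it can transform inputs, short-circuit, or post-process the result.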

View File

@@ -108,7 +108,7 @@ class ModelCallLimitMiddleware(
Example:
```python
from langchain.agents.middleware.call_tracking import ModelCallLimitMiddleware
from langchain.agents.middleware import ModelCallLimitMiddleware
from langchain.agents import create_agent
# Create middleware with limits

View File

@@ -29,19 +29,11 @@ from langsmith import Client
from langsmith.run_helpers import tracing_context
from langchain.agents import create_agent
from langchain.agents.middleware.types import (
AgentMiddleware,
AgentState,
ModelCallResult,
ModelRequest,
ModelResponse,
)
from langchain.agents.middleware.types import AgentMiddleware, AgentState
from langchain.tools import InjectedState, ToolRuntime
from tests.unit_tests.agents.model import FakeToolCallingModel
if TYPE_CHECKING:
from collections.abc import Callable
from langgraph.runtime import Runtime
@@ -857,47 +849,54 @@ async def test_combined_injected_state_runtime_store_async() -> None:
assert injected_data["store_write_success"] is True
# ---------------------------------------------------------------------------
# ls_agent_type tracing metadata
# ---------------------------------------------------------------------------
def test_ls_agent_type_is_trace_only_metadata() -> None:
"""Test that ls_agent_type is added to metadata on tracing only, not in streamed chunks."""
# Capture metadata from regular callback handler (simulates streamed metadata)
captured_callback_metadata: list[dict[str, Any]] = []
class CaptureHandler(BaseCallbackHandler):
def on_chain_start(
self,
serialized: dict[str, Any],
inputs: dict[str, Any],
*,
run_id: str,
parent_run_id: str | None = None,
tags: list[str] | None = None,
metadata: dict[str, Any] | None = None,
**kwargs: Any,
) -> None:
captured_callback_metadata.append({"tags": tags, "metadata": metadata})
class _CaptureCallbackHandler(BaseCallbackHandler):
"""Records metadata observed on every ``on_chain_start`` callback."""
def __init__(self) -> None:
self.captured: list[dict[str, Any]] = []
def on_chain_start(
self,
serialized: dict[str, Any],
inputs: dict[str, Any],
*,
run_id: str,
parent_run_id: str | None = None,
tags: list[str] | None = None,
metadata: dict[str, Any] | None = None,
**kwargs: Any,
) -> None:
self.captured.append(
{
"name": kwargs.get("name") or (serialized or {}).get("name"),
"tags": tags,
"metadata": metadata or {},
}
)
def _build_mock_langsmith_client() -> tuple[MagicMock, Client]:
"""Return a (session, client) pair where the session records tracing POSTs."""
# Create a mock client to capture what gets sent to LangSmith
mock_session = MagicMock()
mock_client = Client(session=mock_session, api_key="test", auto_batch_tracing=False)
return mock_session, mock_client
agent = create_agent(
model=FakeToolCallingModel(tool_calls=[[], []]),
tools=[],
system_prompt="You are a helpful assistant.",
)
def _posted_runs(mock_session: MagicMock) -> list[dict[str, Any]]:
"""Extract the run dicts POSTed to the LangSmith API by the mock session."""
posts: list[dict[str, Any]] = []
# Use tracing_context to enable tracing with the mock client
with tracing_context(client=mock_client, enabled=True):
agent.invoke(
{"messages": [HumanMessage("hi?")]},
config={"callbacks": [CaptureHandler()]},
)
# Verify that ls_agent_type is NOT in the regular callback metadata
# (it should only go to the tracer via langsmith_inheritable_metadata)
assert len(captured_callback_metadata) > 0
for captured in captured_callback_metadata:
metadata = captured.get("metadata") or {}
assert metadata.get("ls_agent_type") is None, (
f"ls_agent_type should not be in callback metadata, but got: {metadata}"
)
# Verify that ls_agent_type IS in the tracer metadata (sent to LangSmith)
# Get the POST requests to the LangSmith API
posts = []
for call in mock_session.request.mock_calls:
if call.args and call.args[0] == "POST":
body = json.loads(call.kwargs["data"])
@@ -905,17 +904,21 @@ def _posted_runs(mock_session: MagicMock) -> list[dict[str, Any]]:
posts.extend(body["post"])
else:
posts.append(body)
return posts
assert len(posts) >= 1
# Find the root run (the agent execution)
root_post = posts[0]
metadata = root_post.get("extra", {}).get("metadata", {})
assert metadata.get("ls_agent_type") == "root", (
f"ls_agent_type should be 'root' in tracer metadata, but got: {metadata}"
)
def _run_metadata(post: dict[str, Any]) -> dict[str, Any]:
return post.get("extra", {}).get("metadata", {}) or {}
def test_ls_agent_type_root_is_trace_only_metadata() -> None:
"""``ls_agent_type='root'`` reaches the LangSmith tracer but not callback metadata."""
handler = _CaptureCallbackHandler()
mock_session, mock_client = _build_mock_langsmith_client()
def test_ls_agent_type_is_overridable() -> None:
"""Test that ls_agent_type can be overridden via configurable in invoke config."""
# Create a mock client to capture what gets sent to LangSmith
mock_session = MagicMock()
mock_client = Client(session=mock_session, api_key="test", auto_batch_tracing=False)
agent = create_agent(
model=FakeToolCallingModel(tool_calls=[[], []]),
@@ -923,104 +926,30 @@ def test_ls_agent_type_root_is_trace_only_metadata() -> None:
system_prompt="You are a helpful assistant.",
)
# Use tracing_context to enable tracing with the mock client
with tracing_context(client=mock_client, enabled=True):
agent.invoke(
{"messages": [HumanMessage("hi?")]},
config={"callbacks": [handler]},
config={"configurable": {"ls_agent_type": "subagent", "custom_key": "custom_value"}},
)
# ls_agent_type must not leak into callback metadata.
assert handler.captured, "expected on_chain_start to fire at least once"
for entry in handler.captured:
assert entry["metadata"].get("ls_agent_type") is None, (
f"ls_agent_type leaked into callback metadata: {entry['metadata']}"
)
# Verify that ls_agent_type is overridden and configurable is merged in the tracer metadata
posts = []
for call in mock_session.request.mock_calls:
if call.args and call.args[0] == "POST":
body = json.loads(call.kwargs["data"])
if "post" in body:
posts.extend(body["post"])
else:
posts.append(body)
# ls_agent_type='root' must reach the tracer on the root run.
posts = _posted_runs(mock_session)
assert posts, "expected at least one LangSmith POST"
assert _run_metadata(posts[0]).get("ls_agent_type") == "root"
def test_ls_agent_type_is_overridable_via_configurable() -> None:
"""A caller can override ``ls_agent_type`` (and add keys) via ``configurable``."""
mock_session, mock_client = _build_mock_langsmith_client()
agent = create_agent(
model=FakeToolCallingModel(tool_calls=[[], []]),
tools=[],
system_prompt="You are a helpful assistant.",
assert len(posts) >= 1
root_post = posts[0]
metadata = root_post.get("extra", {}).get("metadata", {})
assert metadata.get("ls_agent_type") == "subagent", (
f"ls_agent_type should be 'subagent' in tracer metadata, but got: {metadata}"
)
with tracing_context(client=mock_client, enabled=True):
agent.invoke(
{"messages": [HumanMessage("hi?")]},
config={
"configurable": {
"ls_agent_type": "subagent",
"custom_key": "custom_value",
}
},
)
posts = _posted_runs(mock_session)
assert posts, "expected at least one LangSmith POST"
root_metadata = _run_metadata(posts[0])
assert root_metadata.get("ls_agent_type") == "subagent"
# Extra configurable keys also flow into tracer metadata.
assert root_metadata.get("custom_key") == "custom_value"
def test_ls_agent_type_middleware_is_trace_only_metadata() -> None:
"""Middleware traceable runs are tagged with ``ls_agent_type='middleware'``.
The tag is attached via the ``metadata=`` argument of langsmith's
``traceable`` decorator, which routes it to ``run.extra.metadata`` for
LangSmith only -- it must not leak into on_chain_start callback metadata.
"""
class PassthroughMiddleware(AgentMiddleware):
name = "test-passthrough"
def wrap_model_call(
self,
request: ModelRequest,
handler: Callable[[ModelRequest], ModelResponse],
) -> ModelCallResult:
return handler(request)
handler = _CaptureCallbackHandler()
mock_session, mock_client = _build_mock_langsmith_client()
agent = create_agent(
model=FakeToolCallingModel(tool_calls=[[], []]),
tools=[],
system_prompt="You are a helpful assistant.",
middleware=[PassthroughMiddleware()],
)
with tracing_context(client=mock_client, enabled=True):
agent.invoke(
{"messages": [HumanMessage("hi?")]},
config={"callbacks": [handler]},
)
# (1) ls_agent_type='middleware' must not leak into callback metadata.
for entry in handler.captured:
assert entry["metadata"].get("ls_agent_type") != "middleware", (
f"ls_agent_type='middleware' leaked into callback metadata for "
f"run {entry['name']!r}: {entry['metadata']}"
)
# (2) ls_agent_type='middleware' must reach the LangSmith tracer, on a run
# named after the middleware's traceable (e.g. 'test-passthrough.wrap_model_call').
posts = _posted_runs(mock_session)
middleware_posts = [p for p in posts if _run_metadata(p).get("ls_agent_type") == "middleware"]
assert middleware_posts, (
f"expected a LangSmith post with ls_agent_type='middleware'; "
f"saw metadatas: {[_run_metadata(p) for p in posts]}"
)
assert any("test-passthrough" in (p.get("name") or "") for p in middleware_posts), (
f"expected a middleware run named like 'test-passthrough.wrap_model_call', "
f"got: {[p.get('name') for p in middleware_posts]}"
# Verify that the additional configurable key is merged into metadata
assert metadata.get("custom_key") == "custom_value", (
f"custom_key should be 'custom_value' in tracer metadata, but got: {metadata}"
)

View File

@@ -377,6 +377,26 @@ _PROFILES: dict[str, dict[str, Any]] = {
"attachment": True,
"temperature": True,
},
"moonshotai/Kimi-K2.6": {
"name": "Kimi-K2.6",
"release_date": "2026-04-20",
"last_updated": "2026-04-20",
"open_weights": True,
"max_input_tokens": 262144,
"max_output_tokens": 262144,
"text_inputs": True,
"image_inputs": True,
"audio_inputs": False,
"video_inputs": True,
"text_outputs": True,
"image_outputs": False,
"audio_outputs": False,
"video_outputs": False,
"reasoning_output": True,
"tool_calling": True,
"attachment": True,
"temperature": True,
},
"zai-org/GLM-4.7": {
"name": "GLM-4.7",
"release_date": "2025-12-22",

View File

@@ -705,6 +705,10 @@ class AzureChatOpenAI(BaseChatOpenAI):
return self
def _resolve_model_profile(self) -> ModelProfile | None:
if (self.model_name is not None) and (
profile := _get_default_model_profile(self.model_name) or None
):
return profile
if self.deployment_name is not None:
return _get_default_model_profile(self.deployment_name) or None
return None

View File

@@ -123,6 +123,7 @@ from pydantic import (
ConfigDict,
Field,
SecretStr,
ValidationError,
model_validator,
)
from pydantic.v1 import BaseModel as BaseModelV1
@@ -4589,6 +4590,36 @@ def _construct_lc_result_from_responses_api(
return ChatResult(generations=[ChatGeneration(message=message)])
def _coerce_chunk_response(resp: Any) -> Any:
# dict `response` items on stream events have been observed in the wild
if isinstance(resp, dict):
from openai.types.responses import Response
# Known mismatch: API emits `prompt_cache_retention="in_memory"` while
# older `openai` packages declare only `"in-memory"` in the Literal
# (openai-python#2883). Pre-normalize so validation succeeds on
# currently-released SDK versions.
if resp.get("prompt_cache_retention") == "in_memory":
resp = {**resp, "prompt_cache_retention": "in-memory"}
try:
return Response.model_validate(resp)
except ValidationError as e:
# API sometimes drifts ahead of the installed SDK's Literal
# declarations. Fall back to a non-validating construct so streams
# still complete, and surface the drift so operators can upgrade.
logger.warning(
"OpenAI Responses payload failed SDK validation "
"(response id=%s); falling back to non-validating construct. "
"This usually means the OpenAI API has drifted ahead of the "
"installed `openai` package. Details: %s",
resp.get("id"),
e,
)
return Response.model_construct(**resp)
return resp
def _convert_responses_chunk_to_generation_chunk(
chunk: Any,
current_index: int, # index in content
@@ -4686,14 +4717,16 @@ def _convert_responses_chunk_to_generation_chunk(
}
)
elif chunk.type == "response.created":
id = chunk.response.id
response_metadata["id"] = chunk.response.id # Backwards compatibility
response = _coerce_chunk_response(chunk.response)
id = response.id
response_metadata["id"] = response.id # Backwards compatibility
elif chunk.type in ("response.completed", "response.incomplete"):
response = _coerce_chunk_response(chunk.response)
msg = cast(
AIMessage,
(
_construct_lc_result_from_responses_api(
chunk.response, schema=schema, output_version=output_version
response, schema=schema, output_version=output_version
)
.generations[0]
.message

View File

@@ -41,31 +41,6 @@ _PROFILES: dict[str, dict[str, Any]] = {
"image_tool_message": True,
"tool_choice": True,
},
"codex-mini-latest": {
"name": "Codex Mini",
"release_date": "2025-05-16",
"last_updated": "2025-05-16",
"open_weights": False,
"max_input_tokens": 200000,
"max_output_tokens": 100000,
"text_inputs": True,
"image_inputs": False,
"audio_inputs": False,
"video_inputs": False,
"text_outputs": True,
"image_outputs": False,
"audio_outputs": False,
"video_outputs": False,
"reasoning_output": True,
"tool_calling": True,
"attachment": True,
"temperature": False,
"image_url_inputs": True,
"pdf_inputs": True,
"pdf_tool_message": True,
"image_tool_message": True,
"tool_choice": True,
},
"gpt-3.5-turbo": {
"name": "GPT-3.5-turbo",
"release_date": "2023-03-01",

View File

@@ -20,10 +20,10 @@ classifiers = [
"Topic :: Scientific/Engineering :: Artificial Intelligence",
]
version = "1.1.14"
version = "1.1.16"
requires-python = ">=3.10.0,<4.0.0"
dependencies = [
"langchain-core>=1.2.31,<2.0.0",
"langchain-core>=1.3.0,<2.0.0",
"openai>=2.26.0,<3.0.0",
"tiktoken>=0.7.0,<1.0.0",
]

View File

@@ -16,6 +16,7 @@ def test_initialize_azure_openai() -> None:
azure_deployment="35-turbo-dev",
openai_api_version="2023-05-15",
azure_endpoint="my-base-url",
api_key=SecretStr("test"),
)
assert llm.deployment_name == "35-turbo-dev"
assert llm.openai_api_version == "2023-05-15"
@@ -45,6 +46,92 @@ def test_initialize_more() -> None:
assert ls_params.get("ls_model_name") == "gpt-35-turbo-0125"
def test_profile_resolves_from_model_name() -> None:
llm = AzureChatOpenAI(
model="gpt-4o",
azure_endpoint="my-base-url",
api_key=SecretStr("test"),
api_version="2023-05-15",
)
assert llm.profile
assert llm.profile["name"] == "GPT-4o"
assert llm.profile["max_input_tokens"] == 128_000
def test_profile_resolves_from_model_name_with_custom_deployment_alias() -> None:
llm = AzureChatOpenAI(
model="gpt-4o",
azure_deployment="35-turbo-dev",
azure_endpoint="my-base-url",
api_key=SecretStr("test"),
api_version="2023-05-15",
)
assert llm.profile
assert llm.profile["name"] == "GPT-4o"
def test_profile_prefers_model_name_over_known_deployment_name() -> None:
llm = AzureChatOpenAI(
model="gpt-4o",
azure_deployment="gpt-4",
azure_endpoint="my-base-url",
api_key=SecretStr("test"),
api_version="2023-05-15",
)
assert llm.profile
assert llm.profile["name"] == "GPT-4o"
def test_profile_falls_back_to_deployment_name_with_unknown_model() -> None:
llm = AzureChatOpenAI(
model="unknown-model",
azure_deployment="gpt-4o",
azure_endpoint="my-base-url",
api_key=SecretStr("test"),
api_version="2023-05-15",
)
assert llm.profile
def test_profile_resolves_from_deployment_name_without_model() -> None:
llm = AzureChatOpenAI(
azure_deployment="gpt-4o",
azure_endpoint="my-base-url",
api_key=SecretStr("test"),
api_version="2023-05-15",
)
assert llm.profile
assert llm.profile["name"] == "GPT-4o"
def test_profile_respects_explicit_profile() -> None:
llm = AzureChatOpenAI(
model="gpt-4o",
azure_endpoint="my-base-url",
api_key=SecretStr("test"),
api_version="2023-05-15",
profile={"tool_calling": False},
)
assert llm.profile == {"tool_calling": False}
def test_profile_is_none_for_unknown_deployment_without_model() -> None:
llm = AzureChatOpenAI(
azure_deployment="unknown-deployment",
azure_endpoint="my-base-url",
api_key=SecretStr("test"),
api_version="2023-05-15",
)
assert llm.profile is None
def test_initialize_azure_openai_with_openai_api_base_set() -> None:
with mock.patch.dict(os.environ, {"OPENAI_API_BASE": "https://api.openai.com"}):
llm = AzureChatOpenAI( # type: ignore[call-arg, call-arg]
@@ -79,6 +166,7 @@ def test_structured_output_old_model() -> None:
azure_deployment="35-turbo-dev",
openai_api_version="2023-05-15",
azure_endpoint="my-base-url",
api_key=SecretStr("test"),
).with_structured_output(Output)
# assert tool calling was used instead of json_schema
@@ -91,6 +179,7 @@ def test_max_completion_tokens_in_payload() -> None:
azure_deployment="o1-mini",
api_version="2024-12-01-preview",
azure_endpoint="my-base-url",
api_key=SecretStr("test"),
model_kwargs={"max_completion_tokens": 300},
)
messages = [HumanMessage("Hello")]
@@ -148,6 +237,7 @@ def test_max_completion_tokens_parameter() -> None:
azure_deployment="gpt-5",
api_version="2024-12-01-preview",
azure_endpoint="my-base-url",
api_key=SecretStr("test"),
max_completion_tokens=1500,
)
messages = [HumanMessage("Hello")]
@@ -165,6 +255,7 @@ def test_max_tokens_converted_to_max_completion_tokens() -> None:
azure_deployment="gpt-5",
api_version="2024-12-01-preview",
azure_endpoint="my-base-url",
api_key=SecretStr("test"),
max_tokens=1000, # type: ignore[call-arg]
)
messages = [HumanMessage("Hello")]


@@ -1,5 +1,6 @@
from __future__ import annotations
import copy
from typing import Any
from unittest.mock import MagicMock, patch
@@ -46,6 +47,8 @@ from openai.types.shared.response_format_text import ResponseFormatText
from langchain_openai import ChatOpenAI
from tests.unit_tests.chat_models.test_base import MockSyncContextManager
MODEL = "gpt-5.4"
responses_stream = [
ResponseCreatedEvent(
response=Response(
@@ -55,7 +58,7 @@ responses_stream = [
incomplete_details=None,
instructions=None,
metadata={},
-model="o4-mini-2025-04-16",
+model=MODEL,
object="response",
output=[],
parallel_tool_calls=True,
@@ -87,7 +90,7 @@ responses_stream = [
incomplete_details=None,
instructions=None,
metadata={},
-model="o4-mini-2025-04-16",
+model=MODEL,
object="response",
output=[],
parallel_tool_calls=True,
@@ -533,7 +536,7 @@ responses_stream = [
incomplete_details=None,
instructions=None,
metadata={},
-model="o4-mini-2025-04-16",
+model=MODEL,
object="response",
output=[
ResponseReasoningItem(
@@ -721,9 +724,7 @@ def _strip_none(obj: Any) -> Any:
],
)
def test_responses_stream(output_version: str, expected_content: list[dict]) -> None:
-llm = ChatOpenAI(
-    model="o4-mini", use_responses_api=True, output_version=output_version
-)
+llm = ChatOpenAI(model=MODEL, use_responses_api=True, output_version=output_version)
mock_client = MagicMock()
def mock_create(*args: Any, **kwargs: Any) -> MockSyncContextManager:
@@ -772,7 +773,7 @@ def test_responses_stream_with_image_generation_multiple_calls() -> None:
{"type": "function", "name": "my_tool", "parameters": {}},
]
llm = ChatOpenAI(
-model="gpt-4o",
+model=MODEL,
use_responses_api=True,
streaming=True,
)
@@ -807,7 +808,7 @@ def test_responses_stream_function_call_preserves_namespace() -> None:
incomplete_details=None,
instructions=None,
metadata={},
-model="gpt-4o-2025-01-01",
+model=MODEL,
object="response",
output=[],
parallel_tool_calls=True,
@@ -837,7 +838,7 @@ def test_responses_stream_function_call_preserves_namespace() -> None:
incomplete_details=None,
instructions=None,
metadata={},
-model="gpt-4o-2025-01-01",
+model=MODEL,
object="response",
output=[],
parallel_tool_calls=True,
@@ -917,7 +918,7 @@ def test_responses_stream_function_call_preserves_namespace() -> None:
incomplete_details=None,
instructions=None,
metadata={},
-model="gpt-4o-2025-01-01",
+model=MODEL,
object="response",
output=[
ResponseFunctionToolCallItem(
@@ -957,9 +958,7 @@ def test_responses_stream_function_call_preserves_namespace() -> None:
),
]
-llm = ChatOpenAI(
-    model="gpt-4o", use_responses_api=True, output_version="responses/v1"
-)
+llm = ChatOpenAI(model=MODEL, use_responses_api=True, output_version="responses/v1")
mock_client = MagicMock()
def mock_create(*args: Any, **kwargs: Any) -> MockSyncContextManager:
@@ -986,3 +985,106 @@ def test_responses_stream_function_call_preserves_namespace() -> None:
assert first_block.get("namespace") == "my_namespace", (
f"Expected namespace 'my_namespace', got {first_block.get('namespace')}"
)

def test_responses_stream_tolerates_dict_response_field() -> None:
"""Regression test for `AttributeError: 'dict' object has no attribute 'id'`.
The OpenAI SDK types `<event>.response` strictly as `Response`, but raw dicts
have been observed in the wild.
"""
stream = copy.deepcopy(responses_stream)
first_event = stream[0]
assert isinstance(first_event, ResponseCreatedEvent)
first_event.response = first_event.response.model_dump(mode="json") # type: ignore[assignment]
assert isinstance(first_event.response, dict)
llm = ChatOpenAI(model=MODEL, use_responses_api=True)
mock_client = MagicMock()
def mock_create(*args: Any, **kwargs: Any) -> MockSyncContextManager:
return MockSyncContextManager(stream)
mock_client.responses.create = mock_create
full: BaseMessageChunk | None = None
with patch.object(llm, "root_client", mock_client):
for chunk in llm.stream("test"):
assert isinstance(chunk, AIMessageChunk)
full = chunk if full is None else full + chunk
assert isinstance(full, AIMessageChunk)
assert full.id == "resp_123"

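The tolerance this test exercises boils down to re-validating raw `dict` payloads into the typed model before any attribute access. A minimal sketch with a much-reduced pydantic stand-in; the helper name and two-field model are illustrative, not the real `_coerce_chunk_response` or the SDK's full `Response`:

```python
from typing import Any, Union

from pydantic import BaseModel


class Response(BaseModel):
    """Much-reduced stand-in for `openai.types.responses.Response`."""

    id: str
    status: str


def coerce_response(value: Union[Response, dict[str, Any]]) -> Response:
    # Raw dicts observed in the wild are validated into the typed model,
    # so downstream code can safely read attributes such as `.id`.
    if isinstance(value, dict):
        return Response.model_validate(value)
    return value
```

With this in place, `coerce_response(event.response).id` works whether the SDK delivered a typed object or a bare dict.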
@pytest.mark.parametrize(
("event_index", "event_type"),
[(0, ResponseCreatedEvent), (46, ResponseCompletedEvent)],
)
def test_responses_stream_normalizes_in_memory_prompt_cache_retention(
event_index: int, event_type: type
) -> None:
"""`prompt_cache_retention="in_memory"` from the API must not abort streams.
The API emits the underscore form while older `openai` packages declare only
`"in-memory"` in the Literal (openai-python#2883). `_coerce_chunk_response`
should normalize so both the `response.created` and `response.completed`
handlers can validate successfully.
"""
stream = copy.deepcopy(responses_stream)
target = stream[event_index]
assert isinstance(target, event_type)
assert isinstance(target, (ResponseCreatedEvent, ResponseCompletedEvent))
dumped = target.response.model_dump(mode="json")
dumped["prompt_cache_retention"] = "in_memory"
target.response = dumped # type: ignore[assignment]
llm = ChatOpenAI(model=MODEL, use_responses_api=True)
mock_client = MagicMock()
def mock_create(*args: Any, **kwargs: Any) -> MockSyncContextManager:
return MockSyncContextManager(stream)
mock_client.responses.create = mock_create
full: BaseMessageChunk | None = None
with patch.object(llm, "root_client", mock_client):
for chunk in llm.stream("test"):
assert isinstance(chunk, AIMessageChunk)
full = chunk if full is None else full + chunk
assert isinstance(full, AIMessageChunk)
assert full.id == "resp_123"
# The completed event drives usage/metadata aggregation, so assert it
# survived coercion when that branch is exercised.
if event_type is ResponseCompletedEvent:
assert full.usage_metadata is not None

def test_responses_stream_tolerates_unknown_literal_drift() -> None:
"""API drift ahead of SDK Literal declarations must not abort streams.
When the API returns a value the installed SDK's Literal does not know
about, `_coerce_chunk_response` should fall back to a non-validating
construct so streaming still completes.
"""
stream = copy.deepcopy(responses_stream)
first_event = stream[0]
assert isinstance(first_event, ResponseCreatedEvent)
dumped = first_event.response.model_dump(mode="json")
dumped["status"] = "something_new"
first_event.response = dumped # type: ignore[assignment]
llm = ChatOpenAI(model=MODEL, use_responses_api=True)
mock_client = MagicMock()
def mock_create(*args: Any, **kwargs: Any) -> MockSyncContextManager:
return MockSyncContextManager(stream)
mock_client.responses.create = mock_create
full: BaseMessageChunk | None = None
with patch.object(llm, "root_client", mock_client):
for chunk in llm.stream("test"):
assert isinstance(chunk, AIMessageChunk)
full = chunk if full is None else full + chunk
assert isinstance(full, AIMessageChunk)
assert full.id == "resp_123"
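Both drift tests suggest the same two-step defense: normalize known value drift first, then fall back to a non-validating construct when the payload still fails validation. A hedged sketch under the same reduced stand-in model as above (the real SDK `Literal` enumerates every status the installed version knows about):

```python
from typing import Any, Literal

from pydantic import BaseModel, ValidationError


class Response(BaseModel):
    """Reduced stand-in for the SDK model; illustrative fields only."""

    id: str
    status: Literal["completed", "incomplete", "in_progress"]


def coerce_chunk_response(payload: dict[str, Any]) -> Response:
    # Known drift: the API emits "in_memory" where older SDKs declare
    # only "in-memory" in their Literal (openai-python#2883).
    if payload.get("prompt_cache_retention") == "in_memory":
        payload = {**payload, "prompt_cache_retention": "in-memory"}
    try:
        return Response.model_validate(payload)
    except ValidationError:
        # Unknown Literal values must not abort the stream: fall back to
        # a non-validating construct and let downstream code proceed.
        return Response.model_construct(**payload)
```

`model_construct` skips validation entirely, so a `status` of `"something_new"` survives intact instead of raising mid-stream.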


@@ -624,7 +624,7 @@ typing = [
[[package]]
name = "langchain-core"
-version = "1.3.0a2"
+version = "1.3.0"
source = { editable = "../../core" }
dependencies = [
{ name = "jsonpatch" },
@@ -684,7 +684,7 @@ typing = [
[[package]]
name = "langchain-openai"
-version = "1.1.14"
+version = "1.1.16"
source = { editable = "." }
dependencies = [
{ name = "langchain-core" },
@@ -1120,7 +1120,7 @@ wheels = [
[[package]]
name = "openai"
-version = "2.29.0"
+version = "2.32.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "anyio" },
@@ -1132,9 +1132,9 @@ dependencies = [
{ name = "tqdm" },
{ name = "typing-extensions" },
]
-sdist = { url = "https://files.pythonhosted.org/packages/b4/15/203d537e58986b5673e7f232453a2a2f110f22757b15921cbdeea392e520/openai-2.29.0.tar.gz", hash = "sha256:32d09eb2f661b38d3edd7d7e1a2943d1633f572596febe64c0cd370c86d52bec", size = 671128, upload-time = "2026-03-17T17:53:49.599Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/ed/59/bdcc6b759b8c42dd73afaf5bf8f902c04b37987a5514dbc1c64dba390fef/openai-2.32.0.tar.gz", hash = "sha256:c54b27a9e4cb8d51f0dd94972ffd1a04437efeb259a9e60d8922b8bd26fe55e0", size = 693286, upload-time = "2026-04-15T22:28:19.434Z" }
wheels = [
-{ url = "https://files.pythonhosted.org/packages/d0/b1/35b6f9c8cf9318e3dbb7146cc82dab4cf61182a8d5406fc9b50864362895/openai-2.29.0-py3-none-any.whl", hash = "sha256:b7c5de513c3286d17c5e29b92c4c98ceaf0d775244ac8159aeb1bddf840eb42a", size = 1141533, upload-time = "2026-03-17T17:53:47.348Z" },
+{ url = "https://files.pythonhosted.org/packages/1e/c1/d6e64ccd0536bf616556f0cad2b6d94a8125f508d25cfd814b1d2db4e2f1/openai-2.32.0-py3-none-any.whl", hash = "sha256:4dcc9badeb4bf54ad0d187453742f290226d30150890b7890711bda4f32f192f", size = 1162570, upload-time = "2026-04-15T22:28:17.714Z" },
]
[[package]]


@@ -167,6 +167,28 @@ _PROFILES: dict[str, dict[str, Any]] = {
"attachment": True,
"temperature": True,
},
"anthropic/claude-opus-4.7": {
"name": "Claude Opus 4.7",
"release_date": "2026-04-16",
"last_updated": "2026-04-16",
"open_weights": False,
"max_input_tokens": 1000000,
"max_output_tokens": 128000,
"text_inputs": True,
"image_inputs": True,
"audio_inputs": False,
"pdf_inputs": True,
"video_inputs": False,
"text_outputs": True,
"image_outputs": False,
"audio_outputs": False,
"video_outputs": False,
"reasoning_output": True,
"tool_calling": True,
"structured_output": True,
"attachment": True,
"temperature": False,
},
"anthropic/claude-sonnet-4": {
"name": "Claude Sonnet 4",
"release_date": "2025-05-22",
@@ -1731,6 +1753,27 @@ _PROFILES: dict[str, dict[str, Any]] = {
"attachment": True,
"temperature": True,
},
"moonshotai/kimi-k2.6": {
"name": "Kimi K2.6",
"release_date": "2026-04-20",
"last_updated": "2026-04-20",
"open_weights": True,
"max_input_tokens": 262144,
"max_output_tokens": 262144,
"text_inputs": True,
"image_inputs": True,
"audio_inputs": False,
"video_inputs": False,
"text_outputs": True,
"image_outputs": False,
"audio_outputs": False,
"video_outputs": False,
"reasoning_output": True,
"tool_calling": True,
"structured_output": True,
"attachment": True,
"temperature": True,
},
"nousresearch/hermes-3-llama-3.1-405b:free": {
"name": "Hermes 3 405B Instruct (free)",
"release_date": "2024-08-16",