release(openai): 1.1.16 (#36927)

fix(openai): tolerate `prompt_cache_retention` drift in streaming (#36925)
chore: update PR template (#36918)
2026-04-21 19:27:58 +00:00 · 2026-04-21 15:04:23 -04:00 · 2026-04-21 14:54:32 -04:00 · 2026-04-21 12:08:20 -04:00 · 2026-04-21 12:03:54 -04:00 · 2026-04-21 10:08:58 -04:00
14 changed files with 446 additions and 41 deletions
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -1,6 +1,8 @@
 Fixes #

-<!-- Replace everything above this line with a 1-2 sentence description of your change. Keep the "Fixes #xx" keyword and update the issue number. -->
+---
+
+<!-- Keep the `Fixes #xx` keyword at the very top and update the issue number — this auto-closes the issue on merge. Replace this comment with a 1-2 sentence description of your change. No `# Summary` header; the description is the summary. -->

 Read the full contributing guidelines: https://docs.langchain.com/oss/python/contributing/overview

--- a/AGENTS.md
+++ b/AGENTS.md
@@ -79,23 +79,48 @@ uv run --group lint mypy .
 - uv.lock: Locked dependencies for reproducible builds
 - Makefile: Development tasks

-#### Commit standards
+#### PR and commit titles

-Suggest PR titles that follow Conventional Commits format. Refer to .github/workflows/pr_lint for allowed types and scopes. Note that all commit/PR titles should be in lowercase with the exception of proper nouns/named entities. All PR titles should include a scope with no exceptions. For example:
+Follow Conventional Commits. See `.github/workflows/pr_lint.yml` for allowed types and scopes. All titles must include a scope with no exceptions — even for the main `langchain` package.
+
+- Start the text after `type(scope):` with a lowercase letter, unless the first word is a proper noun (e.g. `Azure`, `GitHub`, `OpenAI`) or a named entity (class, function, method, parameter, or variable name).
+- Wrap named entities in backticks so they render as code. Proper nouns are left unadorned.
+- Keep titles short and descriptive — save detail for the body.
+
+Examples:

 ```txt
 feat(langchain): add new chat completion feature
 fix(core): resolve type hinting issue in vector store
 chore(anthropic): update infrastructure dependencies
+feat(langchain): `ls_agent_type` tag on `create_agent` calls
+fix(openai): infer Azure chat profiles from model name
 ```

-Note how `feat(langchain)` includes a scope even though it is the main package and name of the repo.
+#### PR descriptions

-#### Pull request guidelines
+The description *is* the summary — do not add a `# Summary` header.

-- Always add a disclaimer to the PR description mentioning how AI agents are involved with the contribution.
-- Describe the "why" of the changes, why the proposed solution is the right one. Limit prose.
-- Highlight areas of the proposed changes that require careful review.
+- When the PR closes an issue, lead with the closing keyword on its own line at the very top, followed by a horizontal rule and then the body:
+
+  ```txt
+  Closes #123
+
+  ---
+
+  <rest of description>
+  ```
+
+  Only `Closes`, `Fixes`, and `Resolves` auto-close the referenced issue on merge. `Related:` or similar labels are informational and do not close anything.
+
+- Explain the *why*: the motivation and why this solution is the right one. Limit prose.
+- Write for readers who may be unfamiliar with this area of the codebase. Avoid insider shorthand and prefer language that is friendly to public viewers — this aids interpretability.
+- Do **not** cite line numbers; they go stale as soon as the file changes.
+- Rarely include full file paths or filenames. Reference the affected symbol, class, or subsystem by name instead.
+- Wrap class, function, method, parameter, and variable names in backticks.
+- Skip dedicated "Test plan" or "Testing" sections in most cases. Mention tests only when coverage is non-obvious, risky, or otherwise notable.
+- Call out areas of the change that require careful review.
+- Add a brief disclaimer noting AI-agent involvement in the contribution.

 ## Core development principles
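
The title rules in the hunk above could be checked mechanically. The following is a minimal sketch only: `check_title`, the regular expression, and the `_PROPER_NOUNS` set are hypothetical names for illustration, and the real allowed types and scopes live in `.github/workflows/pr_lint.yml`, which this sketch does not read.

```python
import re

# Hypothetical pattern: `type(scope): subject`, scope is mandatory.
_TITLE_RE = re.compile(r"^(?P<type>[a-z]+)\((?P<scope>[^()\s]+)\): (?P<subject>\S.*)$")

# Assumed sample of proper nouns; the guidance lists these only as examples.
_PROPER_NOUNS = frozenset({"Azure", "GitHub", "OpenAI"})


def check_title(title: str) -> list[str]:
    """Return a list of problems found in a PR/commit title (empty if none)."""
    match = _TITLE_RE.match(title)
    if match is None:
        return ["title must have the form `type(scope): subject`"]
    first_word = match["subject"].split()[0]
    # Named entities are wrapped in backticks and may keep their own casing.
    if first_word.startswith("`"):
        return []
    if first_word[0].isupper() and first_word not in _PROPER_NOUNS:
        return [f"subject should start lowercase, got {first_word!r}"]
    return []


ok = check_title("feat(langchain): `ls_agent_type` tag on `create_agent` calls")
no_scope = check_title("chore: update PR template")
bad_case = check_title("fix(core): Resolve type hinting issue")
```

Under these assumptions a title without a scope is rejected outright, while a leading backticked identifier or a listed proper noun is accepted.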

--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -79,23 +79,48 @@ uv run --group lint mypy .
 - uv.lock: Locked dependencies for reproducible builds
 - Makefile: Development tasks

-#### Commit standards
+#### PR and commit titles

-Suggest PR titles that follow Conventional Commits format. Refer to .github/workflows/pr_lint for allowed types and scopes. Note that all commit/PR titles should be in lowercase with the exception of proper nouns/named entities. All PR titles should include a scope with no exceptions. For example:
+Follow Conventional Commits. See `.github/workflows/pr_lint.yml` for allowed types and scopes. All titles must include a scope with no exceptions — even for the main `langchain` package.
+
+- Start the text after `type(scope):` with a lowercase letter, unless the first word is a proper noun (e.g. `Azure`, `GitHub`, `OpenAI`) or a named entity (class, function, method, parameter, or variable name).
+- Wrap named entities in backticks so they render as code. Proper nouns are left unadorned.
+- Keep titles short and descriptive — save detail for the body.
+
+Examples:

 ```txt
 feat(langchain): add new chat completion feature
 fix(core): resolve type hinting issue in vector store
 chore(anthropic): update infrastructure dependencies
+feat(langchain): `ls_agent_type` tag on `create_agent` calls
+fix(openai): infer Azure chat profiles from model name
 ```

-Note how `feat(langchain)` includes a scope even though it is the main package and name of the repo.
+#### PR descriptions

-#### Pull request guidelines
+The description *is* the summary — do not add a `# Summary` header.

-- Always add a disclaimer to the PR description mentioning how AI agents are involved with the contribution.
-- Describe the "why" of the changes, why the proposed solution is the right one. Limit prose.
-- Highlight areas of the proposed changes that require careful review.
+- When the PR closes an issue, lead with the closing keyword on its own line at the very top, followed by a horizontal rule and then the body:
+
+  ```txt
+  Closes #123
+
+  ---
+
+  <rest of description>
+  ```
+
+  Only `Closes`, `Fixes`, and `Resolves` auto-close the referenced issue on merge. `Related:` or similar labels are informational and do not close anything.
+
+- Explain the *why*: the motivation and why this solution is the right one. Limit prose.
+- Write for readers who may be unfamiliar with this area of the codebase. Avoid insider shorthand and prefer language that is friendly to public viewers — this aids interpretability.
+- Do **not** cite line numbers; they go stale as soon as the file changes.
+- Rarely include full file paths or filenames. Reference the affected symbol, class, or subsystem by name instead.
+- Wrap class, function, method, parameter, and variable names in backticks.
+- Skip dedicated "Test plan" or "Testing" sections in most cases. Mention tests only when coverage is non-obvious, risky, or otherwise notable.
+- Call out areas of the change that require careful review.
+- Add a brief disclaimer noting AI-agent involvement in the contribution.

 ## Core development principles

--- a/libs/core/langchain_core/tracers/langchain.py
+++ b/libs/core/langchain_core/tracers/langchain.py
@@ -36,6 +36,22 @@ logger = logging.getLogger(__name__)
 _LOGGED = set()
 _EXECUTOR: ThreadPoolExecutor | None = None

+OVERRIDABLE_LANGSMITH_INHERITABLE_METADATA_KEYS: frozenset[str] = frozenset(
+    {"ls_agent_type"}
+)
+"""Allowlist of LangSmith-only tracing metadata keys that bypass the default
+"first wins" merge semantics used when propagating tracer metadata to nested
+runs.
+
+Keys in this set are ALWAYS overridden by the nearest enclosing tracer config,
+so nested callers (e.g. a subagent) can replace a value inherited from an
+ancestor.
+
+Keep this list very small: every key here loses the default "first wins"
+protection and is always clobbered by the nearest enclosing tracer config.
+Only keys that are strictly for LangSmith tracing bookkeeping should be added.
+"""
+

 def log_error_once(method: str, exception: Exception) -> None:
    """Log an error once.
@@ -176,7 +192,16 @@ class LangChainTracer(BaseTracer):
        else:
            merged_metadata = dict(base_metadata)
            for key, value in metadata.items():
-                if key not in merged_metadata:
+                # For allowlisted LangSmith-only inheritable metadata keys
+                # (e.g. ``ls_agent_type``), nested callers are allowed to
+                # OVERRIDE the value inherited from an ancestor. For all
+                # other keys we keep the existing "first wins" behavior so
+                # that ancestor-provided tracing metadata is not accidentally
+                # clobbered by child runs.
+                if (
+                    key not in merged_metadata
+                    or key in OVERRIDABLE_LANGSMITH_INHERITABLE_METADATA_KEYS
+                ):
                    merged_metadata[key] = value

        merged_tags = sorted(set(self.tags + tags)) if tags else self.tags
@@ -448,7 +473,16 @@ def _patch_missing_metadata(self: LangChainTracer, run: Run) -> None:
    metadata = run.metadata
    patched = None
    for k, v in self.tracing_metadata.items():
-        if k not in metadata:
+        # ``OVERRIDABLE_LANGSMITH_INHERITABLE_METADATA_KEYS`` are a small,
+        # LangSmith-only allowlist that bypasses the "first wins" merge
+        # so a nested caller (e.g. a subagent) can override a parent-set value.
+        if k not in metadata or k in OVERRIDABLE_LANGSMITH_INHERITABLE_METADATA_KEYS:
+            # Skip the copy when the value already matches (avoids cloning
+            # the shared dict in the common "already set" case). Use a
+            # ``k in metadata`` guard so a legitimate missing key whose
+            # tracer value happens to be ``None`` is still patched in.
+            if k in metadata and metadata[k] == v:
+                continue
            if patched is None:
                # Copy on first miss to avoid mutating the shared dict.
                patched = {**metadata}
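
The merge rule introduced in the tracer hunks above ("first wins", except for allowlisted keys) can be illustrated in isolation. This is a minimal sketch, not the tracer's actual code path: `merge_metadata` and `OVERRIDABLE_KEYS` are hypothetical stand-ins for the logic around `OVERRIDABLE_LANGSMITH_INHERITABLE_METADATA_KEYS`.

```python
from typing import Any

# Stand-in for the allowlist defined in the diff.
OVERRIDABLE_KEYS: frozenset[str] = frozenset({"ls_agent_type"})


def merge_metadata(
    base: dict[str, Any],
    incoming: dict[str, Any],
    overridable: frozenset[str] = OVERRIDABLE_KEYS,
) -> dict[str, Any]:
    """Merge `incoming` into a copy of `base`.

    Existing keys in `base` are kept ("first wins") unless the key is
    allowlisted, in which case the incoming value replaces it.
    """
    merged = dict(base)  # copy so the caller's dict is not mutated
    for key, value in incoming.items():
        if key not in merged or key in overridable:
            merged[key] = value
    return merged


base = {"ls_agent_type": "root", "env": "staging"}
merged = merge_metadata(base, {"ls_agent_type": "subagent", "env": "prod", "extra": 1})
```

Here `env` keeps the ancestor's value, `ls_agent_type` takes the nested caller's value, and `extra` is added because it was missing.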
--- a/libs/core/tests/unit_tests/tracers/test_langchain.py
+++ b/libs/core/tests/unit_tests/tracers/test_langchain.py
@@ -807,6 +807,28 @@ class TestPatchMissingMetadata:
        assert run.metadata["env"] == "staging"
        assert run.metadata["extra"] == "from_tracer"

+    def test_allowlisted_key_overrides_existing_run_metadata(self) -> None:
+        """Allowlisted LangSmith keys override existing run metadata."""
+        tracer = self._make_tracer(metadata={"ls_agent_type": "subagent"})
+        run = self._make_run(metadata={"ls_agent_type": "root", "other": "keep"})
+
+        _patch_missing_metadata(tracer, run)
+
+        assert run.metadata["ls_agent_type"] == "subagent"
+        assert run.metadata["other"] == "keep"
+
+    def test_allowlisted_key_noop_when_values_match(self) -> None:
+        """Allowlisted keys do not clone run metadata when the value is unchanged."""
+        original = {"ls_agent_type": "root"}
+        tracer = self._make_tracer(metadata={"ls_agent_type": "root"})
+        run = self._make_run(metadata=original)
+
+        _patch_missing_metadata(tracer, run)
+
+        # No-op: the shared dict should not be replaced with a copy.
+        assert run.extra["metadata"] is original
+        assert run.metadata == {"ls_agent_type": "root"}
+

 class TestTracerMetadataCloning:
    """Tests for LangChainTracer metadata cloning helpers."""
@@ -901,3 +923,29 @@ class TestTracerMetadataCloning:
            if copied.tracing_metadata is not None
        }
        assert copied_services == {"api", "worker"}
+
+    def test_copy_with_metadata_defaults_regular_keys_first_wins(self) -> None:
+        """Regular (non-allowlisted) metadata keys keep "first wins" semantics."""
+        tracer = self._make_tracer(metadata={"env": "staging", "service": "orig"})
+
+        copied = tracer.copy_with_metadata_defaults(
+            metadata={"env": "prod", "service": "new"},
+        )
+
+        assert copied.tracing_metadata == {"env": "staging", "service": "orig"}
+
+    def test_copy_with_metadata_defaults_allowlisted_key_overrides(self) -> None:
+        """Allowlisted LangSmith keys are overridden by nested caller metadata."""
+        tracer = self._make_tracer(
+            metadata={"ls_agent_type": "root", "env": "staging"},
+        )
+
+        copied = tracer.copy_with_metadata_defaults(
+            metadata={"ls_agent_type": "subagent", "env": "prod"},
+        )
+
+        # Allowlisted key is overridden, non-allowlisted keeps first-wins.
+        assert copied.tracing_metadata == {
+            "ls_agent_type": "subagent",
+            "env": "staging",
+        }
--- a/libs/langchain_v1/langchain/agents/middleware/model_call_limit.py
+++ b/libs/langchain_v1/langchain/agents/middleware/model_call_limit.py
@@ -108,7 +108,7 @@ class ModelCallLimitMiddleware(

    Example:
        ```python
-        from langchain.agents.middleware.call_tracking import ModelCallLimitMiddleware
+        from langchain.agents.middleware import ModelCallLimitMiddleware
        from langchain.agents import create_agent

        # Create middleware with limits
--- a/libs/partners/huggingface/langchain_huggingface/data/_profiles.py
+++ b/libs/partners/huggingface/langchain_huggingface/data/_profiles.py
@@ -377,6 +377,26 @@ _PROFILES: dict[str, dict[str, Any]] = {
        "attachment": True,
        "temperature": True,
    },
+    "moonshotai/Kimi-K2.6": {
+        "name": "Kimi-K2.6",
+        "release_date": "2026-04-20",
+        "last_updated": "2026-04-20",
+        "open_weights": True,
+        "max_input_tokens": 262144,
+        "max_output_tokens": 262144,
+        "text_inputs": True,
+        "image_inputs": True,
+        "audio_inputs": False,
+        "video_inputs": True,
+        "text_outputs": True,
+        "image_outputs": False,
+        "audio_outputs": False,
+        "video_outputs": False,
+        "reasoning_output": True,
+        "tool_calling": True,
+        "attachment": True,
+        "temperature": True,
+    },
    "zai-org/GLM-4.7": {
        "name": "GLM-4.7",
        "release_date": "2025-12-22",
--- a/libs/partners/openai/langchain_openai/chat_models/azure.py
+++ b/libs/partners/openai/langchain_openai/chat_models/azure.py
@@ -705,6 +705,10 @@ class AzureChatOpenAI(BaseChatOpenAI):
        return self

    def _resolve_model_profile(self) -> ModelProfile | None:
+        if (self.model_name is not None) and (
+            profile := _get_default_model_profile(self.model_name) or None
+        ):
+            return profile
        if self.deployment_name is not None:
            return _get_default_model_profile(self.deployment_name) or None
        return None
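
The lookup order in `_resolve_model_profile` above (model name first, then deployment name) can be sketched without the Azure client. This is an assumption-laden illustration: `resolve_profile` and the `_KNOWN_PROFILES` table are hypothetical stand-ins for the method and for `_get_default_model_profile`.

```python
from typing import Any, Optional

# Hypothetical profile table; the real data ships with the package.
_KNOWN_PROFILES: dict[str, dict[str, Any]] = {
    "gpt-4o": {"name": "GPT-4o"},
    "gpt-4": {"name": "GPT-4"},
}


def resolve_profile(
    model_name: Optional[str], deployment_name: Optional[str]
) -> Optional[dict[str, Any]]:
    """Return the first known profile, preferring the model name."""
    for candidate in (model_name, deployment_name):
        if candidate is None:
            continue
        profile = _KNOWN_PROFILES.get(candidate)
        if profile:  # an empty or missing profile falls through to the next name
            return profile
    return None


preferred = resolve_profile("gpt-4o", "gpt-4")
fallback = resolve_profile("unknown-model", "gpt-4o")
missing = resolve_profile(None, "unknown-deployment")
```

The three calls mirror the precedence, fallback, and unknown-deployment cases exercised by the new tests in `test_azure.py`.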
--- a/libs/partners/openai/langchain_openai/chat_models/base.py
+++ b/libs/partners/openai/langchain_openai/chat_models/base.py
@@ -123,6 +123,7 @@ from pydantic import (
    ConfigDict,
    Field,
    SecretStr,
+    ValidationError,
    model_validator,
 )
 from pydantic.v1 import BaseModel as BaseModelV1
@@ -4589,6 +4590,36 @@ def _construct_lc_result_from_responses_api(
    return ChatResult(generations=[ChatGeneration(message=message)])


+def _coerce_chunk_response(resp: Any) -> Any:
+    # dict `response` items on stream events have been observed in the wild
+    if isinstance(resp, dict):
+        from openai.types.responses import Response
+
+        # Known mismatch: API emits `prompt_cache_retention="in_memory"` while
+        # older `openai` packages declare only `"in-memory"` in the Literal
+        # (openai-python#2883). Pre-normalize so validation succeeds on
+        # currently-released SDK versions.
+        if resp.get("prompt_cache_retention") == "in_memory":
+            resp = {**resp, "prompt_cache_retention": "in-memory"}
+
+        try:
+            return Response.model_validate(resp)
+        except ValidationError as e:
+            # API sometimes drifts ahead of the installed SDK's Literal
+            # declarations. Fall back to a non-validating construct so streams
+            # still complete, and surface the drift so operators can upgrade.
+            logger.warning(
+                "OpenAI Responses payload failed SDK validation "
+                "(response id=%s); falling back to non-validating construct. "
+                "This usually means the OpenAI API has drifted ahead of the "
+                "installed `openai` package. Details: %s",
+                resp.get("id"),
+                e,
+            )
+            return Response.model_construct(**resp)
+    return resp
+
+
 def _convert_responses_chunk_to_generation_chunk(
    chunk: Any,
    current_index: int,  # index in content
@@ -4686,14 +4717,16 @@ def _convert_responses_chunk_to_generation_chunk(
            }
        )
    elif chunk.type == "response.created":
-        id = chunk.response.id
-        response_metadata["id"] = chunk.response.id  # Backwards compatibility
+        response = _coerce_chunk_response(chunk.response)
+        id = response.id
+        response_metadata["id"] = response.id  # Backwards compatibility
    elif chunk.type in ("response.completed", "response.incomplete"):
+        response = _coerce_chunk_response(chunk.response)
        msg = cast(
            AIMessage,
            (
                _construct_lc_result_from_responses_api(
-                    chunk.response, schema=schema, output_version=output_version
+                    response, schema=schema, output_version=output_version
                )
                .generations[0]
                .message
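
The pattern behind `_coerce_chunk_response` above is "validate strictly, and on failure log a warning and build the object without validation". The sketch below is a standard-library analogue, not pydantic's API: `ToyResponse`, its `validated` and `unvalidated` constructors, and the `_KNOWN_STATUSES` set are hypothetical and stand in for `Response.model_validate` and `Response.model_construct`.

```python
import logging
from dataclasses import dataclass
from typing import Any, Optional

logger = logging.getLogger(__name__)

# Toy stand-in for a Literal declared by an installed SDK version.
_KNOWN_STATUSES = frozenset({"in_progress", "completed"})


@dataclass
class ToyResponse:
    id: Optional[str]
    status: Optional[str]

    @classmethod
    def validated(cls, payload: dict[str, Any]) -> "ToyResponse":
        # Strict path: reject values the "SDK" does not know about.
        if payload.get("status") not in _KNOWN_STATUSES:
            raise ValueError(f"unknown status: {payload.get('status')!r}")
        return cls(id=payload["id"], status=payload["status"])

    @classmethod
    def unvalidated(cls, payload: dict[str, Any]) -> "ToyResponse":
        # Lenient path: copy fields through without checking them.
        return cls(id=payload.get("id"), status=payload.get("status"))


def coerce(resp: Any) -> Any:
    """Turn dict payloads into ToyResponse; pass other objects through."""
    if not isinstance(resp, dict):
        return resp
    try:
        return ToyResponse.validated(resp)
    except ValueError as exc:
        logger.warning(
            "payload failed validation (id=%s); using lenient construct: %s",
            resp.get("id"),
            exc,
        )
        return ToyResponse.unvalidated(resp)


drifted = coerce({"id": "resp_123", "status": "something_new"})
```

The trade-off is that downstream code may see a value outside the declared set, which is why the fallback logs a warning rather than failing silently.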
--- a/libs/partners/openai/pyproject.toml
+++ b/libs/partners/openai/pyproject.toml
@@ -20,10 +20,10 @@ classifiers = [
    "Topic :: Scientific/Engineering :: Artificial Intelligence",
 ]

-version = "1.1.14"
+version = "1.1.16"
 requires-python = ">=3.10.0,<4.0.0"
 dependencies = [
-    "langchain-core>=1.2.31,<2.0.0",
+    "langchain-core>=1.3.0,<2.0.0",
    "openai>=2.26.0,<3.0.0",
    "tiktoken>=0.7.0,<1.0.0",
 ]
--- a/libs/partners/openai/tests/unit_tests/chat_models/test_azure.py
+++ b/libs/partners/openai/tests/unit_tests/chat_models/test_azure.py
@@ -16,6 +16,7 @@ def test_initialize_azure_openai() -> None:
        azure_deployment="35-turbo-dev",
        openai_api_version="2023-05-15",
        azure_endpoint="my-base-url",
+        api_key=SecretStr("test"),
    )
    assert llm.deployment_name == "35-turbo-dev"
    assert llm.openai_api_version == "2023-05-15"
@@ -45,6 +46,92 @@ def test_initialize_more() -> None:
    assert ls_params.get("ls_model_name") == "gpt-35-turbo-0125"


+def test_profile_resolves_from_model_name() -> None:
+    llm = AzureChatOpenAI(
+        model="gpt-4o",
+        azure_endpoint="my-base-url",
+        api_key=SecretStr("test"),
+        api_version="2023-05-15",
+    )
+
+    assert llm.profile
+    assert llm.profile["name"] == "GPT-4o"
+    assert llm.profile["max_input_tokens"] == 128_000
+
+
+def test_profile_resolves_from_model_name_with_custom_deployment_alias() -> None:
+    llm = AzureChatOpenAI(
+        model="gpt-4o",
+        azure_deployment="35-turbo-dev",
+        azure_endpoint="my-base-url",
+        api_key=SecretStr("test"),
+        api_version="2023-05-15",
+    )
+
+    assert llm.profile
+    assert llm.profile["name"] == "GPT-4o"
+
+
+def test_profile_prefers_model_name_over_known_deployment_name() -> None:
+    llm = AzureChatOpenAI(
+        model="gpt-4o",
+        azure_deployment="gpt-4",
+        azure_endpoint="my-base-url",
+        api_key=SecretStr("test"),
+        api_version="2023-05-15",
+    )
+
+    assert llm.profile
+    assert llm.profile["name"] == "GPT-4o"
+
+
+def test_profile_falls_back_to_deployment_name_with_unknown_model() -> None:
+    llm = AzureChatOpenAI(
+        model="unknown-model",
+        azure_deployment="gpt-4o",
+        azure_endpoint="my-base-url",
+        api_key=SecretStr("test"),
+        api_version="2023-05-15",
+    )
+
+    assert llm.profile
+
+
+def test_profile_resolves_from_deployment_name_without_model() -> None:
+    llm = AzureChatOpenAI(
+        azure_deployment="gpt-4o",
+        azure_endpoint="my-base-url",
+        api_key=SecretStr("test"),
+        api_version="2023-05-15",
+    )
+
+    assert llm.profile
+    assert llm.profile["name"] == "GPT-4o"
+
+
+def test_profile_respects_explicit_profile() -> None:
+    llm = AzureChatOpenAI(
+        model="gpt-4o",
+        azure_endpoint="my-base-url",
+        api_key=SecretStr("test"),
+        api_version="2023-05-15",
+        profile={"tool_calling": False},
+    )
+
+    assert llm.profile == {"tool_calling": False}
+
+
+def test_profile_is_none_for_unknown_deployment_without_model() -> None:
+    llm = AzureChatOpenAI(
+        azure_deployment="unknown-deployment",
+        azure_endpoint="my-base-url",
+        api_key=SecretStr("test"),
+        api_version="2023-05-15",
+    )
+
+    assert llm.profile is None
+
+
 def test_initialize_azure_openai_with_openai_api_base_set() -> None:
    with mock.patch.dict(os.environ, {"OPENAI_API_BASE": "https://api.openai.com"}):
        llm = AzureChatOpenAI(  # type: ignore[call-arg, call-arg]
@@ -79,6 +166,7 @@ def test_structured_output_old_model() -> None:
            azure_deployment="35-turbo-dev",
            openai_api_version="2023-05-15",
            azure_endpoint="my-base-url",
+            api_key=SecretStr("test"),
        ).with_structured_output(Output)

    # assert tool calling was used instead of json_schema
@@ -91,6 +179,7 @@ def test_max_completion_tokens_in_payload() -> None:
        azure_deployment="o1-mini",
        api_version="2024-12-01-preview",
        azure_endpoint="my-base-url",
+        api_key=SecretStr("test"),
        model_kwargs={"max_completion_tokens": 300},
    )
    messages = [HumanMessage("Hello")]
@@ -148,6 +237,7 @@ def test_max_completion_tokens_parameter() -> None:
        azure_deployment="gpt-5",
        api_version="2024-12-01-preview",
        azure_endpoint="my-base-url",
+        api_key=SecretStr("test"),
        max_completion_tokens=1500,
    )
    messages = [HumanMessage("Hello")]
@@ -165,6 +255,7 @@ def test_max_tokens_converted_to_max_completion_tokens() -> None:
        azure_deployment="gpt-5",
        api_version="2024-12-01-preview",
        azure_endpoint="my-base-url",
+        api_key=SecretStr("test"),
        max_tokens=1000,  # type: ignore[call-arg]
    )
    messages = [HumanMessage("Hello")]
--- a/libs/partners/openai/tests/unit_tests/chat_models/test_responses_stream.py
+++ b/libs/partners/openai/tests/unit_tests/chat_models/test_responses_stream.py
@@ -1,5 +1,6 @@
 from __future__ import annotations

+import copy
 from typing import Any
 from unittest.mock import MagicMock, patch

@@ -46,6 +47,8 @@ from openai.types.shared.response_format_text import ResponseFormatText
 from langchain_openai import ChatOpenAI
 from tests.unit_tests.chat_models.test_base import MockSyncContextManager

+MODEL = "gpt-5.4"
+
 responses_stream = [
    ResponseCreatedEvent(
        response=Response(
@@ -55,7 +58,7 @@ responses_stream = [
            incomplete_details=None,
            instructions=None,
            metadata={},
-            model="o4-mini-2025-04-16",
+            model=MODEL,
            object="response",
            output=[],
            parallel_tool_calls=True,
@@ -87,7 +90,7 @@ responses_stream = [
            incomplete_details=None,
            instructions=None,
            metadata={},
-            model="o4-mini-2025-04-16",
+            model=MODEL,
            object="response",
            output=[],
            parallel_tool_calls=True,
@@ -533,7 +536,7 @@ responses_stream = [
            incomplete_details=None,
            instructions=None,
            metadata={},
-            model="o4-mini-2025-04-16",
+            model=MODEL,
            object="response",
            output=[
                ResponseReasoningItem(
@@ -721,9 +724,7 @@ def _strip_none(obj: Any) -> Any:
    ],
 )
 def test_responses_stream(output_version: str, expected_content: list[dict]) -> None:
-    llm = ChatOpenAI(
-        model="o4-mini", use_responses_api=True, output_version=output_version
-    )
+    llm = ChatOpenAI(model=MODEL, use_responses_api=True, output_version=output_version)
    mock_client = MagicMock()

    def mock_create(*args: Any, **kwargs: Any) -> MockSyncContextManager:
@@ -772,7 +773,7 @@ def test_responses_stream_with_image_generation_multiple_calls() -> None:
        {"type": "function", "name": "my_tool", "parameters": {}},
    ]
    llm = ChatOpenAI(
-        model="gpt-4o",
+        model=MODEL,
        use_responses_api=True,
        streaming=True,
    )
@@ -807,7 +808,7 @@ def test_responses_stream_function_call_preserves_namespace() -> None:
                incomplete_details=None,
                instructions=None,
                metadata={},
-                model="gpt-4o-2025-01-01",
+                model=MODEL,
                object="response",
                output=[],
                parallel_tool_calls=True,
@@ -837,7 +838,7 @@ def test_responses_stream_function_call_preserves_namespace() -> None:
                incomplete_details=None,
                instructions=None,
                metadata={},
-                model="gpt-4o-2025-01-01",
+                model=MODEL,
                object="response",
                output=[],
                parallel_tool_calls=True,
@@ -917,7 +918,7 @@ def test_responses_stream_function_call_preserves_namespace() -> None:
                incomplete_details=None,
                instructions=None,
                metadata={},
-                model="gpt-4o-2025-01-01",
+                model=MODEL,
                object="response",
                output=[
                    ResponseFunctionToolCallItem(
@@ -957,9 +958,7 @@ def test_responses_stream_function_call_preserves_namespace() -> None:
        ),
    ]

-    llm = ChatOpenAI(
-        model="gpt-4o", use_responses_api=True, output_version="responses/v1"
-    )
+    llm = ChatOpenAI(model=MODEL, use_responses_api=True, output_version="responses/v1")
    mock_client = MagicMock()

    def mock_create(*args: Any, **kwargs: Any) -> MockSyncContextManager:
@@ -986,3 +985,106 @@ def test_responses_stream_function_call_preserves_namespace() -> None:
    assert first_block.get("namespace") == "my_namespace", (
        f"Expected namespace 'my_namespace', got {first_block.get('namespace')}"
    )
+
+
+def test_responses_stream_tolerates_dict_response_field() -> None:
+    """Regression test for `AttributeError: 'dict' object has no attribute 'id'`.
+
+    The OpenAI SDK types `<event>.response` strictly as `Response`, but raw dicts
+    have been observed in the wild.
+    """
+    stream = copy.deepcopy(responses_stream)
+    first_event = stream[0]
+    assert isinstance(first_event, ResponseCreatedEvent)
+    first_event.response = first_event.response.model_dump(mode="json")  # type: ignore[assignment]
+    assert isinstance(first_event.response, dict)
+
+    llm = ChatOpenAI(model=MODEL, use_responses_api=True)
+    mock_client = MagicMock()
+
+    def mock_create(*args: Any, **kwargs: Any) -> MockSyncContextManager:
+        return MockSyncContextManager(stream)
+
+    mock_client.responses.create = mock_create
+
+    full: BaseMessageChunk | None = None
+    with patch.object(llm, "root_client", mock_client):
+        for chunk in llm.stream("test"):
+            assert isinstance(chunk, AIMessageChunk)
+            full = chunk if full is None else full + chunk
+    assert isinstance(full, AIMessageChunk)
+    assert full.id == "resp_123"
+
+
+@pytest.mark.parametrize(
+    ("event_index", "event_type"),
+    [(0, ResponseCreatedEvent), (46, ResponseCompletedEvent)],
+)
+def test_responses_stream_normalizes_in_memory_prompt_cache_retention(
+    event_index: int, event_type: type
+) -> None:
+    """`prompt_cache_retention="in_memory"` from the API must not abort streams.
+
+    The API emits the underscore form while older `openai` packages declare only
+    `"in-memory"` in the Literal (openai-python#2883). `_coerce_chunk_response`
+    should normalize so both the `response.created` and `response.completed`
+    handlers can validate successfully.
+    """
+    stream = copy.deepcopy(responses_stream)
+    target = stream[event_index]
+    assert isinstance(target, event_type)
+    assert isinstance(target, (ResponseCreatedEvent, ResponseCompletedEvent))
+    dumped = target.response.model_dump(mode="json")
+    dumped["prompt_cache_retention"] = "in_memory"
+    target.response = dumped  # type: ignore[assignment]
+
+    llm = ChatOpenAI(model=MODEL, use_responses_api=True)
+    mock_client = MagicMock()
+
+    def mock_create(*args: Any, **kwargs: Any) -> MockSyncContextManager:
+        return MockSyncContextManager(stream)
+
+    mock_client.responses.create = mock_create
+
+    full: BaseMessageChunk | None = None
+    with patch.object(llm, "root_client", mock_client):
+        for chunk in llm.stream("test"):
+            assert isinstance(chunk, AIMessageChunk)
+            full = chunk if full is None else full + chunk
+    assert isinstance(full, AIMessageChunk)
+    assert full.id == "resp_123"
+    # The completed event drives usage/metadata aggregation, so assert it
+    # survived coercion when that branch is exercised.
+    if event_type is ResponseCompletedEvent:
+        assert full.usage_metadata is not None
+
+
+def test_responses_stream_tolerates_unknown_literal_drift() -> None:
+    """API drift ahead of SDK Literal declarations must not abort streams.
+
+    When the API returns a value the installed SDK's Literal does not know
+    about, `_coerce_chunk_response` should fall back to a non-validating
+    construct so streaming still completes.
+    """
+    stream = copy.deepcopy(responses_stream)
+    first_event = stream[0]
+    assert isinstance(first_event, ResponseCreatedEvent)
+    dumped = first_event.response.model_dump(mode="json")
+    dumped["status"] = "something_new"
+    first_event.response = dumped  # type: ignore[assignment]
+
+    llm = ChatOpenAI(model=MODEL, use_responses_api=True)
+    mock_client = MagicMock()
+
+    def mock_create(*args: Any, **kwargs: Any) -> MockSyncContextManager:
+        return MockSyncContextManager(stream)
+
+    mock_client.responses.create = mock_create
+
+    full: BaseMessageChunk | None = None
+    with patch.object(llm, "root_client", mock_client):
+        for chunk in llm.stream("test"):
+            assert isinstance(chunk, AIMessageChunk)
+            full = chunk if full is None else full + chunk
+    assert isinstance(full, AIMessageChunk)
+    assert full.id == "resp_123"
--- a/libs/partners/openai/uv.lock
+++ b/libs/partners/openai/uv.lock
@@ -624,7 +624,7 @@ typing = [

 [[package]]
 name = "langchain-core"
-version = "1.3.0a2"
+version = "1.3.0"
 source = { editable = "../../core" }
 dependencies = [
    { name = "jsonpatch" },
@@ -684,7 +684,7 @@ typing = [

 [[package]]
 name = "langchain-openai"
-version = "1.1.14"
+version = "1.1.16"
 source = { editable = "." }
 dependencies = [
    { name = "langchain-core" },
@@ -1120,7 +1120,7 @@ wheels = [

 [[package]]
 name = "openai"
-version = "2.29.0"
+version = "2.32.0"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "anyio" },
@@ -1132,9 +1132,9 @@ dependencies = [
    { name = "tqdm" },
    { name = "typing-extensions" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/b4/15/203d537e58986b5673e7f232453a2a2f110f22757b15921cbdeea392e520/openai-2.29.0.tar.gz", hash = "sha256:32d09eb2f661b38d3edd7d7e1a2943d1633f572596febe64c0cd370c86d52bec", size = 671128, upload-time = "2026-03-17T17:53:49.599Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/ed/59/bdcc6b759b8c42dd73afaf5bf8f902c04b37987a5514dbc1c64dba390fef/openai-2.32.0.tar.gz", hash = "sha256:c54b27a9e4cb8d51f0dd94972ffd1a04437efeb259a9e60d8922b8bd26fe55e0", size = 693286, upload-time = "2026-04-15T22:28:19.434Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/d0/b1/35b6f9c8cf9318e3dbb7146cc82dab4cf61182a8d5406fc9b50864362895/openai-2.29.0-py3-none-any.whl", hash = "sha256:b7c5de513c3286d17c5e29b92c4c98ceaf0d775244ac8159aeb1bddf840eb42a", size = 1141533, upload-time = "2026-03-17T17:53:47.348Z" },
+    { url = "https://files.pythonhosted.org/packages/1e/c1/d6e64ccd0536bf616556f0cad2b6d94a8125f508d25cfd814b1d2db4e2f1/openai-2.32.0-py3-none-any.whl", hash = "sha256:4dcc9badeb4bf54ad0d187453742f290226d30150890b7890711bda4f32f192f", size = 1162570, upload-time = "2026-04-15T22:28:17.714Z" },
 ]

 [[package]]
--- a/libs/partners/openrouter/langchain_openrouter/data/_profiles.py
+++ b/libs/partners/openrouter/langchain_openrouter/data/_profiles.py
@@ -1753,6 +1753,27 @@ _PROFILES: dict[str, dict[str, Any]] = {
        "attachment": True,
        "temperature": True,
    },
+    "moonshotai/kimi-k2.6": {
+        "name": "Kimi K2.6",
+        "release_date": "2026-04-20",
+        "last_updated": "2026-04-20",
+        "open_weights": True,
+        "max_input_tokens": 262144,
+        "max_output_tokens": 262144,
+        "text_inputs": True,
+        "image_inputs": True,
+        "audio_inputs": False,
+        "video_inputs": False,
+        "text_outputs": True,
+        "image_outputs": False,
+        "audio_outputs": False,
+        "video_outputs": False,
+        "reasoning_output": True,
+        "tool_calling": True,
+        "structured_output": True,
+        "attachment": True,
+        "temperature": True,
+    },
    "nousresearch/hermes-3-llama-3.1-405b:free": {
        "name": "Hermes 3 405B Instruct (free)",
        "release_date": "2024-08-16",
Author	SHA1	Message	Date
ccurme	9f6af21ce4	release(openai): 1.1.16 (#36927)	2026-04-21 15:04:23 -04:00
Mason Daugherty	488c6a73bb	fix(openai): tolerate `prompt_cache_retention` drift in streaming (#36925)	2026-04-21 14:54:32 -04:00
Mason Daugherty	acc54987fa	chore: update PR template (#36918)	2026-04-21 12:08:20 -04:00
Mason Daugherty	f5f715985a	chore: rework PR title and description guidance (#36917) Rework the PR and commit guidance in the agent guidelines so new contributors (human and AI) produce descriptions and titles that age well.	2026-04-21 12:03:54 -04:00
langchain-model-profile-bot[bot]	46df8365f2	chore(model-profiles): refresh model profile data (#36911) Automated refresh of model profile data for all in-monorepo partner integrations via `langchain-profiles refresh`. 🤖 Generated by the `refresh_model_profiles` workflow. Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>	2026-04-21 10:08:58 -04:00
Kanav Bansal	fb6ab993a7	docs(langchain): correct import path in `ModelCallLimitMiddleware` docstring (#36895) ## Summary Updates the example in `ModelCallLimitMiddleware` docstring to use the correct import path. The previous import referenced a non-existent module, which could cause confusion for users following the documentation.	2026-04-21 01:43:27 -04:00
Jacob Lee	40026a7282	feat(core): Update inheritance behavior for tracer metadata for special keys (#36900) JS equivalent: https://github.com/langchain-ai/langchainjs/pull/10733	2026-04-20 14:58:01 -07:00
ccurme	37f0b37f1c	release(openai): 1.1.15 (#36901)	2026-04-20 15:47:20 -04:00
ccurme	19b0805bc1	fix(openai): accommodate dict `response` items in streaming (#36899)	2026-04-20 15:44:01 -04:00
Thomas	8fec4e7cee	fix(openai): infer azure chat profiles from model name (#36858)	2026-04-19 11:06:26 -04:00