Two unrelated nightly-CI failures rooted in upstream API drift. OpenAI
retired `gpt-4o-audio-preview` (now 404) and Azure embedding deployments
running `text-embedding-3-*` with truncated `dimensions` no longer
return unit-norm vectors.
Same shape as the merged anthropic patch in #37064, ported to
`libs/partners/openai`.
`_SyncHttpxClientWrapper.__del__` / `_AsyncHttpxClientWrapper.__del__`
check `self.is_closed`, which reads `self._state`. When a wrapper is
created without `__init__` running to completion — `copy.deepcopy` via
`__new__` + `__setstate__`, or a constructor that raised partway through
— `_state` is missing and the finalizer prints
```
Exception ignored in: <function _SyncHttpxClientWrapper.__del__ at 0x...>
Traceback (most recent call last):
File ".../langchain_openai/chat_models/_client_utils.py", line 366, in __del__
if self.is_closed:
File ".../httpx/_client.py", line 228, in is_closed
return self._state == ClientState.CLOSED
AttributeError: '_SyncHttpxClientWrapper' object has no attribute '_state'
```
at GC time. Same noise pattern that #37064 fixed for the anthropic
partner.
Hoist the `is_closed` access inside the existing `try/except` so the
`AttributeError` is swallowed alongside the `close()` / `aclose()`
exceptions that block already handles.
Tests: two new unit tests build the wrappers via `__new__` (no
`__init__` → no `_state`) and call `__del__` directly, mirroring the
tests added in #37064.
Verified:
- `cd libs/partners/openai && make format` -> all checks passed
- `cd libs/partners/openai && make test
TEST_FILE=tests/unit_tests/chat_models/test_client_utils.py` -> 37
passed, 1 skipped (linux-only)
- `cd libs/partners/openai && make lint` -> all checks passed, mypy
clean
> [!IMPORTANT]
> **Behavior change on upgrade — minor bump (`1.1.16` → `1.2.0`).**
>
> Streaming calls now raise `StreamChunkTimeoutError` (a `TimeoutError`
subclass — existing `except TimeoutError:` / `except
asyncio.TimeoutError:` handlers catch it) after 120s of content silence
instead of hanging forever. Opt out with `stream_chunk_timeout=None` or
`LANGCHAIN_OPENAI_STREAM_CHUNK_TIMEOUT_S=0`.
>
> Kernel-level TCP keepalive / `TCP_USER_TIMEOUT` are applied via a
custom `httpx` transport. `httpx` disables its env-proxy auto-detection
(`HTTP_PROXY` / `HTTPS_PROXY` / `ALL_PROXY` / `NO_PROXY` and
macOS/Windows system proxy) whenever a transport is supplied, so to
avoid silently breaking enterprise proxy users, `ChatOpenAI` now detects
the "proxy-env-shadow" shape at construction and **skips the custom
transport entirely** when **all** of these hold:
>
> - `http_socket_options` left at default (`None`)
> - No `http_client` or `http_async_client` supplied
> - No `openai_proxy` supplied
> - A proxy env var / system proxy is visible to httpx
>
> On that shape the instance falls back to pre-PR behavior and env-proxy
auto-detection still applies. A one-time `INFO` records the bypass.
>
> Users who explicitly set `http_socket_options=[...]` alongside an env
proxy still get the shadowed behavior with a one-time `WARNING` log —
they opted in. Full opt-outs below.
---
Streaming chat completions can hang forever when the underlying TCP
connection silently dies mid-stream (idle NAT/LB timeouts, sandboxed
runtimes killing long-lived connections, peer gone without a FIN or
RST). httpx's read timeout doesn't help here because it's reset by any
bytes arriving on the socket, including OpenAI's SSE keepalive comments,
so a stream that's quiet on content but still producing keepalives looks
alive forever.
This PR adds two knobs to `ChatOpenAI`, both on by default with
opt-outs:
- `stream_chunk_timeout` (default 120s): wraps the async streaming
iterator in `asyncio.wait_for` per chunk. Measures the gap between
*parsed* SSE chunks, so keepalives don't reset it. Fires on genuine
content silence and raises `StreamChunkTimeoutError` — a `TimeoutError`
subclass carrying `timeout_s`, `model_name`, and `chunks_received` as
structured attributes (mirrored in the WARNING log's `extra=`) for
alerting without message-regex. Override with the kwarg or
`LANGCHAIN_OPENAI_STREAM_CHUNK_TIMEOUT_S`.
- `http_socket_options`: applies `SO_KEEPALIVE` + `TCP_KEEPIDLE` /
`TCP_KEEPINTVL` / `TCP_KEEPCNT` + `TCP_USER_TIMEOUT` on Linux (macOS
equivalents where available). On platforms missing some options, they're
dropped silently and the remaining set still does useful work.
Pool limits are set explicitly on the custom transport to mirror the
`openai` SDK — without that, passing `transport=` to `httpx.AsyncClient`
silently shrinks the connection pool.
## Behavior change
The default-shape proxy-env bypass (above) covers the common enterprise
case. Beyond that:
- Connections that would previously have hung forever will now error out
via `StreamChunkTimeoutError`.
- Users who explicitly opt into `http_socket_options` while also relying
on env proxies will see a one-time `WARNING` and lose env-proxy
auto-detection — the custom transport shadows it. This is the original
shipped behavior, retained for anyone who *wants* socket tuning on top
of an env-proxied setup.
Full opt-outs:
- `stream_chunk_timeout=None` or
`LANGCHAIN_OPENAI_STREAM_CHUNK_TIMEOUT_S=0`
- `http_socket_options=()` or `LANGCHAIN_OPENAI_TCP_KEEPALIVE=0`
- Supply your own `http_client` **and** `http_async_client`.
`http_socket_options` is applied per side: passing only one still leaves
the other side's default builder getting socket options. Supply both (or
combine with `http_socket_options=()`) to take full control.
Unparseable or negative values for the `LANGCHAIN_OPENAI_*` env vars
fall back to the default with a `WARNING` log rather than silently being
accepted, so a misconfigured environment still boots but the fallback is
discoverable.
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
CVE-2025-71176 (medium severity)
All are dev-only (test dependency group) — no impact on published
packages.
### Why syrupy was also bumped
syrupy 4.x (`<5.0.0`) constrains pytest to `<9.0.0`, blocking the CVE
fix. Widening to `<6.0.0` allows syrupy 5.x which supports pytest 9.x.
Fix broken VCR cassette playback in `langchain-openai` integration tests
and add a CI job to prevent regressions. Two independent bugs made all
VCR-backed tests fail: `before_record_request` redacts URIs to
`**REDACTED**` but `match_on` still included `uri` (so playback never
matched), and a typo-fix commit (`c9f51aef85`) changed test input
strings without re-recording cassettes (so `json_body` matching also
failed).
Python's `or` operator treats `0` as falsy, so
`token_usage.get("total_tokens") or fallback` silently replaces a
provider-reported `total_tokens=0` with the computed sum of input +
output tokens. Providers can legitimately report zero tokens (e.g.,
cached responses, empty completions).
The same pattern exists in the dual-key lookups for
`input_tokens`/`output_tokens` in Groq and OpenRouter. While current
APIs don't return both key formats simultaneously (making the `or`-chain
functionally correct today), the semantics are still wrong; `0` should
not fall through to a fallback.
## Changes
- Replace `x.get(key) or fallback` with explicit `is not None` checks in
`_create_usage_metadata` across `langchain-openai`, `langchain-groq`,
and `langchain-openrouter` for `input_tokens`, `output_tokens`, and
`total_tokens`
- Fix a concrete bug in the `total_tokens` path: a provider-reported `0`
was silently replaced by the computed sum
- Harden dual-key lookups in Groq and OpenRouter to correctly preserve
zero values from the preferred key, should both key formats ever coexist
- Update OpenAI's single-key extraction for consistency — the old `or 0`
pattern happened to produce correct results (`0 or 0 == 0`) but was
semantically wrong
During an automated code review of .github/scripts/get_min_versions.py,
the following issue was identified. Set a timeout on get min versions
HTTP calls. Network calls without a timeout can hang a worker
indefinitely. I kept the patch small and re-ran syntax checks after
applying it.
Issues with combining flex and nano
```shell
FAILED tests/integration_tests/chat_models/test_base.py::test_openai_invoke - openai.InternalServerError: Error code: 500 - {'error': {'message': 'The server had an error while processing your request. Sorry about that!', 'type': 'server_error', 'param': None, 'code': None}}
FAILED tests/integration_tests/chat_models/test_base.py::test_stream - openai.InternalServerError: Error code: 500 - {'error': {'message': 'The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID req_e726769d95994fd4bccbe55680a35f59 in your email.)', 'type': 'server_error', 'param': None, 'code': None}}
FAILED tests/integration_tests/chat_models/test_base.py::test_flex_usage_responses[False] - openai.InternalServerError: Error code: 500 - {'error': {'message': 'An error occurred while processing your request. You can retry your request, or contact us through our help center at help.openai.com if the error persists. Please include the request ID req_935316418319494d8682e4adcd67ab47 in your message.', 'type': 'server_error', 'param': None, 'code': 'server_error'}}
FAILED tests/integration_tests/chat_models/test_base.py::test_flex_usage_responses[True] - openai.APIError: An error occurred while processing your request. You can retry your request, or contact us through our help center at help.openai.com if the error persists. Please include the request ID req_f3c164d0d1f045a5a0f5965ab5c253bf in your message.
```
Use of the fixture `_base_vcr_config` is deprecated with alternative
function `base_vcr_config()`
This way:
* we don't need to import `_base_vcr_config` seen as unused (which leads
to ruff violations PLC0414 and F811)
* we don't need to make a copy since a new dict is created at each
function invocation
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Fixed a bug where GPT-5 temperature validation was case-sensitive,
causing issues when users
specified Azure deployment names or model names in uppercase (e.g.,
`"GPT-5-2025-01-01"`, `"GPT-5-NANO"`). The validation now correctly
handles model names regardless of case.
Changes made:
- Updated `validate_temperature()` method in `BaseChatOpenAI` to perform
case-insensitive
model name comparisons
- Updated `_get_encoding_model()` method to use case-insensitive checks
for tiktoken encoder
selection
- Added comprehensive unit tests to verify case-insensitive behavior
with various case
combinations
**Issue:** Fixes#34003
**Dependencies:** None
**Test Coverage:**
- All existing tests pass
- New test `test_gpt_5_temperature_case_insensitive` covers uppercase,
lowercase, and
mixed-case model names
- Tests verify both non-chat GPT-5 models (temperature removed) and chat
models (temperature
preserved)
- Lint and format checks pass (`make lint`, `make format`)
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Now returns (`_iter`, `tokens`, `indices`, token_counts`). The
`token_counts` are calculated directly during tokenization, which is
more accurate and efficient than splitting strings later.