langchain

mirror of https://github.com/hwchase17/langchain.git synced 2026-06-09 10:17:00 +00:00

Author	SHA1	Message	Date
ccurme	2259d29231	fix(openai): broaden condition for ContextOverflowError to accommodate other providers (#37457 )	2026-05-15 22:03:28 -04:00
Nick Hollon	da380bccf8	chore(infra): merge v1.4 into master (#37350 )	2026-05-11 11:39:25 -07:00
Nick Hollon	9ce72eba9f	feat(core): add content-block-centric streaming (v2) (#36834 )	2026-04-24 11:36:17 -04:00
Asamu David	4000c22376	feat(openai): prevent silent streaming hangs in `ChatOpenAI` (#36949 ) > [!IMPORTANT] > Behavior change on upgrade — minor bump (`1.1.16` → `1.2.0`). > > Streaming calls now raise `StreamChunkTimeoutError` (a `TimeoutError` subclass — existing `except TimeoutError:` / `except asyncio.TimeoutError:` handlers catch it) after 120s of content silence instead of hanging forever. Opt out with `stream_chunk_timeout=None` or `LANGCHAIN_OPENAI_STREAM_CHUNK_TIMEOUT_S=0`. > > Kernel-level TCP keepalive / `TCP_USER_TIMEOUT` are applied via a custom `httpx` transport. `httpx` disables its env-proxy auto-detection (`HTTP_PROXY` / `HTTPS_PROXY` / `ALL_PROXY` / `NO_PROXY` and macOS/Windows system proxy) whenever a transport is supplied, so to avoid silently breaking enterprise proxy users, `ChatOpenAI` now detects the "proxy-env-shadow" shape at construction and skips the custom transport entirely when all of these hold: > > - `http_socket_options` left at default (`None`) > - No `http_client` or `http_async_client` supplied > - No `openai_proxy` supplied > - A proxy env var / system proxy is visible to httpx > > On that shape the instance falls back to pre-PR behavior and env-proxy auto-detection still applies. A one-time `INFO` records the bypass. > > Users who explicitly set `http_socket_options=[...]` alongside an env proxy still get the shadowed behavior with a one-time `WARNING` log — they opted in. Full opt-outs below. --- Streaming chat completions can hang forever when the underlying TCP connection silently dies mid-stream (idle NAT/LB timeouts, sandboxed runtimes killing long-lived connections, peer gone without a FIN or RST). httpx's read timeout doesn't help here because it's reset by any bytes arriving on the socket, including OpenAI's SSE keepalive comments, so a stream that's quiet on content but still producing keepalives looks alive forever. 
This PR adds two knobs to `ChatOpenAI`, both on by default with opt-outs: - `stream_chunk_timeout` (default 120s): wraps the async streaming iterator in `asyncio.wait_for` per chunk. Measures the gap between parsed SSE chunks, so keepalives don't reset it. Fires on genuine content silence and raises `StreamChunkTimeoutError` — a `TimeoutError` subclass carrying `timeout_s`, `model_name`, and `chunks_received` as structured attributes (mirrored in the WARNING log's `extra=`) for alerting without message-regex. Override with the kwarg or `LANGCHAIN_OPENAI_STREAM_CHUNK_TIMEOUT_S`. - `http_socket_options`: applies `SO_KEEPALIVE` + `TCP_KEEPIDLE` / `TCP_KEEPINTVL` / `TCP_KEEPCNT` + `TCP_USER_TIMEOUT` on Linux (macOS equivalents where available). On platforms missing some options, they're dropped silently and the remaining set still does useful work. Pool limits are set explicitly on the custom transport to mirror the `openai` SDK — without that, passing `transport=` to `httpx.AsyncClient` silently shrinks the connection pool. ## Behavior change The default-shape proxy-env bypass (above) covers the common enterprise case. Beyond that: - Connections that would previously have hung forever will now error out via `StreamChunkTimeoutError`. - Users who explicitly opt into `http_socket_options` while also relying on env proxies will see a one-time `WARNING` and lose env-proxy auto-detection — the custom transport shadows it. This is the original shipped behavior, retained for anyone who wants socket tuning on top of an env-proxied setup. Full opt-outs: - `stream_chunk_timeout=None` or `LANGCHAIN_OPENAI_STREAM_CHUNK_TIMEOUT_S=0` - `http_socket_options=()` or `LANGCHAIN_OPENAI_TCP_KEEPALIVE=0` - Supply your own `http_client` and `http_async_client`. `http_socket_options` is applied per side: passing only one still leaves the other side's default builder getting socket options. Supply both (or combine with `http_socket_options=()`) to take full control. 
Unparseable or negative values for the `LANGCHAIN_OPENAI_*` env vars fall back to the default with a `WARNING` log rather than silently being accepted, so a misconfigured environment still boots but the fallback is discoverable. --------- Co-authored-by: Mason Daugherty <github@mdrxy.com> Co-authored-by: Mason Daugherty <mason@langchain.dev>	2026-04-22 20:28:43 -04:00
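The per-chunk timeout described in the commit above can be illustrated with a minimal sketch. It assumes the stream is already an async iterator of parsed chunks, so transport-level keepalives never reach it. `ChunkTimeoutError` and `with_chunk_timeout` are hypothetical names for illustration, not the identifiers used in `langchain-openai`.

```python
import asyncio
from typing import AsyncIterator, TypeVar

T = TypeVar("T")


class ChunkTimeoutError(TimeoutError):
    """Hypothetical stand-in for the StreamChunkTimeoutError named in the commit."""

    def __init__(self, timeout_s: float, chunks_received: int) -> None:
        super().__init__(
            f"no chunk within {timeout_s}s after {chunks_received} chunk(s)"
        )
        # Structured attributes so callers can alert without parsing the message.
        self.timeout_s = timeout_s
        self.chunks_received = chunks_received


async def with_chunk_timeout(
    stream: AsyncIterator[T], timeout_s: float
) -> AsyncIterator[T]:
    """Yield chunks from `stream`, bounding the wait for each individual chunk."""
    iterator = stream.__aiter__()
    chunks_received = 0
    while True:
        try:
            # The timer restarts for every chunk, so it measures the gap
            # between parsed chunks rather than the total stream duration.
            chunk = await asyncio.wait_for(iterator.__anext__(), timeout_s)
        except StopAsyncIteration:
            return  # the stream ended normally
        except asyncio.TimeoutError:
            raise ChunkTimeoutError(timeout_s, chunks_received) from None
        chunks_received += 1
        yield chunk
```

Because the error subclasses `TimeoutError`, an existing `except TimeoutError:` handler around the `async for` loop would also catch it, which matches the compatibility note in the commit message.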
Mason Daugherty	488c6a73bb	fix(openai): tolerate `prompt_cache_retention` drift in streaming (#36925 )	2026-04-21 14:54:32 -04:00
ccurme	19b0805bc1	fix(openai): accommodate dict `response` items in streaming (#36899 )	2026-04-20 15:44:01 -04:00
Thomas	8fec4e7cee	fix(openai): infer azure chat profiles from model name (#36858 )	2026-04-19 11:06:26 -04:00
Mason Daugherty	7e81d09f2a	chore(deps): bump pytest to `9.0.3` (#36801 ) CVE-2025-71176 (medium severity) All are dev-only (test dependency group) — no impact on published packages. ### Why syrupy was also bumped syrupy 4.x (`<5.0.0`) constrains pytest to `<9.0.0`, blocking the CVE fix. Widening to `<6.0.0` allows syrupy 5.x which supports pytest 9.x.	2026-04-15 21:46:40 -06:00
William FH	885f2c2c2d	fix(openai): handle content blocks without type key in responses api conversion (#36725 )	2026-04-14 15:13:40 -04:00
Mason Daugherty	deb85b6c4c	chore(openai): fix broken vcr cassette playback and add ci guard (#36502 ) Fix broken VCR cassette playback in `langchain-openai` integration tests and add a CI job to prevent regressions. Two independent bugs made all VCR-backed tests fail: `before_record_request` redacts URIs to `REDACTED` but `match_on` still included `uri` (so playback never matched), and a typo-fix commit (`c9f51aef85`) changed test input strings without re-recording cassettes (so `json_body` matching also failed).	2026-04-03 12:55:52 -04:00
Mason Daugherty	8c15649127	fix(openai,groq,openrouter): use is-not-None checks in usage metadata token extraction (#36500 ) Python's `or` operator treats `0` as falsy, so `token_usage.get("total_tokens") or fallback` silently replaces a provider-reported `total_tokens=0` with the computed sum of input + output tokens. Providers can legitimately report zero tokens (e.g., cached responses, empty completions). The same pattern exists in the dual-key lookups for `input_tokens`/`output_tokens` in Groq and OpenRouter. While current APIs don't return both key formats simultaneously (making the `or`-chain functionally correct today), the semantics are still wrong; `0` should not fall through to a fallback. ## Changes - Replace `x.get(key) or fallback` with explicit `is not None` checks in `_create_usage_metadata` across `langchain-openai`, `langchain-groq`, and `langchain-openrouter` for `input_tokens`, `output_tokens`, and `total_tokens` - Fix a concrete bug in the `total_tokens` path: a provider-reported `0` was silently replaced by the computed sum - Harden dual-key lookups in Groq and OpenRouter to correctly preserve zero values from the preferred key, should both key formats ever coexist - Update OpenAI's single-key extraction for consistency — the old `or 0` pattern happened to produce correct results (`0 or 0 == 0`) but was semantically wrong	2026-04-03 11:46:36 -04:00
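The difference between the two patterns in the commit above can be shown with a small sketch. The function names are hypothetical, and the `prompt_tokens` / `completion_tokens` / `total_tokens` keys are assumed for illustration; this is not the actual `_create_usage_metadata` code.

```python
def total_with_or(token_usage: dict) -> int:
    """Old pattern: `or` treats a provider-reported 0 as if it were missing."""
    input_tokens = token_usage.get("prompt_tokens") or 0
    output_tokens = token_usage.get("completion_tokens") or 0
    return token_usage.get("total_tokens") or (input_tokens + output_tokens)


def total_with_none_check(token_usage: dict) -> int:
    """New pattern: only fall back when the key is absent or None."""
    input_tokens = token_usage.get("prompt_tokens")
    if input_tokens is None:
        input_tokens = 0
    output_tokens = token_usage.get("completion_tokens")
    if output_tokens is None:
        output_tokens = 0
    total = token_usage.get("total_tokens")
    if total is None:
        total = input_tokens + output_tokens
    return total


# A provider that explicitly reports total_tokens=0 alongside nonzero parts.
usage = {"prompt_tokens": 5, "completion_tokens": 7, "total_tokens": 0}
print(total_with_or(usage))          # 12: the reported 0 is replaced by the sum
print(total_with_none_check(usage))  # 0: the reported value is preserved
```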
jasiecky	c9f51aef85	fix(core): fixed typos in the documentation (#36459 ) Fixes #36458 Fixed typos in the documentation in the core module.	2026-04-02 11:32:12 -04:00
ccurme	bdfd4462ac	feat(core): impute placeholder filenames for OpenAI file inputs (#36433 )	2026-04-01 14:41:53 -04:00
ccurme	64bbcef37e	fix(openai): update computer call test (#36352 )	2026-03-29 12:59:30 -04:00
ccurme	6d07ef28a7	release(openai): 1.1.12 (#36178 )	2026-03-23 13:06:19 -04:00
ccurme	900f8a3513	fix(openai): support phase parameter (#36161 )	2026-03-22 14:23:24 -04:00
Jackjin	7d05cfb131	fix(openai): preserve namespace field in streaming function_call chunks (#36108 )	2026-03-20 12:51:13 -04:00
Giulio Leone	9e4a6013be	fix(openai): add type: message to Responses API input items (#35693 )	2026-03-15 12:43:16 -04:00
Tejas Attarde	d6dbcf6294	perf(.github): set a timeout on get min versions HTTP calls (#35851 ) During an automated code review of .github/scripts/get_min_versions.py, the following issue was identified. Set a timeout on get min versions HTTP calls. Network calls without a timeout can hang a worker indefinitely. I kept the patch small and re-ran syntax checks after applying it.	2026-03-13 17:24:32 -04:00
Mohammad Mohtashim	3af0bc0141	fix(openai): update responses API model detection for pro and codex models (#35594 )	2026-03-09 09:20:20 -04:00
ccurme	fbfe4b812d	feat(openai): support tool search (#35582 )	2026-03-08 08:53:13 -04:00
Jason Meng	f698b43b9a	fix(openai): avoid PydanticSerializationUnexpectedValue for structured output (#35543 )	2026-03-04 21:46:46 -05:00
John Kennedy	e939c96792	fix: compaction typo (#35467 ) _This PR applies 2/2 suggestions from code quality [AI findings](https://github.com/langchain-ai/langchain/security/quality/ai-findings)._ --------- Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>	2026-02-27 19:30:37 +00:00
ccurme	c600b932a6	fix(openai): add test for CSV and accommodate breaking changes in file url inputs (#35454 )	2026-02-26 13:50:39 -05:00
Mattijs Ugen	5c6f8fe0a6	fix(openai): accept valid responses that are falsy at runtime (#35307 )	2026-02-18 21:06:43 -05:00
ccurme	8f1bc0d3ae	feat(openai): support automatic server-side compaction (#35212 )	2026-02-17 10:48:52 -05:00
ccurme	32c6ab3033	fix(openai): add `model` property (#35284 )	2026-02-17 10:46:49 -05:00
ccurme	8e35924083	fix(openai): sanitize chat completions text content blocks (#35217 )	2026-02-15 15:31:02 -05:00
ccurme	7c41298355	feat(core): add ContextOverflowError, raise in anthropic and openai (#35099 )	2026-02-09 15:15:34 -05:00
Guofang.Tang	06a7d079b0	fix(openai): detect codex models for responses api preference (#35058 )	2026-02-08 13:15:48 -05:00
ccurme	19edaa8acb	chore(openai): delete outdated test (#34682 )	2026-01-09 12:37:44 -05:00
OysterMax	92afcaae60	fix(openai): raise proper exception `OpenAIRefusalError` on structured output refusal (#34619 )	2026-01-07 14:34:02 -05:00
ccurme	5ec0fa69de	fix(core): serialization patch (#34455 ) - `allowed_objects` kwarg in `load` - escape lc-ser formatted dicts on `dump` - fix for jinja2 --------- Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-12-22 17:33:31 -06:00
Saurav Sapkota	f6297ced67	fix(openai): handle function_call content in token counting (#34379 )	2025-12-19 15:17:40 -05:00
ccurme	e9f7cd3e0e	release(openai): 1.1.6: update max input tokens for gpt-5 series (#34419 )	2025-12-18 12:49:59 -05:00
Mason Daugherty	c286c06f16	revert(openai): switch model from nano to mini when using flex (#34394 ) Reverts langchain-ai/langchain#34336	2025-12-16 14:45:19 -05:00
Mason Daugherty	20b8342fdf	test(openai): switch model from nano to mini when using flex (#34336 ) Issues with combining flex and nano ```shell FAILED tests/integration_tests/chat_models/test_base.py::test_openai_invoke - openai.InternalServerError: Error code: 500 - {'error': {'message': 'The server had an error while processing your request. Sorry about that!', 'type': 'server_error', 'param': None, 'code': None}} FAILED tests/integration_tests/chat_models/test_base.py::test_stream - openai.InternalServerError: Error code: 500 - {'error': {'message': 'The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID req_e726769d95994fd4bccbe55680a35f59 in your email.)', 'type': 'server_error', 'param': None, 'code': None}} FAILED tests/integration_tests/chat_models/test_base.py::test_flex_usage_responses[False] - openai.InternalServerError: Error code: 500 - {'error': {'message': 'An error occurred while processing your request. You can retry your request, or contact us through our help center at help.openai.com if the error persists. Please include the request ID req_935316418319494d8682e4adcd67ab47 in your message.', 'type': 'server_error', 'param': None, 'code': 'server_error'}} FAILED tests/integration_tests/chat_models/test_base.py::test_flex_usage_responses[True] - openai.APIError: An error occurred while processing your request. You can retry your request, or contact us through our help center at help.openai.com if the error persists. Please include the request ID req_f3c164d0d1f045a5a0f5965ab5c253bf in your message. ```	2025-12-12 17:17:11 -05:00
Towseef Altaf	0e5e33ba03	fix(openai): correct image resize aspect ratio caps (#34192 )	2025-12-12 14:34:17 -05:00
Christophe Bornet	f5b6eecf72	refactor(standard-tests): improve VCR config (#33968 ) Use of the fixture `_base_vcr_config` is deprecated with alternative function `base_vcr_config()` This way: * we don't need to import `_base_vcr_config` seen as unused (which leads to ruff violations PLC0414 and F811) * we don't need to make a copy since a new dict is created at each function invocation Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-12-12 10:14:26 -05:00
Jacob Lee	a528ea1796	feat(openai): Use responses API if model is gpt-5.2-pro (#34306 )	2025-12-12 10:11:15 -05:00
j3r0lin	5720dea41b	fix(openai): handle missing 'text' key in responses API content blocks (#34198 )	2025-12-12 09:39:12 -05:00
Jacob Lee	badc0cf1b6	fix(openai): Allow temperature when reasoning is set to the string 'none' (#34298 ) Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-12-11 15:57:04 -05:00
Marlene	ff3353f02f	fix(openai): Fixing error that comes up using the Responses API with built-in tools and custom tools (#34136 )	2025-12-08 09:10:44 -05:00
Mason Daugherty	7ba3e80057	test(openai): mark `test_structured_output_and_tools` flaky (#34223 ) Often raises `KeyError: 'explanation'`	2025-12-05 11:26:17 -05:00
Abhinav	2ba3ce81a6	fix(openai): make GPT-5 temperature validation case-insensitive (#34012 ) Fixed a bug where GPT-5 temperature validation was case-sensitive, causing issues when users specified Azure deployment names or model names in uppercase (e.g., `"GPT-5-2025-01-01"`, `"GPT-5-NANO"`). The validation now correctly handles model names regardless of case. Changes made: - Updated `validate_temperature()` method in `BaseChatOpenAI` to perform case-insensitive model name comparisons - Updated `_get_encoding_model()` method to use case-insensitive checks for tiktoken encoder selection - Added comprehensive unit tests to verify case-insensitive behavior with various case combinations Issue: Fixes #34003 Dependencies: None Test Coverage: - All existing tests pass - New test `test_gpt_5_temperature_case_insensitive` covers uppercase, lowercase, and mixed-case model names - Tests verify both non-chat GPT-5 models (temperature removed) and chat models (temperature preserved) - Lint and format checks pass (`make lint`, `make format`) --------- Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-11-23 20:17:03 -05:00
ccurme	33e5d01f7c	feat(model-profiles): distribute data across packages (#34024 )	2025-11-21 15:47:05 -05:00
ccurme	328ba36601	chore(openai): skip Azure text completions tests (#34021 )	2025-11-19 09:29:12 -05:00
Mason Daugherty	099c042395	refactor(openai): embedding utils and calculations (#33982 ) Now returns (`_iter`, `tokens`, `indices`, `token_counts`). The `token_counts` are calculated directly during tokenization, which is more accurate and efficient than splitting strings later.	2025-11-14 19:18:37 -05:00
Kaparthy Reddy	2d4f00a451	fix(openai): Respect 300k token limit for embeddings API requests (#33668 ) ## Description Fixes #31227 - Resolves the issue where `OpenAIEmbeddings` exceeds OpenAI's 300,000 token per request limit, causing 400 BadRequest errors. ## Problem When embedding large document sets, LangChain would send batches containing more than 300,000 tokens in a single API request, causing this error: ``` openai.BadRequestError: Error code: 400 - {'error': {'message': 'Requested 673477 tokens, max 300000 tokens per request'}} ``` The issue occurred because: - The code chunks texts by `embedding_ctx_length` (8191 tokens per chunk) - Then batches chunks by `chunk_size` (default 1000 chunks per request) - But didn't check: Total tokens per batch against OpenAI's 300k limit - Result: `1000 chunks × 8191 tokens = 8,191,000 tokens` → Exceeds limit! ## Solution This PR implements dynamic batching that respects the 300k token limit: 1. Added constant: `MAX_TOKENS_PER_REQUEST = 300000` 2. Track token counts: Calculate actual tokens for each chunk 3. Dynamic batching: Instead of fixed `chunk_size` batches, accumulate chunks until approaching the 300k limit 4. Applied to both sync and async: Fixed both `_get_len_safe_embeddings` and `_aget_len_safe_embeddings` ## Changes - Modified `langchain_openai/embeddings/base.py`: - Added `MAX_TOKENS_PER_REQUEST` constant - Replaced fixed-size batching with token-aware dynamic batching - Applied to both sync (line ~478) and async (line ~527) methods - Added test in `tests/unit_tests/embeddings/test_base.py`: - `test_embeddings_respects_token_limit()` - Verifies large document sets are properly batched ## Testing All existing tests pass (280 passed, 4 xfailed, 1 xpassed). 
New test verifies: - Large document sets (500 texts × 1000 tokens = 500k tokens) are split into multiple API calls - Each API call respects the 300k token limit ## Usage After this fix, users can embed large document sets without errors: ```python from langchain_openai import OpenAIEmbeddings from langchain_chroma import Chroma from langchain_text_splitters import CharacterTextSplitter # This will now work without exceeding token limits embeddings = OpenAIEmbeddings() documents = CharacterTextSplitter().split_documents(large_documents) Chroma.from_documents(documents, embeddings) ``` Resolves #31227 --------- Co-authored-by: Kaparthy Reddy <kaparthyreddy@Kaparthys-MacBook-Air.local> Co-authored-by: Chester Curme <chester.curme@gmail.com> Co-authored-by: Mason Daugherty <mason@langchain.dev> Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-11-14 18:12:07 -05:00
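The token-aware batching described in the commit above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `langchain_openai/embeddings/base.py`: `token_aware_batches` is a hypothetical helper, and it assumes the per-chunk token counts are already known.

```python
from typing import Iterator, Sequence

MAX_TOKENS_PER_REQUEST = 300_000  # per-request limit quoted in the commit message


def token_aware_batches(
    token_counts: Sequence[int],
    max_chunks: int = 1000,
    max_tokens: int = MAX_TOKENS_PER_REQUEST,
) -> Iterator[range]:
    """Yield index ranges whose token totals stay within `max_tokens`.

    A batch is closed when adding the next chunk would exceed the token
    budget or when it already holds `max_chunks` chunks. A single chunk
    larger than the budget still gets a batch of its own.
    """
    start = 0
    batch_tokens = 0
    for i, count in enumerate(token_counts):
        batch_full = (i - start) >= max_chunks
        over_budget = batch_tokens + count > max_tokens
        if i > start and (batch_full or over_budget):
            yield range(start, i)
            start = i
            batch_tokens = 0
        batch_tokens += count
    if start < len(token_counts):
        yield range(start, len(token_counts))


# The scenario from the commit's test description: 500 texts of 1000 tokens each.
batches = list(token_aware_batches([1000] * 500))
print([len(b) for b in batches])  # [300, 200]
```

With fixed-size batching, the same 500 chunks would have gone out as one 500,000-token request, which is the failure mode the commit describes.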
ccurme	74385e0ebd	fix(langchain, openai): fix create_agent / response_format for Responses API (#33939 )	2025-11-13 10:18:15 -05:00

264 Commits