Commit Graph

178 Commits

Author SHA1 Message Date
ccurme
45a067509f
fix(core): fix tracing for PDFs in v1 messages (#32434) 2025-08-11 12:18:32 -04:00
Chester Curme
cfe13f673a Merge branch 'master' into wip-v0.4
# Conflicts:
#	libs/core/langchain_core/version.py
#	libs/core/pyproject.toml
#	libs/core/uv.lock
#	libs/partners/openai/tests/integration_tests/chat_models/test_responses_api.py
#	libs/partners/openai/uv.lock
2025-08-08 09:04:57 -04:00
ccurme
02001212b0
fix(openai): revert some changes (#32462)
Keep coverage on `output_version="v0"` (increasing coverage is being
managed in v0.4 branch).
2025-08-08 08:51:18 -04:00
Mason Daugherty
00244122bd
feat(openai): minimal and verbosity (#32455) 2025-08-08 02:24:21 +00:00
Mason Daugherty
5599c59d4a
chore: formatting across codebase (#32456)
To prevent polluting future PRs
2025-08-07 22:09:26 -04:00
ccurme
ec2b34a02d
feat(openai): custom tools (#32449) 2025-08-07 16:30:01 -04:00
Mason Daugherty
cbf4c0e565
Merge branch 'master' into wip-v0.4 2025-08-07 15:33:12 -04:00
Mason Daugherty
145d38f7dd
test(openai): add tests for prompt_cache_key parameter and update docs (#32363)
Introduce tests to validate the behavior and inclusion of the
`prompt_cache_key` parameter in request payloads for the `ChatOpenAI`
model.
2025-08-07 15:29:47 -04:00
ccurme
e02eed5489
feat: standard outputs (#32287)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
2025-08-05 15:17:32 -04:00
ccurme
a9e52ca605
chore(openai): bump openai sdk (#32322) 2025-07-30 10:58:18 -04:00
Mason Daugherty
e79e0bd6b4
fix(openai): add max_retries parameter to ChatOpenAI for handling 503 capacity errors (#32286)
Some integration tests were failing
2025-07-28 13:58:23 -04:00
niceg
0d6f915442
fix: LLM mimicking Unicode responses due to forced Unicode conversion of non-ASCII characters. (#32222)
fix: Fix LLM mimicking Unicode responses due to forced Unicode
conversion of non-ASCII characters.

- **Description:** This PR fixes an issue where the LLM would mimic
Unicode responses due to forced Unicode conversion of non-ASCII
characters in tool calls. The fix involves disabling the `ensure_ascii`
flag in `json.dumps()` when converting tool calls to OpenAI format.
- **Issue:** Fixes ↓↓↓
input:
```json
{'role': 'assistant', 'tool_calls': [{'type': 'function', 'id': 'call_nv9trcehdpihr21zj9po19vq', 'function': {'name': 'create_customer', 'arguments': '{"customer_name": "你好啊集团"}'}}]}
```
output:
```json
{'role': 'assistant', 'tool_calls': [{'type': 'function', 'id': 'call_nv9trcehdpihr21zj9po19vq', 'function': {'name': 'create_customer', 'arguments': '{"customer_name": "\\u4f60\\u597d\\u554a\\u96c6\\u56e2"}'}}]}
```
then:
The LLM mimics the escaped output and starts emitting `\uXXXX` sequences
itself. Each escape takes six characters in place of one, which
lengthens responses and slows generation.
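
For reference, a minimal standard-library sketch of the `ensure_ascii`
behavior described above. The payload mirrors the example; nothing here
is taken from the actual patch.

```python
import json

arguments = {"customer_name": "你好啊集团"}

# Default behavior: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(arguments)

# With ensure_ascii=False the original characters are preserved.
preserved = json.dumps(arguments, ensure_ascii=False)

print(escaped)    # {"customer_name": "\u4f60\u597d\u554a\u96c6\u56e2"}
print(preserved)  # {"customer_name": "你好啊集团"}

# Both strings decode to the same object; only the serialized text differs.
assert json.loads(escaped) == json.loads(preserved) == arguments
```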

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-07-24 17:01:31 -04:00
Copilot
54542b9385
docs(openai): add comprehensive documentation and examples for extra_body + others (#32149)
This PR addresses the common issue where users struggle to pass custom
parameters to OpenAI-compatible APIs like LM Studio, vLLM, and others.
The problem occurs when users try to use `model_kwargs` for custom
parameters, which causes API errors.

## Problem

Users attempting to pass custom parameters (like LM Studio's `ttl`
parameter) were getting errors:

```python
# This approach fails
llm = ChatOpenAI(
    base_url="http://localhost:1234/v1",
    model="mlx-community/QwQ-32B-4bit",
    model_kwargs={"ttl": 5}  # Causes TypeError: unexpected keyword argument 'ttl'
)
```

## Solution

The `extra_body` parameter is the correct way to pass custom parameters
to OpenAI-compatible APIs:

```python
# This approach works correctly
llm = ChatOpenAI(
    base_url="http://localhost:1234/v1",
    model="mlx-community/QwQ-32B-4bit",
    extra_body={"ttl": 5}  # Custom parameters go in extra_body
)
```

## Changes Made

1. **Enhanced Documentation**: Updated the `extra_body` parameter
docstring with comprehensive examples for LM Studio, vLLM, and other
providers

2. **Added Documentation Section**: Created a new "OpenAI-compatible
APIs" section in the main class docstring with practical examples

3. **Unit Tests**: Added tests to verify `extra_body` functionality
works correctly:
- `test_extra_body_parameter()`: Verifies custom parameters are included
in request payload
- `test_extra_body_with_model_kwargs()`: Ensures `extra_body` and
`model_kwargs` work together

4. **Clear Guidance**: Documented when to use `extra_body` vs
`model_kwargs`

## Examples Added

**LM Studio with TTL (auto-eviction):**
```python
ChatOpenAI(
    base_url="http://localhost:1234/v1",
    api_key="lm-studio",
    model="mlx-community/QwQ-32B-4bit",
    extra_body={"ttl": 300}  # Auto-evict after 5 minutes
)
```

**vLLM with custom sampling:**
```python
ChatOpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",
    model="meta-llama/Llama-2-7b-chat-hf",
    extra_body={
        "use_beam_search": True,
        "best_of": 4
    }
)
```

## Why This Works

- `model_kwargs` parameters are passed directly to the OpenAI client's
`create()` method, causing errors for non-standard parameters
- `extra_body` parameters are included in the HTTP request body, which
is exactly what OpenAI-compatible APIs expect for custom parameters
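
The difference can be sketched without any SDK. In the snippet below,
`create` is a hypothetical stand-in for a client method with a fixed
keyword signature (it is not the OpenAI client), and `"local-model"` is a
placeholder name. It only illustrates why an unknown keyword fails while
the same key nested under `extra_body` reaches the request body.

```python
from typing import Any, Optional


def create(*, model: str, messages: list, extra_body: Optional[dict] = None) -> dict:
    """Hypothetical stand-in for a client method with a fixed signature."""
    body: dict[str, Any] = {"model": model, "messages": messages}
    body.update(extra_body or {})  # extra_body keys are merged into the JSON body
    return body


messages = [{"role": "user", "content": "hi"}]

# Spreading an unknown key as a keyword argument fails before a request is built.
try:
    create(model="local-model", messages=messages, **{"ttl": 5})
    error = None
except TypeError as exc:
    error = str(exc)

# Nesting the same key under extra_body places it in the request body.
payload = create(model="local-model", messages=messages, extra_body={"ttl": 5})
```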

Fixes #32115.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-07-24 16:43:16 -04:00
ccurme
de13f6ae4f
fix(openai): support acknowledged safety checks in computer use (#31984) 2025-07-14 07:33:37 -03:00
Mason Daugherty
33c9bf1adc
langchain-openai[patch]: Add ruff bandit rules to linter (#31788) 2025-06-30 14:01:32 -04:00
Andrew Jaeger
0189c50570
openai[fix]: Correctly set usage metadata for OpenAI Responses API (#31756) 2025-06-27 15:35:14 +00:00
ccurme
ea1345a58b
openai[patch]: update cassette (#31752)
Following changes in `openai==1.92`.
2025-06-26 14:52:12 -04:00
ccurme
066be383e3
openai[patch]: update test following release of openai 1.92 (#31751)
Added new required fields for `ResponseFunctionWebSearch`
2025-06-26 18:22:58 +00:00
ccurme
88d5f3edcc
openai[patch]: allow specification of output format for Responses API (#31686) 2025-06-26 13:41:43 -04:00
ccurme
0bf223d6cf
openai[patch]: add attribute to always use previous_response_id (#31734) 2025-06-25 19:01:43 +00:00
ccurme
b268ab6a28
openai[patch]: fix client caching when request_timeout is specified via httpx.Timeout (#31698)
Resolves https://github.com/langchain-ai/langchain/issues/31697
2025-06-23 14:37:49 +00:00
ccurme
6409498f6c
openai[patch]: route to Responses API if relevant attributes are set (#31645)
Following https://github.com/langchain-ai/langchain/pull/30329.
2025-06-17 16:04:38 -04:00
ccurme
c1c3e13a54
openai[patch]: add Responses API attributes to BaseChatOpenAI (#30329)
`reasoning`, `include`, `store`, `truncation`.

Previously these had to be added through `model_kwargs`.
2025-06-17 14:45:50 -04:00
ccurme
b610859633
openai[patch]: support Responses streaming in AzureChatOpenAI (#31641)
Resolves https://github.com/langchain-ai/langchain/issues/31303,
https://github.com/langchain-ai/langchain/issues/31624
2025-06-17 14:41:09 -04:00
ccurme
b9357d456e
openai[patch]: refactor handling of Responses API (#31587) 2025-06-16 14:01:39 -04:00
ccurme
575662d5f1
openai[patch]: accommodate change in image generation API (#31522)
OpenAI changed their API to require the `partial_images` parameter when
using image generation + streaming.

As described in https://github.com/langchain-ai/langchain/pull/31424, we
are ignoring partial images. Here, we accept the `partial_images`
parameter (as required by OpenAI), but emit a warning and continue to
ignore partial images.
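
A rough sketch of the "accept, warn, ignore" behavior described above.
The helper name and the event type string are assumptions made for
illustration; they are not copied from the patch.

```python
import warnings

# Assumed event name for partial images in the Responses API stream.
PARTIAL_IMAGE_EVENT = "response.image_generation_call.partial_image"


def drop_partial_images(events: list, tools: list) -> list:
    """Hypothetical helper: warn if partial_images is set, then skip those events."""
    if any("partial_images" in tool for tool in tools):
        warnings.warn("partial_images was provided, but partial images are ignored.")
    return [event for event in events if event["type"] != PARTIAL_IMAGE_EVENT]


tools = [{"type": "image_generation", "partial_images": 1}]
events = [
    {"type": PARTIAL_IMAGE_EVENT, "partial_image_b64": "..."},
    {"type": "response.completed"},
]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    kept = drop_partial_images(events, tools)
```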
2025-06-09 14:57:46 -04:00
ccurme
ece9e31a7a
openai[patch]: VCR some tests (#31524) 2025-06-06 23:00:57 +00:00
ccurme
4cc2f6b807
openai[patch]: guard against None text completions in BaseOpenAI (#31514)
Some chat completions APIs will return null `text` output (even though
this is typed as string).
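
A minimal sketch of such a guard, assuming a response shaped like a
plain dict. `extract_texts` is a hypothetical helper, not the code in
`BaseOpenAI`.

```python
def extract_texts(response: dict) -> list:
    """Hypothetical helper: treat a null `text` field as an empty string."""
    return [choice.get("text") or "" for choice in response.get("choices", [])]


response = {"choices": [{"text": "hello"}, {"text": None}]}
texts = extract_texts(response)
```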
2025-06-06 09:14:37 -04:00
Eugene Yurtsev
17f34baa88
openai[minor]: add image generation to responses api (#31424)
Does not support partial images during generation at the moment. Before
doing that I'd like to figure out how to specify the aggregation logic
without requiring changes in core.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-06-02 10:03:54 -04:00
ccurme
3db1aa0ba6
standard-tests: migrate to pytest-recording (#31425)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-05-31 15:21:15 -04:00
ccurme
c8951ca124
infra: drop azure from streaming benchmarks (#31421)
Covered by BaseChatOpenAI
2025-05-29 15:06:12 -04:00
ccurme
49eeb0f3c3
standard-tests: add benchmarks (#31302)
Co-authored-by: Sydney Runkle <sydneymarierunkle@gmail.com>
2025-05-29 15:21:37 +00:00
ccurme
ab8b4003be
openai[patch]: add test case for code interpreter (#31383) 2025-05-27 19:11:31 +00:00
ccurme
851fd438cf
openai[patch]: relax Azure llm streaming callback test (#31319)
Effectively reverts
https://github.com/langchain-ai/langchain/pull/29302, but check that
counts are "less than" instead of equal to an expected count.
2025-05-22 16:14:53 +00:00
ccurme
053a1246da
openai[patch]: support built-in code interpreter and remote MCP tools (#31304) 2025-05-22 11:47:57 -04:00
ccurme
beacedd6b3
openai[patch]: update tests for strict schemas (#31306)
Following recent [changes](https://platform.openai.com/docs/changelog).
2025-05-21 22:06:17 +00:00
ccurme
dcb5aba999
openai[patch]: reduce tested constraints on strict schema adherence for Responses API (#31290)
Scheduled testing started failing today because the Responses API
stopped raising `BadRequestError` for a schema that was previously
invalid when `strict=True`.

Although docs still say that [some type-specific keywords are not yet
supported](https://platform.openai.com/docs/guides/structured-outputs#some-type-specific-keywords-are-not-yet-supported)
(including `minimum` and `maximum` for numbers), the below appears to
run and correctly respect the constraints:
```python
import json
import openai

maximums = list(range(1, 11))
arg_values = []
for maximum in maximums:

    tool = {
        "type": "function",
        "name": "magic_function",
        "description": "Applies a magic function to an input.",
        "parameters": {
            "properties": {
                "input": {"maximum": maximum, "minimum": 0, "type": "integer"}
            },
            "required": ["input"],
            "type": "object",
            "additionalProperties": False
        },
        "strict": True
    }
    
    client = openai.OpenAI()
    
    response = client.responses.create(
        model="gpt-4.1",
        input=[{"role": "user", "content": "What is the value of magic_function(3)? Use the tool."}],
        tools=[tool],
    )
    function_call = next(item for item in response.output if item.type == "function_call")
    args = json.loads(function_call.arguments)
    arg_values.append(args["input"])


print(maximums)
print(arg_values)

# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# [1, 2, 3, 3, 3, 3, 3, 3, 3, 3]
```
Until yesterday this raised BadRequestError.

The same is not true of Chat Completions, which still appears to raise
`BadRequestError`:
```python
tool = {
    "type": "function",
    "function": {
        "name": "magic_function",
        "description": "Applies a magic function to an input.",
        "parameters": {
            "properties": {
                "input": {"maximum": 5, "minimum": 0, "type": "integer"}
            },
            "required": ["input"],
            "type": "object",
            "additionalProperties": False
        },
        "strict": True
    }
}

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "What is the value of magic_function(3)? Use the tool."}],
    tools=[tool],
)
response  # raises BadRequestError
```

Here we update tests accordingly.
2025-05-20 14:50:31 +00:00
ccurme
32fcc97a90
openai[patch]: compat with Bedrock Converse (#31280)
ChatBedrockConverse passes through reasoning content blocks in [Bedrock
Converse
format](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ContentBlock.html).

Similar to how we handle Anthropic thinking blocks, here we ensure these
are filtered out of OpenAI request payloads.
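
A sketch of the filtering idea. The helper name and the exact set of
block types are assumptions for illustration only.

```python
# Block types assumed to carry provider-specific reasoning content.
SKIPPED_BLOCK_TYPES = {"thinking", "reasoning_content"}


def strip_reasoning_blocks(content):
    """Hypothetical helper: drop reasoning blocks before building a request."""
    if isinstance(content, str):
        return content  # plain string content has no blocks to filter
    return [
        block
        for block in content
        if not (isinstance(block, dict) and block.get("type") in SKIPPED_BLOCK_TYPES)
    ]


content = [
    {"type": "reasoning_content", "reasoning_content": {"text": "..."}},
    {"type": "text", "text": "The answer is 4."},
]
filtered = strip_reasoning_blocks(content)
```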

Resolves https://github.com/langchain-ai/langchain/issues/31279.
2025-05-19 10:35:26 -04:00
ccurme
0b8837a0cc
openai: support runtime kwargs in embeddings (#31195) 2025-05-14 09:14:40 -04:00
ccurme
868cfc4a8f
openai: ignore function_calls if tool_calls are present (#31198)
Some providers include (legacy) function calls in `additional_kwargs` in
addition to tool calls. We currently unpack both function calls and tool
calls if present, but OpenAI will raise 400 in this case.

This can come up if providers are mixed in a tool-calling loop. Example:
```python
from langchain.chat_models import init_chat_model
from langchain_core.messages import HumanMessage
from langchain_core.tools import tool


@tool
def get_weather(location: str) -> str:
    """Get weather at a location."""
    return "It's sunny."



gemini = init_chat_model("google_genai:gemini-2.0-flash-001").bind_tools([get_weather])
openai = init_chat_model("openai:gpt-4.1-mini").bind_tools([get_weather])

input_message = HumanMessage("What's the weather in Boston?")
tool_call_message = gemini.invoke([input_message])

assert len(tool_call_message.tool_calls) == 1
tool_call = tool_call_message.tool_calls[0]
tool_message = get_weather.invoke(tool_call)

response = openai.invoke(  # currently raises 400 / BadRequestError
    [input_message, tool_call_message, tool_message]
)
```

Here we ignore function calls if tool calls are present.
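
The rule can be sketched as follows. `to_assistant_payload` is a
hypothetical helper that works on plain dicts; it is not the conversion
function in langchain-openai.

```python
def to_assistant_payload(content, tool_calls, additional_kwargs):
    """Hypothetical helper: emit tool_calls or function_call, never both."""
    payload = {"role": "assistant", "content": content}
    if tool_calls:
        payload["tool_calls"] = tool_calls
    elif "function_call" in additional_kwargs:
        # The legacy function call is only forwarded when no tool calls exist.
        payload["function_call"] = additional_kwargs["function_call"]
    return payload


call = {"name": "get_weather", "arguments": '{"location": "Boston"}'}
tool_calls = [{"type": "function", "id": "call_1", "function": call}]

both = to_assistant_payload("", tool_calls, {"function_call": call})
legacy_only = to_assistant_payload("", [], {"function_call": call})
```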
2025-05-12 13:50:56 -04:00
ccurme
94139ffcd3
openai[patch]: format system content blocks for Responses API (#31096)
```python
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_openai import ChatOpenAI


llm = ChatOpenAI(model="gpt-4.1", use_responses_api=True)

messages = [
    SystemMessage("test"),                                   # Works
    HumanMessage("test"),                                    # Works
    SystemMessage([{"type": "text", "text": "test"}]),       # Bug in this case
    HumanMessage([{"type": "text", "text": "test"}]),        # Works
    SystemMessage([{"type": "input_text", "text": "test"}])  # Works
]

llm._get_request_payload(messages)
```
2025-05-02 15:22:30 +00:00
ccurme
26ad239669
core, openai[patch]: prefer provider-assigned IDs when aggregating message chunks (#31080)
When aggregating AIMessageChunks in a stream, core prefers the leftmost
non-null ID. This is problematic because:
- Core assigns IDs when they are null to `f"run-{run_manager.run_id}"`
- The desired meaningful ID might not be available until midway through
the stream, as is the case for the OpenAI Responses API.

For the OpenAI Responses API, we assign message IDs to the top-level
`AIMessage.id`. This works in `.(a)invoke`, but during `.(a)stream` the
IDs get overwritten by the defaults assigned in langchain-core. These
IDs
[must](https://community.openai.com/t/how-to-solve-badrequesterror-400-item-rs-of-type-reasoning-was-provided-without-its-required-following-item-error-in-responses-api/1151686/9)
be available on the AIMessage object to support passing reasoning items
back to the API (e.g., if not using OpenAI's `previous_response_id`
feature). We could add them elsewhere, but seeing as we've already made
the decision to store them in `.id` during `.(a)invoke`, addressing the
issue in core lets us fix the problem with no interface changes.
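
A sketch of the selection rule, assuming default IDs are recognizable by
the `run-` prefix mentioned above. `pick_message_id` is a hypothetical
name, not the function in langchain-core.

```python
from typing import Optional


def pick_message_id(chunk_ids: list) -> Optional[str]:
    """Hypothetical helper: prefer a provider-assigned ID over a default one."""
    # First pass: any non-null ID that is not a core-assigned default.
    for chunk_id in chunk_ids:
        if chunk_id and not chunk_id.startswith("run-"):
            return chunk_id
    # Fallback: the leftmost non-null ID, as before.
    for chunk_id in chunk_ids:
        if chunk_id:
            return chunk_id
    return None


# The provider ID arrives midway through the stream.
chosen = pick_message_id(["run-abc", None, "msg_123", "msg_123"])
fallback = pick_message_id(["run-abc", None])
```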
2025-05-02 11:18:18 -04:00
ccurme
c51eadd54f
openai[patch]: propagate service_tier to response metadata (#31089) 2025-05-01 13:50:48 -04:00
ccurme
629b7a5a43
openai[patch]: add explicit attribute for service tier (#31005) 2025-04-25 18:38:23 +00:00
ccurme
a7903280dd
openai[patch]: delete redundant tests (#31004)
These are covered by standard tests.
2025-04-24 17:56:32 +00:00
ccurme
faef3e5d50
core, standard-tests: support PDF and audio input in Chat Completions format (#30979)
Chat models currently implement support for:
- images in OpenAI Chat Completions format
- other multimodal types (e.g., PDF and audio) in a cross-provider
[standard
format](https://python.langchain.com/docs/how_to/multimodal_inputs/)

Here we update core to extend support to PDF and audio input in Chat
Completions format. **If an OAI-format PDF or audio content block is
passed into any chat model, it will be transformed to the LangChain
standard format**. We assume that any chat model supporting OAI-format
PDF or audio has implemented support for the standard format.
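
A sketch of what such a transformation could look like. The block shapes
on both sides are assumptions based on the Chat Completions format and
the standard format linked above, and `oai_block_to_standard` is a
hypothetical helper, not the function in core.

```python
import re


def oai_block_to_standard(block: dict) -> dict:
    """Hypothetical helper: map OAI-format audio/PDF blocks to a standard shape."""
    if block.get("type") == "input_audio":
        audio = block["input_audio"]
        return {
            "type": "audio",
            "source_type": "base64",
            "data": audio["data"],
            "mime_type": f"audio/{audio['format']}",
        }
    if block.get("type") == "file" and "file_data" in block.get("file", {}):
        # file_data is assumed to be a data URL: data:<mime>;base64,<payload>
        match = re.match(r"^data:(?P<mime>[^;]+);base64,(?P<data>.+)$", block["file"]["file_data"])
        if match:
            return {
                "type": "file",
                "source_type": "base64",
                "data": match["data"],
                "mime_type": match["mime"],
            }
    return block  # anything else passes through unchanged


audio_block = oai_block_to_standard(
    {"type": "input_audio", "input_audio": {"data": "AAAA", "format": "wav"}}
)
pdf_block = oai_block_to_standard(
    {"type": "file", "file": {"file_data": "data:application/pdf;base64,JVBERi0=", "filename": "doc.pdf"}}
)
```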
2025-04-23 18:32:51 +00:00
ccurme
a7c1bccd6a
openai[patch]: remove xfails from image token counting tests (#30963)
These appear to be passing again.
2025-04-22 15:55:33 +00:00
Dmitrii Rashchenko
a43df006de
Support of openai reasoning summary streaming (#30909)
**langchain_openai: Support of reasoning summary streaming**

**Description:**
The OpenAI API now supports streaming reasoning summaries for reasoning
models (o1, o3, o3-mini, o4-mini). More info:
https://platform.openai.com/docs/guides/reasoning#reasoning-summaries

This is supported only in the Responses API (not the Chat Completions
API), so the LangChain OpenAI model must be created as follows to stream
reasoning summaries:

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="o4-mini",  # o1, o3, and o3-mini also support reasoning summaries
    use_responses_api=True,  # summaries stream only via the Responses API
    model_kwargs={
        "reasoning": {
            "effort": "high",  # "low" and "medium" are also supported
            "summary": "auto",  # some models support "concise", some "detailed"; "auto" always works
        }
    },
)
```

Now, if you stream events from llm:

```python
async for event in llm.astream_events(prompt, version="v2"):
    print(event)
```

or

```python
for chunk in llm.stream(prompt):
    print(chunk)
```

The OpenAI API will send new types of events:
- `response.reasoning_summary_text.added`
- `response.reasoning_summary_text.delta`
- `response.reasoning_summary_text.done`

These events are new, so they were previously ignored. This PR adds
support for them in `_convert_responses_chunk_to_generation_chunk`, so
that reasoning summary chunks, or the full reasoning summary, are added
to the chunk's `additional_kwargs`.

Example of how this reasoning summary may be printed:

```python
from langchain_core.messages import AIMessageChunk

async for event in llm.astream_events(prompt, version="v2"):
    if event["event"] == "on_chat_model_stream":
        chunk: AIMessageChunk = event["data"]["chunk"]
        if "reasoning_summary_chunk" in chunk.additional_kwargs:
            print(chunk.additional_kwargs["reasoning_summary_chunk"], end="")
        elif "reasoning_summary" in chunk.additional_kwargs:
            print("\n\nFull reasoning step summary:", chunk.additional_kwargs["reasoning_summary"])
        elif chunk.content and chunk.content[0]["type"] == "text":
            print(chunk.content[0]["text"], end="")
```

or

```python
for chunk in llm.stream(prompt):
    if "reasoning_summary_chunk" in chunk.additional_kwargs:
        print(chunk.additional_kwargs["reasoning_summary_chunk"], end="")
    elif "reasoning_summary" in chunk.additional_kwargs:
        print("\n\nFull reasoning step summary:", chunk.additional_kwargs["reasoning_summary"])
    elif chunk.content and chunk.content[0]["type"] == "text":
        print(chunk.content[0]["text"], end="")
```

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-04-22 14:51:13 +00:00
Aubrey Ford
b344f34635
partners/openai: OpenAIEmbeddings not respecting chunk_size argument (#30757)
When calling `embed_documents` and providing a `chunk_size` argument,
that argument is ignored when `OpenAIEmbeddings` is instantiated with
its default configuration (where `check_embedding_ctx_length=True`).

`_get_len_safe_embeddings` specifies a `chunk_size` parameter but it's
not being passed through in `embed_documents`, which is its only caller.
This appears to be an oversight, especially given that the
`_get_len_safe_embeddings` docstring states it should respect "the set
embedding context length and chunk size."
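
The pass-through can be sketched with a stand-in class. `SketchEmbeddings`
and its `_batches` method are hypothetical and only count batch sizes;
they are not the real `OpenAIEmbeddings` internals.

```python
from typing import Optional


class SketchEmbeddings:
    """Hypothetical stand-in; not the real OpenAIEmbeddings."""

    def __init__(self, chunk_size: int = 1000):
        self.chunk_size = chunk_size

    def _batches(self, texts: list, chunk_size: Optional[int] = None) -> list:
        # An explicit chunk_size takes precedence over the instance default.
        size = chunk_size or self.chunk_size
        return [texts[i:i + size] for i in range(0, len(texts), size)]

    def embed_documents(self, texts: list, chunk_size: Optional[int] = None) -> list:
        # Forward the caller's chunk_size rather than silently dropping it.
        return [len(batch) for batch in self._batches(texts, chunk_size=chunk_size)]


texts = ["a", "b", "c", "d", "e"]
explicit = SketchEmbeddings().embed_documents(texts, chunk_size=2)
default = SketchEmbeddings().embed_documents(texts)
```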

Developers typically expect method parameters to take effect (also, take
precedence) when explicitly provided, especially when instantiating
using defaults. I was confused as to why my API calls were being
rejected regardless of the chunk size I provided.

This bug also exists in the `langchain_community` package. I can add
that fix to this PR if requested; otherwise I will create a new one once
this passes.
2025-04-18 15:27:27 -04:00
ccurme
add6a78f98
standard-tests, openai[patch]: add support standard audio inputs (#30904) 2025-04-17 10:30:57 -04:00