docs(openai): add comprehensive documentation and examples for extra_body + others (#32149)

This PR addresses the common issue where users struggle to pass custom
parameters to OpenAI-compatible APIs like LM Studio, vLLM, and others.
The problem occurs when users try to use `model_kwargs` for custom
parameters, which causes API errors.

## Problem

Users attempting to pass custom parameters (like LM Studio's `ttl`
parameter) were getting errors:

```python
# ❌ This approach fails
llm = ChatOpenAI(
    base_url="http://localhost:1234/v1",
    model="mlx-community/QwQ-32B-4bit",
    model_kwargs={"ttl": 5}  # Causes TypeError: unexpected keyword argument 'ttl'
)
```

## Solution

The `extra_body` parameter is the correct way to pass custom parameters
to OpenAI-compatible APIs:

```python
# ✅ This approach works correctly
llm = ChatOpenAI(
    base_url="http://localhost:1234/v1",
    model="mlx-community/QwQ-32B-4bit",
    extra_body={"ttl": 5}  # Custom parameters go in extra_body
)
```

## Changes Made

1. **Enhanced Documentation**: Updated the `extra_body` parameter
docstring with comprehensive examples for LM Studio, vLLM, and other
providers

2. **Added Documentation Section**: Created a new "OpenAI-compatible
APIs" section in the main class docstring with practical examples

3. **Unit Tests**: Added tests to verify `extra_body` functionality
works correctly:
- `test_extra_body_parameter()`: Verifies custom parameters are included
in request payload
- `test_extra_body_with_model_kwargs()`: Ensures `extra_body` and
`model_kwargs` work together

4. **Clear Guidance**: Documented when to use `extra_body` vs
`model_kwargs`
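The shape of those tests can be sketched with a stand-in payload builder. The helper name and payload assembly below are illustrative only, not the actual `langchain-openai` internals:

```python
# Hypothetical sketch of the added tests' shape. `get_request_payload` is a
# stand-in for the chat model's internal payload assembly, not the real code.
def get_request_payload(extra_body=None, model_kwargs=None):
    payload = {"model": "test-model"}
    payload.update(model_kwargs or {})  # model_kwargs flatten into the payload
    if extra_body is not None:
        payload["extra_body"] = extra_body  # extra_body stays grouped for the client
    return payload


def test_extra_body_parameter():
    # Custom parameters are included in the request payload
    payload = get_request_payload(extra_body={"ttl": 300})
    assert payload["extra_body"] == {"ttl": 300}


def test_extra_body_with_model_kwargs():
    # extra_body and model_kwargs coexist without clobbering each other
    payload = get_request_payload(
        extra_body={"ttl": 300},
        model_kwargs={"stream_options": {"include_usage": True}},
    )
    assert payload["extra_body"]["ttl"] == 300
    assert payload["stream_options"] == {"include_usage": True}


test_extra_body_parameter()
test_extra_body_with_model_kwargs()
```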

## Examples Added

**LM Studio with TTL (auto-eviction):**
```python
ChatOpenAI(
    base_url="http://localhost:1234/v1",
    api_key="lm-studio",
    model="mlx-community/QwQ-32B-4bit",
    extra_body={"ttl": 300}  # Auto-evict after 5 minutes
)
```

**vLLM with custom sampling:**
```python
ChatOpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",
    model="meta-llama/Llama-2-7b-chat-hf",
    extra_body={
        "use_beam_search": True,
        "best_of": 4
    }
)
```

## Why This Works

- `model_kwargs` parameters are passed directly to the OpenAI client's
`create()` method, causing errors for non-standard parameters
- `extra_body` parameters are included in the HTTP request body, which
is exactly what OpenAI-compatible APIs expect for custom parameters
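A toy model of this mechanism (a stand-in mimicking the client's keyword handling, not the real `openai` client):

```python
# Stand-in for client.chat.completions.create(): only known parameters are
# accepted as keywords; anything custom must ride along in extra_body, whose
# contents are merged into the JSON request body.
def create(model, messages, temperature=None, extra_body=None):
    body = {"model": model, "messages": messages}
    if temperature is not None:
        body["temperature"] = temperature
    if extra_body:
        body.update(extra_body)  # custom params land in the HTTP body
    return body


# Unknown keyword -> TypeError, mirroring the model_kwargs failure mode
try:
    create(model="m", messages=[], ttl=5)
except TypeError as exc:
    print(f"rejected: {exc}")

# extra_body carries the custom parameter into the request body
payload = create(model="m", messages=[], extra_body={"ttl": 5})
assert payload["ttl"] == 5
```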

Fixes #32115.


---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Commit 54542b9385 (parent 7d2a13f519), authored by Copilot on 2025-07-24 16:43:16 -04:00, committed by GitHub. 17 changed files with 237 additions and 119 deletions.


@@ -553,7 +553,22 @@ class BaseChatOpenAI(BaseChatModel):
"""Default stop sequences."""
extra_body: Optional[Mapping[str, Any]] = None
"""Optional additional JSON properties to include in the request parameters when
-making requests to OpenAI compatible APIs, such as vLLM."""
+making requests to OpenAI compatible APIs, such as vLLM, LM Studio, or other
+providers.
+This is the recommended way to pass custom parameters that are specific to your
+OpenAI-compatible API provider but not part of the standard OpenAI API.
+Examples:
+- LM Studio TTL parameter: ``extra_body={"ttl": 300}``
+- vLLM custom parameters: ``extra_body={"use_beam_search": True}``
+- Any other provider-specific parameters
+.. note::
+    Do NOT use ``model_kwargs`` for custom parameters that are not part of the
+    standard OpenAI API, as this will cause errors when making API calls. Use
+    ``extra_body`` instead.
+"""
include_response_headers: bool = False
"""Whether to include response headers in the output message response_metadata."""
disabled_params: Optional[dict[str, Any]] = Field(default=None)
@@ -579,11 +594,11 @@ class BaseChatOpenAI(BaseChatModel):
Supported values:
-- ``"file_search_call.results"``
-- ``"message.input_image.image_url"``
-- ``"computer_call_output.output.image_url"``
-- ``"reasoning.encrypted_content"``
-- ``"code_interpreter_call.outputs"``
+- ``'file_search_call.results'``
+- ``'message.input_image.image_url'``
+- ``'computer_call_output.output.image_url'``
+- ``'reasoning.encrypted_content'``
+- ``'code_interpreter_call.outputs'``
.. versionadded:: 0.3.24
"""
@@ -658,8 +673,8 @@ class BaseChatOpenAI(BaseChatModel):
Supported values:
-- ``"v0"``: AIMessage format as of langchain-openai 0.3.x.
-- ``"responses/v1"``: Formats Responses API output
+- ``'v0'``: AIMessage format as of langchain-openai 0.3.x.
+- ``'responses/v1'``: Formats Responses API output
items into AIMessage content blocks.
Currently only impacts the Responses API. ``output_version="responses/v1"`` is
@@ -1560,8 +1575,9 @@ class BaseChatOpenAI(BaseChatModel):
Assumes model is compatible with OpenAI function-calling API.
-NOTE: Using bind_tools is recommended instead, as the `functions` and
-`function_call` request parameters are officially marked as deprecated by
+.. note::
+    Using ``bind_tools()`` is recommended instead, as the ``functions`` and
+    ``function_call`` request parameters are officially marked as deprecated by
OpenAI.
Args:
@@ -1622,10 +1638,10 @@ class BaseChatOpenAI(BaseChatModel):
:meth:`langchain_core.utils.function_calling.convert_to_openai_tool`.
tool_choice: Which tool to require the model to call. Options are:
-- str of the form ``"<<tool_name>>"``: calls <<tool_name>> tool.
-- ``"auto"``: automatically selects a tool (including no tool).
-- ``"none"``: does not call a tool.
-- ``"any"`` or ``"required"`` or ``True``: force at least one tool to be called.
+- str of the form ``'<<tool_name>>'``: calls <<tool_name>> tool.
+- ``'auto'``: automatically selects a tool (including no tool).
+- ``'none'``: does not call a tool.
+- ``'any'`` or ``'required'`` or ``True``: force at least one tool to be called.
- dict of the form ``{"type": "function", "function": {"name": <<tool_name>>}}``: calls <<tool_name>> tool.
- ``False`` or ``None``: no effect, default OpenAI behavior.
strict: If True, model output is guaranteed to exactly match the JSON Schema
@@ -1760,12 +1776,12 @@ class BaseChatOpenAI(BaseChatModel):
tools:
A list of tool-like objects to bind to the chat model. Requires that:
-- ``method`` is ``"json_schema"`` (default).
+- ``method`` is ``'json_schema'`` (default).
- ``strict=True``
- ``include_raw=True``
If a model elects to call a
-tool, the resulting ``AIMessage`` in ``"raw"`` will include tool calls.
+tool, the resulting ``AIMessage`` in ``'raw'`` will include tool calls.
.. dropdown:: Example
@@ -2628,6 +2644,91 @@ class ChatOpenAI(BaseChatOpenAI): # type: ignore[override]
See OpenAI `docs <https://platform.openai.com/docs/guides/flex-processing>`_
for more detail.
+.. dropdown:: OpenAI-compatible APIs
+``ChatOpenAI`` can be used with OpenAI-compatible APIs like LM Studio, vLLM,
+Ollama, and others. To use custom parameters specific to these providers,
+use the ``extra_body`` parameter.
+**LM Studio example** with TTL (auto-eviction):
+.. code-block:: python
+from langchain_openai import ChatOpenAI
+llm = ChatOpenAI(
+base_url="http://localhost:1234/v1",
+api_key="lm-studio",  # Can be any string
+model="mlx-community/QwQ-32B-4bit",
+temperature=0,
+extra_body={
+"ttl": 300
+},  # Auto-evict model after 5 minutes of inactivity
+)
+**vLLM example** with custom parameters:
+.. code-block:: python
+llm = ChatOpenAI(
+base_url="http://localhost:8000/v1",
+api_key="EMPTY",
+model="meta-llama/Llama-2-7b-chat-hf",
+extra_body={"use_beam_search": True, "best_of": 4},
+)
+.. dropdown:: model_kwargs vs extra_body
+Use the correct parameter for different types of API arguments:
+**Use `model_kwargs` for:**
+- Standard OpenAI API parameters not explicitly defined as class parameters
+- Parameters that should be flattened into the top-level request payload
+- Examples: ``max_completion_tokens``, ``stream_options``, ``modalities``, ``audio``
+.. code-block:: python
+# Standard OpenAI parameters
+llm = ChatOpenAI(
+model="gpt-4o",
+model_kwargs={
+"stream_options": {"include_usage": True},
+"max_completion_tokens": 300,
+"modalities": ["text", "audio"],
+"audio": {"voice": "alloy", "format": "wav"},
+},
+)
+**Use `extra_body` for:**
+- Custom parameters specific to OpenAI-compatible providers (vLLM, LM Studio, etc.)
+- Parameters that need to be nested under ``extra_body`` in the request
+- Any non-standard OpenAI API parameters
+.. code-block:: python
+# Custom provider parameters
+llm = ChatOpenAI(
+base_url="http://localhost:8000/v1",
+model="custom-model",
+extra_body={
+"use_beam_search": True,  # vLLM parameter
+"best_of": 4,  # vLLM parameter
+"ttl": 300,  # LM Studio parameter
+},
+)
+**Key Differences:**
+- ``model_kwargs``: Parameters are **merged into top-level** request payload
+- ``extra_body``: Parameters are **nested under ``extra_body``** key in request
+.. important::
+    Always use ``extra_body`` for custom parameters, **not** ``model_kwargs``.
+    Using ``model_kwargs`` for non-OpenAI parameters will cause API errors.
""" # noqa: E501
max_tokens: Optional[int] = Field(default=None, alias="max_completion_tokens")
@@ -2780,17 +2881,17 @@ class ChatOpenAI(BaseChatOpenAI): # type: ignore[override]
If schema is specified via TypedDict or JSON schema, ``strict`` is not
enabled by default. Pass ``strict=True`` to enable it.
-Note: ``strict`` can only be non-null if ``method`` is
-``"json_schema"`` or ``"function_calling"``.
+.. note::
+    ``strict`` can only be non-null if ``method`` is ``'json_schema'`` or ``'function_calling'``.
tools:
A list of tool-like objects to bind to the chat model. Requires that:
-- ``method`` is ``"json_schema"`` (default).
+- ``method`` is ``'json_schema'`` (default).
- ``strict=True``
- ``include_raw=True``
If a model elects to call a
-tool, the resulting ``AIMessage`` in ``"raw"`` will include tool calls.
+tool, the resulting ``AIMessage`` in ``'raw'`` will include tool calls.
.. dropdown:: Example