Following https://github.com/langchain-ai/langchain/pull/30909: need to
retain "empty" reasoning output when streaming, e.g.,
```python
{'id': 'rs_...', 'summary': [], 'type': 'reasoning'}
```
Tested by existing integration tests, which are currently failing.
Chat models currently implement support for:
- images in OpenAI Chat Completions format
- other multimodal types (e.g., PDF and audio) in a cross-provider
[standard
format](https://python.langchain.com/docs/how_to/multimodal_inputs/)
Here we update core to extend support to PDF and audio input in Chat
Completions format. **If an OAI-format PDF or audio content block is
passed into any chat model, it will be transformed to the LangChain
standard format**. We assume that any chat model supporting OAI-format
PDF or audio has implemented support for the standard format.
**langchain_openai: Support of reasoning summary streaming**
**Description:**
OpenAI API now supports streaming reasoning summaries for reasoning
models (o1, o3, o3-mini, o4-mini). More info about it:
https://platform.openai.com/docs/guides/reasoning#reasoning-summaries
It is supported only in Responses API (not Completion API), so you need
to create LangChain Open AI model as follows to support reasoning
summaries streaming:
```
llm = ChatOpenAI(
model="o4-mini", # also o1, o3, o3-mini support reasoning streaming
use_responses_api=True, # reasoning streaming works only with responses api, not completion api
model_kwargs={
"reasoning": {
"effort": "high", # also "low" and "medium" supported
"summary": "auto" # some models support "concise" summary, some "detailed", but auto will always work
}
}
)
```
Now, if you stream events from llm:
```
async for event in llm.astream_events(prompt, version="v2"):
print(event)
```
or
```
for chunk in llm.stream(prompt):
print (chunk)
```
OpenAI API will send you new types of events:
`response.reasoning_summary_text.added`
`response.reasoning_summary_text.delta`
`response.reasoning_summary_text.done`
These events are new, so they were ignored. So I have added support of
these events in function `_convert_responses_chunk_to_generation_chunk`,
so reasoning chunks or full reasoning added to the chunk
additional_kwargs.
Example of how this reasoning summary may be printed:
```
async for event in llm.astream_events(prompt, version="v2"):
if event["event"] == "on_chat_model_stream":
chunk: AIMessageChunk = event["data"]["chunk"]
if "reasoning_summary_chunk" in chunk.additional_kwargs:
print(chunk.additional_kwargs["reasoning_summary_chunk"], end="")
elif "reasoning_summary" in chunk.additional_kwargs:
print("\n\nFull reasoning step summary:", chunk.additional_kwargs["reasoning_summary"])
elif chunk.content and chunk.content[0]["type"] == "text":
print(chunk.content[0]["text"], end="")
```
or
```
for chunk in llm.stream(prompt):
if "reasoning_summary_chunk" in chunk.additional_kwargs:
print(chunk.additional_kwargs["reasoning_summary_chunk"], end="")
elif "reasoning_summary" in chunk.additional_kwargs:
print("\n\nFull reasoning step summary:", chunk.additional_kwargs["reasoning_summary"])
elif chunk.content and chunk.content[0]["type"] == "text":
print(chunk.content[0]["text"], end="")
```
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
When calling `embed_documents` and providing a `chunk_size` argument,
that argument is ignored when `OpenAIEmbeddings` is instantiated with
its default configuration (where `check_embedding_ctx_length=True`).
`_get_len_safe_embeddings` specifies a `chunk_size` parameter but it's
not being passed through in `embed_documents`, which is its only caller.
This appears to be an oversight, especially given that the
`_get_len_safe_embeddings` docstring states it should respect "the set
embedding context length and chunk size."
Developers typically expect method parameters to take effect (also, take
precedence) when explicitly provided, especially when instantiating
using defaults. I was confused as to why my API calls were being
rejected regardless of the chunk size I provided.
This bug also exists in langchain_community package. I can add that to
this PR if requested otherwise I will create a new one once this passes.
Generally, this PR is CI performance focused + aims to clean up some
dependencies at the same time.
1. Unpins upper bounds for `numpy` in all `pyproject.toml` files where
`numpy` is specified
2. Requires `numpy >= 2.1.0` for Python 3.13 and `numpy > v1.26.0` for
Python 3.12, plus a `numpy` min version bump for `chroma`
3. Speeds up CI by minutes - linting on Python 3.13, installing `numpy <
2.1.0` was taking [~3
minutes](https://github.com/langchain-ai/langchain/actions/runs/14316342925/job/40123305868?pr=30713),
now the entire env setup takes a few seconds
4. Deleted the `numpy` test dependency from partners where that was not
used, specifically `huggingface`, `voyageai`, `xai`, and `nomic`.
It's a bit unfortunate that `langchain-community` depends on `numpy`, we
might want to try to fix that in the future...
Closes https://github.com/langchain-ai/langchain/issues/26026
Fixes https://github.com/langchain-ai/langchain/issues/30555
Related to https://github.com/langchain-ai/langchain/issues/30344https://github.com/langchain-ai/langchain/pull/30542 introduced an
erroneous test for token counts for o-series models. tiktoken==0.8 does
not support o-series models in
`tiktoken.encoding_for_model(model_name)`, and this is the version of
tiktoken we had in the lock file. So we would default to `cl100k_base`
for o-series, which is the wrong encoding model. The test tested against
this wrong encoding (so it passed with tiktoken 0.8).
Here we update tiktoken to 0.9 in the lock file, and fix the expected
counts in the test. Verified that we are pulling
[o200k_base](https://github.com/openai/tiktoken/blob/main/tiktoken/model.py#L8),
as expected.
When OpenAI originally released `stream_options` to enable token usage
during streaming, it was not supported in AzureOpenAI. It is now
supported.
Like the [OpenAI
SDK](f66d2e6fdc/src/openai/resources/completions.py (L68)),
ChatOpenAI does not return usage metadata during streaming by default
(which adds an extra chunk to the stream). The OpenAI SDK requires users
to pass `stream_options={"include_usage": True}`. ChatOpenAI implements
a convenience argument `stream_usage: Optional[bool]`, and an attribute
`stream_usage: bool = False`.
Here we extend this to AzureChatOpenAI by moving the `stream_usage`
attribute and `stream_usage` kwarg (on `_(a)stream`) from ChatOpenAI to
BaseChatOpenAI.
---
Additional consideration: we must be sensitive to the number of users
using BaseChatOpenAI to interact with other APIs that do not support the
`stream_options` parameter.
Suppose OpenAI in the future updates the default behavior to stream
token usage. Currently, BaseChatOpenAI only passes `stream_options` if
`stream_usage` is True, so there would be no way to disable this new
default behavior.
To address this, we could update the `stream_usage` attribute to
`Optional[bool] = None`, but this is technically a breaking change (as
currently values of False are not passed to the client). IMO: if / when
this change happens, we could accompany it with this update in a minor
bump.
---
Related previous PRs:
- https://github.com/langchain-ai/langchain/pull/22628
- https://github.com/langchain-ai/langchain/pull/22854
- https://github.com/langchain-ai/langchain/pull/23552
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
**Description:**
This PR fixes a minor typo in the comments within
`libs/partners/openai/langchain_openai/chat_models/base.py`. The word
"ben" has been corrected to "be" for clarity and professionalism.
**Issue:**
N/A
**Dependencies:**
None
- Support thinking blocks in core's `convert_to_openai_messages` (pass
through instead of error)
- Ignore thinking blocks in ChatOpenAI (instead of error)
- Support Anthropic-style image blocks in ChatOpenAI
---
Standard integration tests include a `supports_anthropic_inputs`
property which is currently enabled only for tests on `ChatAnthropic`.
This test enforces compatibility with message histories of the form:
```
- system message
- human message
- AI message with tool calls specified only through `tool_use` content blocks
- human message containing `tool_result` and an additional `text` block
```
It additionally checks support for Anthropic-style image inputs if
`supports_image_inputs` is enabled.
Here we change this test, such that if you enable
`supports_anthropic_inputs`:
- You support AI messages with text and `tool_use` content blocks
- You support Anthropic-style image inputs (if `supports_image_inputs`
is enabled)
- You support thinking content blocks.
That is, we add a test case for thinking content blocks, but we also
remove the requirement of handling tool results within HumanMessages
(motivated by existing agent abstractions, which should all return
ToolMessage). We move that requirement to a ChatAnthropic-specific test.
- **Description:** Fix typo in code samples for max_tokens_for_prompt.
Code blocks had singular "token" but the method has plural "tokens".
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** N/A
Resolves https://github.com/langchain-ai/langchain/issues/29003,
https://github.com/langchain-ai/langchain/issues/27264
Related: https://github.com/langchain-ai/langchain-redis/issues/52
```python
from langchain.chat_models import init_chat_model
from langchain.globals import set_llm_cache
from langchain_community.cache import SQLiteCache
from pydantic import BaseModel
cache = SQLiteCache()
set_llm_cache(cache)
class Temperature(BaseModel):
value: int
city: str
llm = init_chat_model("openai:gpt-4o-mini")
structured_llm = llm.with_structured_output(Temperature)
```
```python
# 681 ms
response = structured_llm.invoke("What is the average temperature of Rome in May?")
```
```python
# 6.98 ms
response = structured_llm.invoke("What is the average temperature of Rome in May?")
```
Some o-series models will raise a 400 error for `"role": "system"`
(`o1-mini` and `o1-preview` will raise, `o1` and `o3-mini` will not).
Here we update `ChatOpenAI` to update the role to `"developer"` for all
model names matching `^o\d`.
We only make this change on the ChatOpenAI class (not BaseChatOpenAI).
**Description:** Update docstring for `reasoning_effort` argument to
specify that it applies to reasoning models only (e.g., OpenAI o1 and
o3-mini), clarifying its supported models.
**Issue:** None
**Dependencies:** None