Release core 0.3.63
Small update just to expand the list of well-known tools. This is
necessary while the logic lives in langchain-core.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Add the image generation tool to the list of well-known tools. This is needed for changes in the ChatOpenAI client.
TODO: some of this logic should be moved from core directly into the client, as changes in core should not be required to add a new tool to the OpenAI chat client.
* It is possible to chain a `Runnable` with an `AsyncIterator`, as seen
in `test_runnable.py` (sketched below).
* `Iterator` and `AsyncIterator` input/output of callables must be placed
before `Callable[[Other], Any]`, otherwise the pattern matching picks the
latter.
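For illustration, a minimal sketch of this kind of composition (simplified from the cases in `test_runnable.py`; the `double` transform is hypothetical):

```python
from typing import AsyncIterator

from langchain_core.runnables import RunnableLambda

# An async generator that takes an AsyncIterator is coerced into a
# streaming transform step when composed with a Runnable.
async def double(values: AsyncIterator[int]) -> AsyncIterator[int]:
    async for value in values:
        yield value * 2

chain = RunnableLambda(lambda x: x + 1) | double
# `await chain.ainvoke(1)` -> 4; `chain.astream(1)` yields 4.
```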
**Issue:** [#30970](https://github.com/langchain-ai/langchain/issues/30970)
**Cause:**
An argument type in Python code
```
arg: Union[SubSchema1, SubSchema2]
```
is translated to `anyOf` in **json schema**
```
"anyOf" : [{sub schema 1 ...}, {sub schema 1 ...}]
```
The value of `anyOf` is a list of sub schemas.
The bug occurs because the sub schemas inside the `anyOf` list were not
handled.
The issue arises in the `convert_to_openai_function` function,
specifically in `_recursive_set_additional_properties_false`, which
recursively adds `"additionalProperties": false` to a JSON schema, as
[required by OpenAI's strict function
calling](https://platform.openai.com/docs/guides/structured-outputs?api-mode=responses#additionalproperties-false-must-always-be-set-in-objects).
**Solution:**
This PR fixes the issue by iterating over each sub schema inside the
`anyOf` list.
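A minimal sketch of the fixed recursion (illustrative only, not the exact code in core):

```python
def _recursive_set_additional_properties_false(schema: dict) -> dict:
    if isinstance(schema, dict):
        # Only object schemas (those with "properties") take the flag.
        if "properties" in schema:
            schema["additionalProperties"] = False
            for sub_schema in schema["properties"].values():
                _recursive_set_additional_properties_false(sub_schema)
        if "items" in schema:
            _recursive_set_additional_properties_false(schema["items"])
        # The fix: also recurse into each sub schema listed under "anyOf".
        for sub_schema in schema.get("anyOf", []):
            _recursive_set_additional_properties_false(sub_schema)
    return schema
```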
A unit test is added.
**Twitter handle:** shengboma
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
The `aindex` function should check not only the `adelete` method but the
`delete` method too.
**PR title**: "core: fix async indexing issue with adelete/delete
checking"
**PR message**: Currently `langchain.indexes.aindex` checks whether the
vector store has overridden the `adelete` method. But because of
`adelete`'s default implementation, a store may have overridden only
`delete` and still have a working `adelete`.
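A sketch of the intended check, assuming the standard `VectorStore` base class (the helper name is hypothetical):

```python
from langchain_core.vectorstores import VectorStore

def _has_working_adelete(vector_store: VectorStore) -> bool:
    # adelete's default implementation delegates to delete in a thread,
    # so overriding either method is sufficient for async indexing.
    return (
        type(vector_store).adelete is not VectorStore.adelete
        or type(vector_store).delete is not VectorStore.delete
    )
```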
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
* Remove unnecessary cast of id -> str (can be done with a field setting)
* Remove unnecessary `set_text` model validator (can be done with a
computed field, though we had to make some changes to the `Generation`
class to make this possible); see the sketch below
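As a rough illustration of the computed-field pattern (field names are hypothetical, not the actual `Generation` definition):

```python
from pydantic import BaseModel, computed_field

class Generation(BaseModel):
    message: str  # hypothetical field name for illustration

    # Derived at access/serialization time; no custom validator runs
    # on every instantiation.
    @computed_field
    @property
    def text(self) -> str:
        return self.message
```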
Before: ~2.4s
Blue circles represent time spent in custom validators :(
<img width="1337" alt="Screenshot 2025-05-14 at 10 10 12 AM"
src="https://github.com/user-attachments/assets/bb4f477f-4ee3-4870-ae93-14ca7f197d55"
/>
After: ~2.2s
<img width="1344" alt="Screenshot 2025-05-14 at 10 11 03 AM"
src="https://github.com/user-attachments/assets/99f97d80-49de-462f-856f-9e7e8662adbc"
/>
We still want to optimize the backwards-compatible tool calls model
validator, though I think this might involve breaking changes, so I wanted
to separate that into a different PR. This is circled in green.
**Description:** Before this commit, if a delete batch contained more
than 32k rows (sqlite3 >= 3.32) or more than 999 rows (sqlite3 < 3.32),
`record_manager.delete_keys()` would fail because the generated query had
too many variables.
This commit ensures that we batch the delete operation using
`cleanup_batch_size`, as is already done for `full` cleanup.
Unit tests were added for incremental mode as well, covering different
delete batch sizes.
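A sketch of the batching (simplified; assumes the `cleanup_batch_size` parameter and the record manager / vector store interfaces of the indexing API):

```python
def _batched_delete(record_manager, vector_store, keys_to_delete, cleanup_batch_size):
    """Delete keys in batches so each SQL statement stays under the
    SQLite variable limit (999 before 3.32, 32766 from 3.32 on)."""
    for i in range(0, len(keys_to_delete), cleanup_batch_size):
        batch = keys_to_delete[i : i + cleanup_batch_size]
        record_manager.delete_keys(batch)
        vector_store.delete(ids=batch)
```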
1. Removes summation of `ChatGenerationChunk` from hot loops in `stream`
and `astream` (sketched below)
2. Removes run ID generation from the loop as well (minor impact)
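A simplified sketch of the first change (the `reduce` at the end stands in for core's chunk merge; not the actual `stream` internals):

```python
from functools import reduce
from operator import add

def _stream_and_merge(chunk_iterator):
    """Yield chunks as they arrive; merge once after the loop instead of
    maintaining a running `final += chunk` on every iteration."""
    chunks = []
    for chunk in chunk_iterator:
        chunks.append(chunk)  # no per-iteration merge in the hot loop
        yield chunk
    # Single merge pass at the end; an n-ary merge avoids building
    # ever-larger intermediate chunks. `final` is what would be handed
    # to the end-of-run callbacks.
    final = reduce(add, chunks) if chunks else None
```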
Again, benchmarking on processing ~200k chunks (a poem about broccoli).
Before: ~4.2s
Blue circle is all the time spent adding up gen chunks
<img width="1345" alt="Screenshot 2025-05-14 at 7 48 33 AM"
src="https://github.com/user-attachments/assets/08a59d78-134d-4cd3-9d54-214de689df51"
/>
After: ~2.3s
Blue circle is remaining time spent on adding chunks, which can be
minimized in a future PR by optimizing the `merge_content`,
`merge_dicts`, and `merge_lists` utilities.
<img width="1353" alt="Screenshot 2025-05-14 at 7 50 08 AM"
src="https://github.com/user-attachments/assets/df6b3506-929e-4b6d-b198-7c4e992c6d34"
/>
1. Remove `shielded` decorator from non-end event handlers
2. Exit early with a `self.handlers` check instead of doing unnecessary
asyncio work (sketched below)
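A sketch of point 2 (simplified from the async callback manager; `_ahandle_event` stands in for the real event dispatch):

```python
async def on_llm_new_token(self, token: str, **kwargs) -> None:
    if not self.handlers:
        return  # early exit: nothing to notify, skip asyncio scheduling
    # No `shielded` wrapper here: cancellation protection only matters
    # for end-of-run handlers, and the shield itself costs asyncio work.
    await _ahandle_event(self.handlers, "on_llm_new_token", token, **kwargs)
```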
Using a benchmark that processes ~200k chunks (a poem about broccoli).
Before: ~15s
Circled in blue is unnecessary event handling time. This is addressed by
point 2 above.
<img width="1347" alt="Screenshot 2025-05-14 at 7 37 53 AM"
src="https://github.com/user-attachments/assets/675e0fed-8f37-46c0-90b3-bef3cb9a1e86"
/>
After: ~4.2s
The total time is largely reduced by the removal of the `shielded`
decorator, which holds little significance for non-end handlers.
<img width="1348" alt="Screenshot 2025-05-14 at 7 37 22 AM"
src="https://github.com/user-attachments/assets/54be8a3e-5827-4136-a87b-54b0d40fe331"
/>
**Description**: Python's `inspect` module skips the aliases set in the
schema of a pydantic model. This is a workaround to include the aliases
from the original input.
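A small illustration of the behavior (hypothetical model; the exact signature output depends on the pydantic version and model config):

```python
import inspect
from pydantic import BaseModel, Field

class Query(BaseModel):
    from_: str = Field(alias="from")  # "from" is a keyword, hence the alias

# The generated signature can expose the Python field name rather than the
# alias, so downstream schema generation built on inspect loses "from":
print(inspect.signature(Query))  # e.g. (*, from_: str) -> None
```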
**Issue**: #31035
Cc: @ccurme @eyurtsev
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
When aggregating AIMessageChunks in a stream, core prefers the leftmost
non-null ID. This is problematic because:
- When an ID is null, core assigns it `f"run-{run_manager.run_id}"`
- The desired meaningful ID might not be available until midway through
the stream, as is the case for the OpenAI Responses API.
For the OpenAI Responses API, we assign message IDs to the top-level
`AIMessage.id`. This works in `.(a)invoke`, but during `.(a)stream` the
IDs get overwritten by the defaults assigned in langchain-core. These
IDs
[must](https://community.openai.com/t/how-to-solve-badrequesterror-400-item-rs-of-type-reasoning-was-provided-without-its-required-following-item-error-in-responses-api/1151686/9)
be available on the AIMessage object to support passing reasoning items
back to the API (e.g., if not using OpenAI's `previous_response_id`
feature). We could add them elsewhere, but seeing as we've already made
the decision to store them in `.id` during `.(a)invoke`, addressing the
issue in core lets us fix the problem with no interface changes.
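A sketch of the adjusted preference when merging chunk IDs (simplified; `"run-"` is the core-assigned default prefix mentioned above):

```python
def _merge_ids(left_id: str | None, right_id: str | None) -> str | None:
    # Prefer a provider-assigned ID over the "run-..." default, even if
    # the provider ID only shows up midway through the stream.
    for candidate in (left_id, right_id):
        if candidate and not candidate.startswith("run-"):
            return candidate
    # Otherwise keep the leftmost non-null ID, as before.
    return left_id or right_id
```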
Follow up to https://github.com/langchain-ai/langsmith-sdk/pull/1696,
I've bumped the `langsmith` version where applicable in `uv.lock`.
There are type-checking problems here because deps were updated in
`pyproject.toml` without running `uv lock`; we should enforce that in the
future (it goes with the other dependabot todos :)).
Chat models currently implement support for:
- images in OpenAI Chat Completions format
- other multimodal types (e.g., PDF and audio) in a cross-provider
[standard
format](https://python.langchain.com/docs/how_to/multimodal_inputs/)
Here we update core to extend support to PDF and audio input in Chat
Completions format. **If an OAI-format PDF or audio content block is
passed into any chat model, it will be transformed to the LangChain
standard format**. We assume that any chat model supporting OAI-format
PDF or audio has implemented support for the standard format.
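For reference, a rough sketch of the two block shapes involved (values are placeholders; the exact conversion helper lives in core):

```python
# OpenAI Chat Completions format, as passed in by a user:
oai_pdf_block = {
    "type": "file",
    "file": {
        "filename": "report.pdf",
        "file_data": "data:application/pdf;base64,<base64 data>",
    },
}

# Equivalent LangChain standard format the block is converted to:
standard_pdf_block = {
    "type": "file",
    "source_type": "base64",
    "mime_type": "application/pdf",
    "data": "<base64 data>",
}
```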
Addresses #30158
When using the output parser, either in a chain or standalone, hitting
max_tokens triggers a misleading "missing variable" error instead of
indicating the output was truncated. This subtle bug often surfaces with
Anthropic models.
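A sketch of the kind of guard this implies (metadata keys vary by provider; `stop_reason == "max_tokens"` is Anthropic's signal, `finish_reason == "length"` is OpenAI's):

```python
from langchain_core.exceptions import OutputParserException

def _raise_if_truncated(message) -> None:
    # Check the provider-reported stop condition before parsing, so the
    # user sees a truncation error instead of a missing-variable error.
    meta = getattr(message, "response_metadata", {}) or {}
    if meta.get("stop_reason") == "max_tokens" or meta.get("finish_reason") == "length":
        raise OutputParserException(
            "Model output was truncated because max_tokens was reached; "
            "the parsed structure is incomplete."
        )
```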
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
We only need to rebuild model schemas if type annotation information
isn't available during declaration; that shouldn't be the case for the
types corrected here.
We still need to do more thorough testing to make sure these structures
have complete schemas, but hopefully this boosts startup / import time.
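A minimal illustration of the principle with pydantic v2:

```python
from pydantic import BaseModel

class Node(BaseModel):
    # The self-referencing annotation is resolvable when the class body
    # finishes executing, so no later Node.model_rebuild() is needed.
    children: list["Node"] = []

# model_rebuild() is only required when a forward reference points at a
# name that isn't defined until after the class declaration.
```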