Some providers include (legacy) function calls in `additional_kwargs` in
addition to tool calls. We currently unpack both function calls and tool
calls if present, but OpenAI will raise 400 in this case.
This can come up if providers are mixed in a tool-calling loop. Example:
```python
from langchain.chat_models import init_chat_model
from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
@tool
def get_weather(location: str) -> str:
"""Get weather at a location."""
return "It's sunny."
gemini = init_chat_model("google_genai:gemini-2.0-flash-001").bind_tools([get_weather])
openai = init_chat_model("openai:gpt-4.1-mini").bind_tools([get_weather])
input_message = HumanMessage("What's the weather in Boston?")
tool_call_message = gemini.invoke([input_message])
assert len(tool_call_message.tool_calls) == 1
tool_call = tool_call_message.tool_calls[0]
tool_message = get_weather.invoke(tool_call)
response = openai.invoke( # currently raises 400 / BadRequestError
[input_message, tool_call_message, tool_message]
)
```
Here we ignore function calls if tool calls are present.
**Description**: The 'inspect' package in python skips over the aliases
set in the schema of a pydantic model. This is a workound to include the
aliases from the original input.
**issue**: #31035
Cc: @ccurme @eyurtsev
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- **Description:** Integrated the Bright Data package to enable
Langchain users to seamlessly incorporate Bright Data into their agents.
- **Dependencies:** None
- **LinkedIn handle**:[Bright
Data](https://www.linkedin.com/company/bright-data)
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
**Description:**
Fix the merge logic in `CharacterTextSplitter.split_text` so that when
using a regex lookahead separator (`is_separator_regex=True`) with
`keep_separator=False`, the raw pattern is not re-inserted between
chunks.
**Issue:**
Fixes#31136
**Dependencies:**
None
**Twitter handle:**
None
Since this is my first open-source PR, please feel free to point out any
mistakes, and I'll be eager to make corrections.
Anthropic updated how they report token counts during streaming today.
See changes to `MessageDeltaUsage` in [this
commit](2da00f26c5 (diff-1a396eba0cd9cd8952dcdb58049d3b13f6b7768ead1411888d66e28211f7bfc5)).
It's clean and simple to grab these fields from the final
`message_delta` event. However, some of them are typed as Optional, and
language
[here](e42451ab3f/src/anthropic/lib/streaming/_messages.py (L462))
suggests they may not always be present. So here we take the required
field from the `message_delta` event as we were doing previously, and
ignore the rest.
partners: (langchain-openai) total_tokens should not add 'Nonetype' t…
# PR Description
## Description
Fixed an issue in `langchain-openai` where `total_tokens` was
incorrectly adding `None` to an integer, causing a TypeError. The fix
ensures proper type checking before adding token counts.
## Issue
Fixes the TypeError traceback shown in the image where `'NoneType'`
cannot be added to an integer.
## Dependencies
None
## Twitter handle
None

Co-authored-by: qiulijie <qiulijie@yuaiweiwu.com>
**Library Repo Path Update **: "langchain-cloudflare"
We recently changed our `langchain-cloudflare` repo to allow for future
libraries.
Created a `libs` folder to hold `langchain-cloudflare` python package.
https://github.com/cloudflare/langchain-cloudflare/tree/main/libs/langchain-cloudflare
On `langchain`, updating `packages.yaml` to point to new
`libs/langchain-cloudflare` library folder.
This PR brings several improvements and modernizations to the
documentation around the Astra DB partner package.
- language alignment for better matching with the terms used in the
Astra DB docs
- updated several links to pages on said documentation
- for the `AstraDBVectorStore`, added mentions of the new features in
the overall `astra.mdx`
- for the vector store, rewritten/upgraded most of the usage example
notebook for a more straightforward experience able to highlight the
main usage patterns (including new ones such as the newly-introduced
"autodetect feature")
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
**What does this PR do?**
This PR replaces deprecated usages of ```.dict()``` with
```.model_dump()``` to ensure compatibility with Pydantic v2 and prepare
for v3, addressing the deprecation warning
```PydanticDeprecatedSince20``` as required in [Issue#
31103](https://github.com/langchain-ai/langchain/issues/31103).
**Changes made:**
* Replaced ```.dict()``` with ```.model_dump()``` in multiple locations
* Ensured consistency with Pydantic v2 migration guidelines
* Verified compatibility across affected modules
**Notes**
* This is a code maintenance and compatibility update
* Tested locally with Pydantic v2.11
* No functional logic changes; only internal method replacements to
prevent deprecation issues
When aggregating AIMessageChunks in a stream, core prefers the leftmost
non-null ID. This is problematic because:
- Core assigns IDs when they are null to `f"run-{run_manager.run_id}"`
- The desired meaningful ID might not be available until midway through
the stream, as is the case for the OpenAI Responses API.
For the OpenAI Responses API, we assign message IDs to the top-level
`AIMessage.id`. This works in `.(a)invoke`, but during `.(a)stream` the
IDs get overwritten by the defaults assigned in langchain-core. These
IDs
[must](https://community.openai.com/t/how-to-solve-badrequesterror-400-item-rs-of-type-reasoning-was-provided-without-its-required-following-item-error-in-responses-api/1151686/9)
be available on the AIMessage object to support passing reasoning items
back to the API (e.g., if not using OpenAI's `previous_response_id`
feature). We could add them elsewhere, but seeing as we've already made
the decision to store them in `.id` during `.(a)invoke`, addressing the
issue in core lets us fix the problem with no interface changes.
- **Description:** `ChatAnthropic.get_num_tokens_from_messages` does not
currently receive `kwargs` and pass those on to
`self._client.beta.messages.count_tokens`. This is a problem if you need
to pass specific options to `count_tokens`, such as the `thinking`
option. This PR fixes that.
- **Issue:** N/A
- **Dependencies:** None
- **Twitter handle:** @bengladwell
Co-authored-by: ccurme <chester.curme@gmail.com>
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
**Description**:
* Starting to put together some PR's to fix the typing around
`langchain-chroma` `filter` and `where_document` query filtering, as
mentioned:
https://github.com/langchain-ai/langchain/issues/30879https://github.com/langchain-ai/langchain/issues/30507
The typing of `dict[str, str]` is on the one hand too restrictive (marks
valid filter expressions as ill-typed) and also too permissive (allows
illegal filter expressions). That's not what this PR addresses though.
This PR just removes from the documentation some examples of filters
that are illegal, and also syntactically incorrect: (a) dictionaries
with keys like `$contains` but the key is missing quotation marks; (b)
dictionaries with multiple entries - this is illegal in Chroma filter
syntax and will raise an exception. (`{"foo": "bar", "qux": "baz"}`).
Filter dictionaries in Chroma must have one and one key only. Again this
is just the documentation issue, which is the lowest hanging fruit. I
also think we need to update the types for `filter` and `where_document`
to be (at the very least `dict[str, Any]`), or, since we have access to
Chroma's types, they should be `Where` and `WhereDocument` types. This
has a wider blast radius though, so I'm starting small.
This PR does not fix the issues mentioned above, it's just starting to
get the ball rolling, and cleaning up the documentation.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Really Him <hesereallyhim@proton.me>
This PR includes the following documentation fixes for the SAP HANA
Cloud vector store integration:
- Removed stale output from the `%pip install` code cell.
- Replaced an unrelated vectorstore documentation link on the provider
overview page.
- Renamed the provider from "SAP HANA" to "SAP HANA Cloud"
Hi there, I'm Célina from 🤗,
This PR introduces support for Hugging Face's serverless Inference
Providers (documentation
[here](https://huggingface.co/docs/inference-providers/index)), allowing
users to specify different providers for chat completion and text
generation tasks.
This PR also removes the usage of `InferenceClient.post()` method in
`HuggingFaceEndpoint`, in favor of the task-specific `text_generation`
method. `InferenceClient.post()` is deprecated and will be removed in
`huggingface_hub v0.31.0`.
---
## Changes made
- bumped the minimum required version of the `huggingface-hub` package
to ensure compatibility with the latest API usage.
- added a `provider` field to `HuggingFaceEndpoint`, enabling users to
select the inference provider (e.g., 'cerebras', 'together',
'fireworks-ai'). Defaults to `hf-inference` (HF Inference API).
- replaced the deprecated `InferenceClient.post()` call in
`HuggingFaceEndpoint` with the task-specific `text_generation` method
for future-proofing, `post()` will be removed in huggingface-hub
v0.31.0.
- updated the `ChatHuggingFace` component:
- added async and streaming support.
- added support for tool calling.
- exposed underlying chat completion parameters for more granular
control.
- Added integration tests for `ChatHuggingFace` and updated the
corresponding unit tests.
✅ All changes are backward compatible.
---------
Co-authored-by: ccurme <chester.curme@gmail.com>