Follow up to https://github.com/langchain-ai/langsmith-sdk/pull/1696,
I've bumped the `langsmith` version where applicable in `uv.lock`.
Type checking problems here because deps have been updated in
`pyproject.toml` and `uv lock` hasn't been run - we should enforce that
in the future - goes with the other dependabot todos :).
## Description
Added support for retrieving column comments in the SQL Database
utility. This feature allows users to see comments associated with
database columns when querying table information. Column comments
provide valuable metadata that helps LLMs better understand the
semantics and purpose of database columns.
A new optional parameter `get_col_comments` was added to the
`get_table_info` method, defaulting to `False` for backward
compatibility. When set to `True`, it retrieves and formats column
comments for each table.
Currently, this feature is supported on PostgreSQL, MySQL, and Oracle
databases.
## Implementation
You should create Table with column comments before.
```python
db = SQLDatabase.from_uri("YOUR_DB_URI")
print(db.get_table_info(get_col_comments=True))
```
## Result
```
CREATE TABLE test_table (
name VARCHAR
school VARCHAR)
/*
Column Comments: {'name': person name, 'school":school_name}
*/
/*
3 rows from test_table:
name
a
b
c
*/
```
## Benefits
1. Enhances LLM's understanding of database schema semantics
2. Preserves valuable domain knowledge embedded in database design
3. Improves accuracy of SQL query generation
4. Provides more context for data interpretation
Tests are available in
`langchain/libs/community/tests/test_sql_get_table_info.py`.
---------
Co-authored-by: chbae <chbae@gcsc.co.kr>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
- **Description:**
This PR marks the `HanaDB` vector store (and related utilities) in
`langchain_community` as deprecated using the `@deprecated` annotation.
- Set `since="0.1.0"` and `removal="1.0"`
- Added a clear migration path and a link to the SAP-maintained
replacement in the
[`langchain_hana`](https://github.com/SAP/langchain-integration-for-sap-hana-cloud)
package.
Additionally, the example notebook has been updated to use the new
`HanaDB` class from `langchain_hana`, ensuring users follow the
recommended integration moving forward.
- **Issue:** None
- **Dependencies:** None
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
The `_chunk_size` has not changed by method `self._tokenize`, So i think
these is duplicate code.
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
This is a PR to return the message attachments in _get_response, as when
files are generated these attachments are not returned thus generated
files cannot be retrieved
Fixes issue: https://github.com/langchain-ai/langchain/issues/30851
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
community: fix browserbase integration
docs: update docs
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Updated BrowserbaseLoader to use the new python sdk.
- **Issue:** update browserbase integration with langchain
- **Dependencies:** n/a
- **Twitter handle:** @kylejeong21
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Following https://github.com/langchain-ai/langchain/pull/30909: need to
retain "empty" reasoning output when streaming, e.g.,
```python
{'id': 'rs_...', 'summary': [], 'type': 'reasoning'}
```
Tested by existing integration tests, which are currently failing.
- [x] **PR title**: "community: add indexname to other functions in
opensearch"
- [x] **PR message**:
- **Description:** add ability to over-ride index-name if provided in
the kwargs of sub-functions. When used in WSGI application it's crucial
to be able to dynamically change parameters.
- [ ] **Add tests and docs**:
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Chat models currently implement support for:
- images in OpenAI Chat Completions format
- other multimodal types (e.g., PDF and audio) in a cross-provider
[standard
format](https://python.langchain.com/docs/how_to/multimodal_inputs/)
Here we update core to extend support to PDF and audio input in Chat
Completions format. **If an OAI-format PDF or audio content block is
passed into any chat model, it will be transformed to the LangChain
standard format**. We assume that any chat model supporting OAI-format
PDF or audio has implemented support for the standard format.
`mixtral-8x-7b-instruct` was recently retired from Fireworks Serverless.
Here we remove the default model altogether, so that the model must be
explicitly specified on init:
```python
ChatFireworks(model="accounts/fireworks/models/llama-v3p1-70b-instruct") # for example
```
We also set a null default for `temperature`, which previously defaulted
to 0.0. This parameter will no longer be included in request payloads
unless it is explicitly provided.
**langchain_openai: Support of reasoning summary streaming**
**Description:**
OpenAI API now supports streaming reasoning summaries for reasoning
models (o1, o3, o3-mini, o4-mini). More info about it:
https://platform.openai.com/docs/guides/reasoning#reasoning-summaries
It is supported only in Responses API (not Completion API), so you need
to create LangChain Open AI model as follows to support reasoning
summaries streaming:
```
llm = ChatOpenAI(
model="o4-mini", # also o1, o3, o3-mini support reasoning streaming
use_responses_api=True, # reasoning streaming works only with responses api, not completion api
model_kwargs={
"reasoning": {
"effort": "high", # also "low" and "medium" supported
"summary": "auto" # some models support "concise" summary, some "detailed", but auto will always work
}
}
)
```
Now, if you stream events from llm:
```
async for event in llm.astream_events(prompt, version="v2"):
print(event)
```
or
```
for chunk in llm.stream(prompt):
print (chunk)
```
OpenAI API will send you new types of events:
`response.reasoning_summary_text.added`
`response.reasoning_summary_text.delta`
`response.reasoning_summary_text.done`
These events are new, so they were ignored. So I have added support of
these events in function `_convert_responses_chunk_to_generation_chunk`,
so reasoning chunks or full reasoning added to the chunk
additional_kwargs.
Example of how this reasoning summary may be printed:
```
async for event in llm.astream_events(prompt, version="v2"):
if event["event"] == "on_chat_model_stream":
chunk: AIMessageChunk = event["data"]["chunk"]
if "reasoning_summary_chunk" in chunk.additional_kwargs:
print(chunk.additional_kwargs["reasoning_summary_chunk"], end="")
elif "reasoning_summary" in chunk.additional_kwargs:
print("\n\nFull reasoning step summary:", chunk.additional_kwargs["reasoning_summary"])
elif chunk.content and chunk.content[0]["type"] == "text":
print(chunk.content[0]["text"], end="")
```
or
```
for chunk in llm.stream(prompt):
if "reasoning_summary_chunk" in chunk.additional_kwargs:
print(chunk.additional_kwargs["reasoning_summary_chunk"], end="")
elif "reasoning_summary" in chunk.additional_kwargs:
print("\n\nFull reasoning step summary:", chunk.additional_kwargs["reasoning_summary"])
elif chunk.content and chunk.content[0]["type"] == "text":
print(chunk.content[0]["text"], end="")
```
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
PR title:
docs: add Valyu integration documentation
Description:
This PR adds documentation and example notebooks for the Valyu
integration, including retriever and tool usage.
Issue:
N/A
Dependencies:
No new dependencies.
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
PR Summary
This change adds a fallback in ChatAnthropic.with_structured_output() to
handle Pydantic models that don’t include a docstring. Without it,
calling:
```py
from pydantic import BaseModel
from langchain_anthropic import ChatAnthropic
class SampleModel(BaseModel):
sample_field: str
llm = ChatAnthropic(
model="claude-3-7-sonnet-latest"
).with_structured_output(SampleModel.model_json_schema())
llm.invoke("test")
```
will raise a
```
KeyError: 'description'
```
because Pydantic omits the description field when no docstring is
present.
This issue doesn’t occur when using ChatOpenAI or if you add a docstring
to the model:
```py
from pydantic import BaseModel
from langchain_openai import ChatOpenAI
class SampleModel(BaseModel):
"""Schema for sample_field output."""
sample_field: str
llm = ChatOpenAI(
model="gpt-4o-mini"
).with_structured_output(SampleModel.model_json_schema())
llm.invoke("test")
```
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Addresses #30158
When using the output parser—either in a chain or standalone—hitting
max_tokens triggers a misleading “missing variable” error instead of
indicating the output was truncated. This subtle bug often surfaces with
Anthropic models.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
PR title:
Community: Add bind variable support for oracle adb docloader
Description:
This PR adds support of using bind variable to oracle adb doc loader
class, including minor document change.
Issue:
N/A
Dependencies:
No new dependencies.
This is a follow-on PR to go with the identical changes that were made
in parters/openai.
Previous PR: https://github.com/langchain-ai/langchain/pull/30757
When calling embed_documents and providing a chunk_size argument, that
argument is ignored when OpenAIEmbeddings is instantiated with its
default configuration (where check_embedding_ctx_length=True).
_get_len_safe_embeddings specifies a chunk_size parameter but it's not
being passed through in embed_documents, which is its only caller. This
appears to be an oversight, especially given that the
_get_len_safe_embeddings docstring states it should respect "the set
embedding context length and chunk size."
Developers typically expect method parameters to take effect (also, take
precedence) when explicitly provided, especially when instantiating
using defaults. I was confused as to why my API calls were being
rejected regardless of the chunk size I provided.
When calling `embed_documents` and providing a `chunk_size` argument,
that argument is ignored when `OpenAIEmbeddings` is instantiated with
its default configuration (where `check_embedding_ctx_length=True`).
`_get_len_safe_embeddings` specifies a `chunk_size` parameter but it's
not being passed through in `embed_documents`, which is its only caller.
This appears to be an oversight, especially given that the
`_get_len_safe_embeddings` docstring states it should respect "the set
embedding context length and chunk size."
Developers typically expect method parameters to take effect (also, take
precedence) when explicitly provided, especially when instantiating
using defaults. I was confused as to why my API calls were being
rejected regardless of the chunk size I provided.
This bug also exists in langchain_community package. I can add that to
this PR if requested otherwise I will create a new one once this passes.
**Description:**
partners-anthropic: ChatAnthropic supports b64 and urls in the
part[image_url][url] message variable
**Issue**:
ChatAnthropic right now only supports b64 encoded images in the
part[image_url][url] message variable. This PR enables ChatAnthropic to
also accept image urls in said variable and makes it compatible with
OpenAI messages to make model switching easier.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
SingleStore integration now has its package `langchain-singlestore', so
the community implementation will no longer be maintained.
Added `deprecated` decorator to `SingleStoreDBChatMessageHistory`,
`SingleStoreDBSemanticCache`, and `SingleStoreDB` classes in the
community package.
**Dependencies:** https://github.com/langchain-ai/langchain/pull/30841
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
Support "usage_metadata" for LiteLLM streaming calls.
This is a follow-up to
https://github.com/langchain-ai/langchain/pull/30625, which tackled
non-streaming calls.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
- [ ] **PR message**:
- **Description:** including metadata_field in
max_marginal_relevance_search() would result in error, changed the logic
to be similar to how it's handled in similarity_search, where it can be
any field or simply a "*" to include every field
Replacement for PR #30191 (@ccurme)
**Description**: currently, ChatOllama [will raise a value error if a
ChatMessage is passed to
it](https://github.com/langchain-ai/langchain/blob/master/libs/partners/ollama/langchain_ollama/chat_models.py#L514),
as described
https://github.com/langchain-ai/langchain/pull/30147#issuecomment-2708932481.
Furthermore, ollama-python is removing the limitations on valid roles
that can be passed through chat messages to a model in ollama -
https://github.com/ollama/ollama-python/pull/462#event-16917810634.
This PR removes the role limitations imposed by langchain and enables
passing langchain ChatMessages with arbitrary 'role' values through the
langchain ChatOllama class to the underlying ollama-python Client.
As this PR relies on [merged but unreleased functionality in
ollama-python](
https://github.com/ollama/ollama-python/pull/462#event-16917810634), I
have temporarily pointed the ollama package source to the main branch of
the ollama-python github repo.
Format, lint, and tests of new functionality passing. Need to resolve
issue with recently added ChatOllama tests. (Now resolved)
**Issue**: resolves#30122 (related to ollama issue
https://github.com/ollama/ollama/issues/8955)
**Dependencies**: no new dependencies
[x] PR title
[x] PR message
[x] Lint and test: format, lint, and test all running successfully and
passing
---------
Co-authored-by: Ryan Stewart <ryanstewart@Ryans-MacBook-Pro.local>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
* Remove unused ignores
* Add type ignore codes
* Add mypy rule `warn_unused_ignores`
* Add ruff rule PGH003
NB: some `type: ignore[unused-ignore]` are added because the ignores are
needed when `extended_testing_deps.txt` deps are installed.
We only need to rebuild model schemas if type annotation information
isn't available during declaration - that shouldn't be the case for
these types corrected here.
Need to do more thorough testing to make sure these structures have
complete schemas, but hopefully this boosts startup / import time.
- [ ] **PR title**: "docs: adding Smabbler's Galaxia integration"
- [ ] **PR message**: **Twitter handle:** @Galaxia_graph
I'm adding docs here + added the package to the packages.yml. I didn't
add a unit test, because this integration is just a thin wrapper on top
of our API. There isn't much left to test if you mock it away.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
**Description:** This PR adds provider inference logic to
`init_chat_model` for Perplexity models that use the "sonar..." prefix
(`sonar`, `sonar-pro`, `sonar-reasoning`, `sonar-reasoning-pro` or
`sonar-deep-research`).
This allows users to initialize these models by simply passing the model
name, without needing to explicitly set `model_provider="perplexity"`.
The docstring for `init_chat_model` has also been updated to reflect
this new inference rule.
https://github.com/langchain-ai/langchain/pull/30778 (not released)
broke all invocation modes of ChatOllama (intent was to remove
`"message"` from `generation_info`, but we turned `generation_info` into
`stream_resp["message"]`), resulting in validation errors.
TL;DR: you can't optimize imports with a lazy `__getattr__` if there is
a namespace conflict with a module name and an attribute name. We should
avoid introducing conflicts like this in the future.
This PR fixes a bug introduced by my lazy imports PR:
https://github.com/langchain-ai/langchain/pull/30769.
In `langchain_core`, we have utilities for loading and dumping data.
Unfortunately, one of those utilities is a `load` function, located in
`langchain_core/load/load.py`. To make this function more visible, we
make it accessible at the top level `langchain_core.load` module via
importing the function in `langchain_core/load/__init__.py`.
So, either of these imports should work:
```py
from langchain_core.load import load
from langchain_core.load.load import load
```
As you can tell, this is already a bit confusing. You'd think that the
first import would produce the module `load`, but because of the
`__init__.py` shortcut, both produce the function `load`.
<details> More on why the lazy imports PR broke this support...
All was well, except when the absolute import was run first, see the
last snippet:
```
>>> from langchain_core.load import load
>>> load
<function load at 0x101c320c0>
```
```
>>> from langchain_core.load.load import load
>>> load
<function load at 0x1069360c0>
```
```
>>> from langchain_core.load import load
>>> load
<function load at 0x10692e0c0>
>>> from langchain_core.load.load import load
>>> load
<function load at 0x10692e0c0>
```
```
>>> from langchain_core.load.load import load
>>> load
<function load at 0x101e2e0c0>
>>> from langchain_core.load import load
>>> load
<module 'langchain_core.load.load' from '/Users/sydney_runkle/oss/langchain/libs/core/langchain_core/load/load.py'>
```
In this case, the function `load` wasn't stored in the globals cache for
the `langchain_core.load` module (by the lazy import logic), so Python
defers to a module import.
</details>
New `langchain` tongue twister 😜: we've created a problem for ourselves
because you have to load the load function from the load file in the
load module 😨.
Fix CI to trigger benchmarks on `run-codspeed-benchmarks` label addition
Reduce scope of async benchmark to save time on CI
Waiting to merge this PR until we figure out how to use walltime on
local runners.
Most easily reviewed with the "hide whitespace" option toggled.
Seeing 10-50% speed ups in import time for common structures 🚀
The general purpose of this PR is to lazily import structures within
`langchain_core.XXX_module.__init__.py` so that we're not eagerly
importing expensive dependencies (`pydantic`, `requests`, etc).
Analysis of flamegraphs generated with `importtime` motivated these
changes. For example, the one below demonstrates that importing
`HumanMessage` accidentally triggered imports for `importlib.metadata`,
`requests`, etc.
There's still much more to do on this front, and we can start digging
into our own internal code for optimizations now that we're less
concerned about external imports.
<img width="1210" alt="Screenshot 2025-04-11 at 1 10 54 PM"
src="https://github.com/user-attachments/assets/112a3fe7-24a9-4294-92c1-d5ae64df839e"
/>
I've tracked the improvements with some local benchmarks:
## `pytest-benchmark` results
| Name | Before (s) | After (s) | Delta (s) | % Change |
|-----------------------------|------------|-----------|-----------|----------|
| Document | 2.8683 | 1.2775 | -1.5908 | -55.46% |
| HumanMessage | 2.2358 | 1.1673 | -1.0685 | -47.79% |
| ChatPromptTemplate | 5.5235 | 2.9709 | -2.5526 | -46.22% |
| Runnable | 2.9423 | 1.7793 | -1.163 | -39.53% |
| InMemoryVectorStore | 3.1180 | 1.8417 | -1.2763 | -40.93% |
| RunnableLambda | 2.7385 | 1.8745 | -0.864 | -31.55% |
| tool | 5.1231 | 4.0771 | -1.046 | -20.42% |
| CallbackManager | 4.2263 | 3.4099 | -0.8164 | -19.32% |
| LangChainTracer | 3.8394 | 3.3101 | -0.5293 | -13.79% |
| BaseChatModel | 4.3317 | 3.8806 | -0.4511 | -10.41% |
| PydanticOutputParser | 3.2036 | 3.2995 | 0.0959 | 2.99% |
| InMemoryRateLimiter | 0.5311 | 0.5995 | 0.0684 | 12.88% |
Note the lack of change for `InMemoryRateLimiter` and
`PydanticOutputParser` is just random noise, I'm getting comparable
numbers locally.
## Local CodSpeed results
We're still working on configuring CodSpeed on CI. The local usage
produced similar results.
This PR fixes an issue where ChatPerplexity would raise an
AttributeError when the citations attribute was missing from the model
response (e.g., when using offline models like r1-1776).
The fix checks for the presence of citations, images, and
related_questions before attempting to access them, avoiding crashes in
models that don't provide these fields.
Tested locally with models that omit citations, and the fix works as
expected.
Looks like `pyupgrade` was already used here but missed some docs and
tests.
This helps to keep our docs looking professional and up to date.
Eventually, we should lint / format our inline docs.
**Description:** add support for oauth2 in Jira tool by adding the
possibility to pass a dictionary with oauth parameters. I also adapted
the documentation to show this new behavior
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
This PR aims to reduce import time of `langchain-core` tools by removing
the `importlib.metadata` import previously used in `__init__.py`. This
is the first in a sequence of PRs to reduce import time delays for
`langchain-core` features and structures 🚀.
Because we're now hard coding the version, we need to make sure
`version.py` and `pyproject.toml` stay in sync, so I've added a new CI
job that runs whenever either of those files are modified. [This
run](https://github.com/langchain-ai/langchain/actions/runs/14358012706/job/40251952044?pr=30744)
demonstrates the failure that occurs whenever the version gets out of
sync (thus blocking a PR).
Before, note the ~15% of time spent on the `importlib.metadata` /related
imports
<img width="1081" alt="Screenshot 2025-04-09 at 9 06 15 AM"
src="https://github.com/user-attachments/assets/59f405ec-ee8d-4473-89ff-45dea5befa31"
/>
After (note, lack of `importlib.metadata` time sink):
<img width="1245" alt="Screenshot 2025-04-09 at 9 01 23 AM"
src="https://github.com/user-attachments/assets/9c32e77c-27ce-485e-9b88-e365193ed58d"
/>
Description:
This PR adds documentation for the langchain-cloudflare integration
package.
Issue:
N/A
Dependencies:
No new dependencies are required.
Tests and Docs:
Added an example notebook demonstrating the usage of the
langchain-cloudflare package, located in docs/docs/integrations.
Added a new package to libs/packages.yml.
Lint and Format:
Successfully ran make format and make lint.
---------
Co-authored-by: Collier King <collier@cloudflare.com>
Co-authored-by: Collier King <collierking99@gmail.com>
Hi there, This is a complementary PR to #30733.
This PR introduces support for Hugging Face's serverless Inference
Providers (documentation
[here](https://huggingface.co/docs/inference-providers/index)), allowing
users to specify different providers
This PR also removes the usage of `InferenceClient.post()` method in
`HuggingFaceEndpointEmbeddings`, in favor of the task-specific
`feature_extraction` method. `InferenceClient.post()` is deprecated and
will be removed in `huggingface_hub` v0.31.0.
## Changes made
- bumped the minimum required version of the `huggingface_hub` package
to ensure compatibility with the latest API usage.
- added a provider field to `HuggingFaceEndpointEmbeddings`, enabling
users to select the inference provider.
- replaced the deprecated `InferenceClient.post()` call in
`HuggingFaceEndpointEmbeddings` with the task-specific
`feature_extraction` method for future-proofing, `post()` will be
removed in `huggingface-hub` v0.31.0.
✅ All changes are backward compatible.
---------
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
The first in a sequence of PRs focusing on improving performance in
core. We're starting with reducing import times for common structures,
hence the benchmarks here.
The benchmark looks a little bit complicated - we have to use a process
so that we don't suffer from Python's import caching system. I tried
doing manual modification of `sys.modules` between runs, but that's
pretty tricky / hacky to get right, hence the subprocess approach.
Motivated by extremely slow baseline for common imports (we're talking
2-5 seconds):
<img width="633" alt="Screenshot 2025-04-09 at 12 48 12 PM"
src="https://github.com/user-attachments/assets/994616fe-1798-404d-bcbe-48ad0eb8a9a0"
/>
Also added a `make benchmark` command to make local runs easy :).
Currently using walltimes so that we can track total time despite using
a manual proces.
Google vertex ai search will now return the title of the found website
as part of the document metadata, if available.
Thank you for contributing to LangChain!
- **Description**: Vertex AI Search can be used to index websites and
then develop chatbots that use these websites to answer questions. At
present, the document metadata includes an `id` and `source` (which is
the URL). While the URL is enough to create a link, the ID is not
descriptive enough to show users. Therefore, I propose we return `title`
as well, when available (e.g., it will not be available in `.txt`
documents found during the website indexing).
- **Issue**: No bug in particular, but it would be better if this was
here.
- **Dependencies**: None
- I do not use twitter.
Format, Lint and Test seem to be all good.
Generally, this PR is CI performance focused + aims to clean up some
dependencies at the same time.
1. Unpins upper bounds for `numpy` in all `pyproject.toml` files where
`numpy` is specified
2. Requires `numpy >= 2.1.0` for Python 3.13 and `numpy > v1.26.0` for
Python 3.12, plus a `numpy` min version bump for `chroma`
3. Speeds up CI by minutes - linting on Python 3.13, installing `numpy <
2.1.0` was taking [~3
minutes](https://github.com/langchain-ai/langchain/actions/runs/14316342925/job/40123305868?pr=30713),
now the entire env setup takes a few seconds
4. Deleted the `numpy` test dependency from partners where that was not
used, specifically `huggingface`, `voyageai`, `xai`, and `nomic`.
It's a bit unfortunate that `langchain-community` depends on `numpy`, we
might want to try to fix that in the future...
Closes https://github.com/langchain-ai/langchain/issues/26026
Fixes https://github.com/langchain-ai/langchain/issues/30555
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
Tool-calling tests started intermittently failing with
> groq.APIError: Failed to call a function. Please adjust your prompt.
See 'failed_generation' for more details.
**Description:** The error message was supposed to display the missing
vector name, but instead, it includes only the existing collection
configs.
This simple PR just includes the correct variable name, so that the user
knows the requested vector does not exist in the collection.
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Signed-off-by: Tin Lai <tin@tinyiu.com>
Add ruff rules PGH: https://docs.astral.sh/ruff/rules/#pygrep-hooks-pgh
Except PGH003 which will be dealt in a dedicated PR.
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
**Description:**
Fixed a bug in `BaseCallbackManager.remove_handler()` that caused a
`ValueError` when removing a handler added via the constructor's
`handlers` parameter. The issue occurred because handlers passed to the
constructor were added only to the `handlers` list and not automatically
to `inheritable_handlers` unless explicitly specified. However,
`remove_handler()` attempted to remove the handler from both lists
unconditionally, triggering a `ValueError` when it wasn't in
`inheritable_handlers`.
The fix ensures the method checks for the handler’s presence in each
list before attempting removal, making it more robust while preserving
its original behavior.
**Issue:** Fixes#30640
**Dependencies:** None
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
- **Description:** We do not need to set parser in `scrape` since it is
already been done in `_scrape`
- **Issue:** #30629, not directly related but makes sure xml parser is
used
This pull request includes various changes to the `langchain_core`
library, focusing on improving compatibility with different versions of
Pydantic. The primary change involves replacing checks for Pydantic
major versions with boolean flags, which simplifies the code and
improves readability.
This also solves ruff rule checks for
[RUF048](https://docs.astral.sh/ruff/rules/map-int-version-parsing/) and
[PLR2004](https://docs.astral.sh/ruff/rules/magic-value-comparison/).
Key changes include:
### Compatibility Improvements:
*
[`libs/core/langchain_core/output_parsers/json.py`](diffhunk://#diff-5add0cf7134636ae4198a1e0df49ee332ae0c9123c3a2395101e02687c717646L22-R24):
Replaced `PYDANTIC_MAJOR_VERSION` with `IS_PYDANTIC_V1` to check for
Pydantic version 1.
*
[`libs/core/langchain_core/output_parsers/pydantic.py`](diffhunk://#diff-2364b5b4aee01c462aa5dbda5dc3a877dcd20f29df173ad540dc8adf8b192361L14-R14):
Updated version checks from `PYDANTIC_MAJOR_VERSION` to `IS_PYDANTIC_V2`
in the `PydanticOutputParser` class.
[[1]](diffhunk://#diff-2364b5b4aee01c462aa5dbda5dc3a877dcd20f29df173ad540dc8adf8b192361L14-R14)
[[2]](diffhunk://#diff-2364b5b4aee01c462aa5dbda5dc3a877dcd20f29df173ad540dc8adf8b192361L27-R27)
### Utility Enhancements:
*
[`libs/core/langchain_core/utils/pydantic.py`](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896R23):
Introduced `IS_PYDANTIC_V1` and `IS_PYDANTIC_V2` flags and deprecated
the `get_pydantic_major_version` function. Updated various functions to
use these flags instead of version numbers.
[[1]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896R23)
[[2]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896R42-R78)
[[3]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L90-R89)
[[4]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L104-R101)
[[5]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L120-R122)
[[6]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L135-R132)
[[7]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L149-R151)
[[8]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L164-R161)
[[9]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L248-R250)
[[10]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L330-R335)
[[11]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L356-R357)
[[12]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L393-R390)
[[13]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L403-R400)
### Test Updates:
*
[`libs/core/tests/unit_tests/output_parsers/test_openai_tools.py`](diffhunk://#diff-694cc0318edbd6bbca34f53304934062ad59ba9f5a788252ce6c5f5452489d67L19-R22):
Updated tests to use `IS_PYDANTIC_V1` and `IS_PYDANTIC_V2` for version
checks.
[[1]](diffhunk://#diff-694cc0318edbd6bbca34f53304934062ad59ba9f5a788252ce6c5f5452489d67L19-R22)
[[2]](diffhunk://#diff-694cc0318edbd6bbca34f53304934062ad59ba9f5a788252ce6c5f5452489d67L532-R535)
[[3]](diffhunk://#diff-694cc0318edbd6bbca34f53304934062ad59ba9f5a788252ce6c5f5452489d67L567-R570)
[[4]](diffhunk://#diff-694cc0318edbd6bbca34f53304934062ad59ba9f5a788252ce6c5f5452489d67L602-R605)
*
[`libs/core/tests/unit_tests/prompts/test_chat.py`](diffhunk://#diff-3e60e744842086a4f3c4b21bc83e819c3435720eab210078e77e2430fb8c7e84R7):
Replaced version tuple checks with `PYDANTIC_VERSION` comparisons.
[[1]](diffhunk://#diff-3e60e744842086a4f3c4b21bc83e819c3435720eab210078e77e2430fb8c7e84R7)
[[2]](diffhunk://#diff-3e60e744842086a4f3c4b21bc83e819c3435720eab210078e77e2430fb8c7e84L35-R38)
[[3]](diffhunk://#diff-3e60e744842086a4f3c4b21bc83e819c3435720eab210078e77e2430fb8c7e84L924-R927)
[[4]](diffhunk://#diff-3e60e744842086a4f3c4b21bc83e819c3435720eab210078e77e2430fb8c7e84L935-R938)
*
[`libs/core/tests/unit_tests/runnables/test_graph.py`](diffhunk://#diff-99a290330ef40103d0ce02e52e21310d6fadea142bfdea13c94d23fc81c0bb5dR3):
Simplified version checks using `PYDANTIC_VERSION`.
[[1]](diffhunk://#diff-99a290330ef40103d0ce02e52e21310d6fadea142bfdea13c94d23fc81c0bb5dR3)
[[2]](diffhunk://#diff-99a290330ef40103d0ce02e52e21310d6fadea142bfdea13c94d23fc81c0bb5dL15-R18)
[[3]](diffhunk://#diff-99a290330ef40103d0ce02e52e21310d6fadea142bfdea13c94d23fc81c0bb5dL234-L239)
*
[`libs/core/tests/unit_tests/runnables/test_runnable.py`](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L18-R20):
Introduced `PYDANTIC_VERSION_AT_LEAST_29` and
`PYDANTIC_VERSION_AT_LEAST_210` for more readable version checks.
[[1]](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L18-R20)
[[2]](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L92-R99)
[[3]](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L230-R233)
[[4]](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L652-R655)
Add ruff rules:
* FIX: https://docs.astral.sh/ruff/rules/#flake8-fixme-fix
* TD: https://docs.astral.sh/ruff/rules/#flake8-todos-td
Code cleanup:
*
[`libs/core/langchain_core/outputs/chat_generation.py`](diffhunk://#diff-a1017ee46f58fa4005b110ffd4f8e1fb08f6a2a11d6ca4c78ff8be641cbb89e5L56-R56):
Removed the "HACK" prefix from a comment in the `set_text` method.
Configuration adjustments:
*
[`libs/core/pyproject.toml`](diffhunk://#diff-06baaee12b22a370fef9f170c9ed13e2727e377d3b32f5018430f4f0a39d3537R85-R93):
Added new rules `FIX002`, `TD002`, and `TD003` to the ignore list.
*
[`libs/core/pyproject.toml`](diffhunk://#diff-06baaee12b22a370fef9f170c9ed13e2727e377d3b32f5018430f4f0a39d3537L102-L108):
Removed the `FIX` and `TD` rules from the ignore list.
Test refinement:
*
[`libs/core/tests/unit_tests/runnables/test_runnable.py`](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L3231-R3232):
Updated a TODO comment to improve clarity in the `test_map_stream`
function.
- [ ] **PR title**: "community: Removes pandas dependency for using
DuckDB for similarity search"
- [ ] **PR message**:
- **Description:** Removes pandas dependency for using DuckDB for
similarity search. The old function still exists as
`similarity_search_pd`, while the new one is at `similarity_search` and
requires no code changes. Return format remains the same.
- **Issue:** Issue #29933 and update on PR #30435
- **Dependencies:** No dependencies
LangChain QwQ allows non-Tongyi users to access thinking models with
extra capabilities which serve as an extension to Alibaba Cloud.
Hi @ccurme I'm back with the updated PR this time with documentation and
a finished package.
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
- **Description:** adds documentation of `langchain-qwq` integration
package. Also adds it to Alibaba Cloud provider
- **Issue:** #30580#30317#30579
- **Dependencies:** openai, json-repair
- **Twitter handle:** YigitBekir
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
**Description:**
Adds support for Riza custom runtimes to the two Riza code interpreter
tools, allowing users to run LLM-generated code that depends on
libraries outside stdlib.
**Issue:** N/A
**Dependencies:** None
**Twitter handle:** @rizaio
## Description:
This PR adds the necessary documentation for the `langchain-runpod`
partner package integration. It includes:
* A provider page (`docs/docs/integrations/providers/runpod.ipynb`)
explaining the overall setup.
* An LLM component page (`docs/docs/integrations/llms/runpod.ipynb`)
detailing the `RunPod` class usage.
* A Chat Model component page
(`docs/docs/integrations/chat/runpod.ipynb`) detailing the `ChatRunPod`
class usage, including a feature support table.
These documentation files reflect the latest features of the
`langchain-runpod` package (v0.2.0+) such as async support and API
polling logic.
This work also addresses the review feedback provided on the previous
attempt in PR #30246 by:
* Removing all TODOs from documentation.
* Adding the required links between provider and component pages.
* Completing the feature support table in the chat documentation.
* Linking to the source code on GitHub for API reference.
Finally, it registers the `langchain-runpod` package in
`libs/packages.yml`.
## Dependencies:
None added to the core LangChain repository by these documentation
changes. The required dependency (`langchain-runpod`) is managed as a
separate package.
## Twitter handle:
@runpod_io
---------
Co-authored-by: Max Forsey <maxpod@maxpod.local>
Co-authored-by: Chester Curme <chester.curme@gmail.com>