Commit Graph

7272 Commits

Author SHA1 Message Date
Oleg Ovcharuk
b6fe7e8c10
docs: YDB Vector Store docs (#30636)
This PR adds docs about how to use YDB as a vector store

[YDB](https://ydb.tech/) is a versatile open-source distributed SQL
database. It supports [vector
search](https://ydb.tech/docs/en/yql/reference/udf/list/knn) which means
it can be used as a vector store with langchain.

YDB vectore store comes with
[langchain-ydb](https://pypi.org/project/langchain-ydb/) pypi package.

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-04-10 21:33:56 -04:00
湛露先生
7a4ae6fbff
community[patch]: simplify cache logic (#30760)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.

Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-04-10 19:20:57 -04:00
ccurme
8e053ac9d2
core[patch]: support customization of backoff parameters in with_retries (#30773)
Co-authored-by: Sydney Runkle <54324534+sydney-runkle@users.noreply.github.com>
2025-04-10 19:18:36 -04:00
William FH
70532a65f8
Async callback benchmark (#30777) 2025-04-10 15:47:19 -07:00
Sydney Runkle
8f8fea2d7e
[performance]: Use hard coded langchain-core version to avoid importlib import (#30744)
This PR aims to reduce import time of `langchain-core` tools by removing
the `importlib.metadata` import previously used in `__init__.py`. This
is the first in a sequence of PRs to reduce import time delays for
`langchain-core` features and structures 🚀.

Because we're now hard coding the version, we need to make sure
`version.py` and `pyproject.toml` stay in sync, so I've added a new CI
job that runs whenever either of those files are modified. [This
run](https://github.com/langchain-ai/langchain/actions/runs/14358012706/job/40251952044?pr=30744)
demonstrates the failure that occurs whenever the version gets out of
sync (thus blocking a PR).

Before, note the ~15% of time spent on the `importlib.metadata` /related
imports

<img width="1081" alt="Screenshot 2025-04-09 at 9 06 15 AM"
src="https://github.com/user-attachments/assets/59f405ec-ee8d-4473-89ff-45dea5befa31"
/>

After (note, lack of `importlib.metadata` time sink):

<img width="1245" alt="Screenshot 2025-04-09 at 9 01 23 AM"
src="https://github.com/user-attachments/assets/9c32e77c-27ce-485e-9b88-e365193ed58d"
/>
2025-04-10 14:15:02 -04:00
Sydney Runkle
cd6a83117c
Adding more import time benchmarks for langchain-core (#30770)
Plus minor typo fix in `ChatPromptTemplate` case id.
2025-04-10 11:50:12 -04:00
amohan
44b83460b2
docs: Add Cloudflare integrations (#30749)
Description:
This PR adds documentation for the langchain-cloudflare integration
package.

Issue:
N/A

Dependencies:
No new dependencies are required.

Tests and Docs:

Added an example notebook demonstrating the usage of the
langchain-cloudflare package, located in docs/docs/integrations.
Added a new package to libs/packages.yml.

Lint and Format:

Successfully ran make format and make lint.

---------

Co-authored-by: Collier King <collier@cloudflare.com>
Co-authored-by: Collier King <collierking99@gmail.com>
2025-04-10 09:27:23 -04:00
ccurme
63c16f5ca8
community: deprecate AzureCosmosDBNoSqlVectorSearch in favor of langchain-azure-ai implementation (#30756) 2025-04-09 21:04:16 +00:00
Christophe Bornet
4cc7bc6c93
core: Add ruff rules PLR (#30696)
Add ruff rules [PLR](https://docs.astral.sh/ruff/rules/#refactor-plr)
Except PLR09xxx and PLR2004.

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-04-09 15:15:38 -04:00
célina
68361f9c2d
partners: (langchain-huggingface) Embeddings - Integrate Inference Providers and remove deprecated code (#30735)
Hi there, This is a complementary PR to #30733.
This PR introduces support for Hugging Face's serverless Inference
Providers (documentation
[here](https://huggingface.co/docs/inference-providers/index)), allowing
users to specify different providers

This PR also removes the usage of `InferenceClient.post()` method in
`HuggingFaceEndpointEmbeddings`, in favor of the task-specific
`feature_extraction` method. `InferenceClient.post()` is deprecated and
will be removed in `huggingface_hub` v0.31.0.

## Changes made

- bumped the minimum required version of the `huggingface_hub` package
to ensure compatibility with the latest API usage.
- added a provider field to `HuggingFaceEndpointEmbeddings`, enabling
users to select the inference provider.
- replaced the deprecated `InferenceClient.post()` call in
`HuggingFaceEndpointEmbeddings` with the task-specific
`feature_extraction` method for future-proofing, `post()` will be
removed in `huggingface-hub` v0.31.0.

 All changes are backward compatible.

---------

Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2025-04-09 19:05:43 +00:00
Christophe Bornet
98f0016fc2
core: Add ruff rules ARG (#30732)
See https://docs.astral.sh/ruff/rules/#flake8-unused-arguments-arg
2025-04-09 14:39:36 -04:00
Sydney Runkle
78ec7d886d
[performance]: Adding benchmarks for common langchain-core imports (#30747)
The first in a sequence of PRs focusing on improving performance in
core. We're starting with reducing import times for common structures,
hence the benchmarks here.

The benchmark looks a little bit complicated - we have to use a process
so that we don't suffer from Python's import caching system. I tried
doing manual modification of `sys.modules` between runs, but that's
pretty tricky / hacky to get right, hence the subprocess approach.

Motivated by extremely slow baseline for common imports (we're talking
2-5 seconds):

<img width="633" alt="Screenshot 2025-04-09 at 12 48 12 PM"
src="https://github.com/user-attachments/assets/994616fe-1798-404d-bcbe-48ad0eb8a9a0"
/>

Also added a `make benchmark` command to make local runs easy :).
Currently using walltimes so that we can track total time despite using
a manual proces.
2025-04-09 13:00:15 -04:00
German Molina
5fb261ce27
community: Google Vertex AI Search now returns the website title as part of the document metadata (#30688)
Google vertex ai search will now return the title of the found website
as part of the document metadata, if available.

Thank you for contributing to LangChain!

- **Description**: Vertex AI Search can be used to index websites and
then develop chatbots that use these websites to answer questions. At
present, the document metadata includes an `id` and `source` (which is
the URL). While the URL is enough to create a link, the ID is not
descriptive enough to show users. Therefore, I propose we return `title`
as well, when available (e.g., it will not be available in `.txt`
documents found during the website indexing).
- **Issue**: No bug in particular, but it would be better if this was
here.
- **Dependencies**: None
- I do not use twitter.

Format, Lint and Test seem to be all good.
2025-04-09 08:54:06 -04:00
Sydney Runkle
4556b81b1d
Clean up numpy dependencies and speed up 3.13 CI with numpy>=2.1.0 (#30714)
Generally, this PR is CI performance focused + aims to clean up some
dependencies at the same time.

1. Unpins upper bounds for `numpy` in all `pyproject.toml` files where
`numpy` is specified
2. Requires `numpy >= 2.1.0` for Python 3.13 and `numpy > v1.26.0` for
Python 3.12, plus a `numpy` min version bump for `chroma`
3. Speeds up CI by minutes - linting on Python 3.13, installing `numpy <
2.1.0` was taking [~3
minutes](https://github.com/langchain-ai/langchain/actions/runs/14316342925/job/40123305868?pr=30713),
now the entire env setup takes a few seconds
4. Deleted the `numpy` test dependency from partners where that was not
used, specifically `huggingface`, `voyageai`, `xai`, and `nomic`.

It's a bit unfortunate that `langchain-community` depends on `numpy`, we
might want to try to fix that in the future...

Closes https://github.com/langchain-ai/langchain/issues/26026
Fixes https://github.com/langchain-ai/langchain/issues/30555
2025-04-08 09:45:07 -04:00
湛露先生
9cbe91896e
Fix deepseek release tag, as it is update name. (#30717)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.

Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-04-08 08:43:16 -04:00
Nithish Raghunandanan
893942651b
docs: Update couchbase vector store docs (#30710)
-  **Update LangChain-Couchbase documentation**
- Rename `CouchbaseVectorStore` in favor of `CouchbaseSearchVectorStore`

- [x] **Lint and test**
2025-04-07 18:45:14 -04:00
ccurme
a2bec5f2e5
ollama: release 0.3.1 (#30716) 2025-04-07 20:31:25 +00:00
ccurme
e3f15f0a47
ollama[patch]: add model_name to response metadata (#30706)
Fixes [this standard
test](https://python.langchain.com/api_reference/standard_tests/integration_tests/langchain_tests.integration_tests.chat_models.ChatModelIntegrationTests.html#langchain_tests.integration_tests.chat_models.ChatModelIntegrationTests.test_usage_metadata).
2025-04-07 16:27:58 -04:00
ccurme
e106e9602f
groq[patch]: add retries to integration tests (#30707)
Tool-calling tests started intermittently failing with
> groq.APIError: Failed to call a function. Please adjust your prompt.
See 'failed_generation' for more details.
2025-04-07 12:45:53 -04:00
Mohammad Mohtashim
e935da0b12
ChatTongyi reasoning_content fix (#30694)
- **Description:** Small fix for `reasoning_content` key
- **Issue:** #30689
2025-04-07 09:27:33 -04:00
Tin Lai
4d03ba4686
langchain_qdrant: fix showing the missing sparse vector name (#30701)
**Description:** The error message was supposed to display the missing
vector name, but instead, it includes only the existing collection
configs.

This simple PR just includes the correct variable name, so that the user
knows the requested vector does not exist in the collection.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.

Signed-off-by: Tin Lai <tin@tinyiu.com>
2025-04-07 09:19:08 -04:00
Christophe Bornet
6650b94627
core: Add ruff rules PYI (#29335)
See https://docs.astral.sh/ruff/rules/#flake8-pyi-pyi

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-04-04 19:59:44 +00:00
Philippe PRADOS
d8e3b7667f
community[patch]: Fix empty producer in PDF Parsers (#30620)
Fix an issue where if a pdf file doesn't have a “producer” in metadata, it generates an exception.
2025-04-04 15:53:49 -04:00
Christophe Bornet
f0159c7125
core: Add ruff rules PGH (except PGH003) (#30656)
Add ruff rules PGH: https://docs.astral.sh/ruff/rules/#pygrep-hooks-pgh
Except PGH003 which will be dealt in a dedicated PR.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2025-04-04 19:53:27 +00:00
Armaanjeet Singh Sandhu
7c2468f36b
core: Fix handler removal in BaseCallbackManager (Fixes #30640) (#30659)
**Description:**  
Fixed a bug in `BaseCallbackManager.remove_handler()` that caused a
`ValueError` when removing a handler added via the constructor's
`handlers` parameter. The issue occurred because handlers passed to the
constructor were added only to the `handlers` list and not automatically
to `inheritable_handlers` unless explicitly specified. However,
`remove_handler()` attempted to remove the handler from both lists
unconditionally, triggering a `ValueError` when it wasn't in
`inheritable_handlers`.

The fix ensures the method checks for the handler’s presence in each
list before attempting removal, making it more robust while preserving
its original behavior.

**Issue:** Fixes #30640

**Dependencies:** None

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-04-04 15:45:15 -04:00
Mohammad Mohtashim
bff56c5fa6
community[patch]: Redundant Parser checker for Webbaseloader (#30632)
- **Description:** We do not need to set parser in `scrape` since it is
already been done in `_scrape`
- **Issue:** #30629, not directly related but makes sure xml parser is
used
2025-04-04 14:11:26 -04:00
Christophe Bornet
150ac0cb79
core: Add ruff rules DTZ (#30657)
Add ruff rules DTZ:
https://docs.astral.sh/ruff/rules/#flake8-datetimez-dtz
2025-04-04 13:43:47 -04:00
Christophe Bornet
5e418c2666
core: Rework pydantic version checks (#30653)
This pull request includes various changes to the `langchain_core`
library, focusing on improving compatibility with different versions of
Pydantic. The primary change involves replacing checks for Pydantic
major versions with boolean flags, which simplifies the code and
improves readability.
This also solves ruff rule checks for
[RUF048](https://docs.astral.sh/ruff/rules/map-int-version-parsing/) and
[PLR2004](https://docs.astral.sh/ruff/rules/magic-value-comparison/).

Key changes include:

### Compatibility Improvements:
*
[`libs/core/langchain_core/output_parsers/json.py`](diffhunk://#diff-5add0cf7134636ae4198a1e0df49ee332ae0c9123c3a2395101e02687c717646L22-R24):
Replaced `PYDANTIC_MAJOR_VERSION` with `IS_PYDANTIC_V1` to check for
Pydantic version 1.
*
[`libs/core/langchain_core/output_parsers/pydantic.py`](diffhunk://#diff-2364b5b4aee01c462aa5dbda5dc3a877dcd20f29df173ad540dc8adf8b192361L14-R14):
Updated version checks from `PYDANTIC_MAJOR_VERSION` to `IS_PYDANTIC_V2`
in the `PydanticOutputParser` class.
[[1]](diffhunk://#diff-2364b5b4aee01c462aa5dbda5dc3a877dcd20f29df173ad540dc8adf8b192361L14-R14)
[[2]](diffhunk://#diff-2364b5b4aee01c462aa5dbda5dc3a877dcd20f29df173ad540dc8adf8b192361L27-R27)

### Utility Enhancements:
*
[`libs/core/langchain_core/utils/pydantic.py`](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896R23):
Introduced `IS_PYDANTIC_V1` and `IS_PYDANTIC_V2` flags and deprecated
the `get_pydantic_major_version` function. Updated various functions to
use these flags instead of version numbers.
[[1]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896R23)
[[2]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896R42-R78)
[[3]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L90-R89)
[[4]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L104-R101)
[[5]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L120-R122)
[[6]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L135-R132)
[[7]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L149-R151)
[[8]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L164-R161)
[[9]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L248-R250)
[[10]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L330-R335)
[[11]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L356-R357)
[[12]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L393-R390)
[[13]](diffhunk://#diff-ff28020c5f1073a8b63bcd9d8b756a187fd682cb81935295120c63b207071896L403-R400)

### Test Updates:
*
[`libs/core/tests/unit_tests/output_parsers/test_openai_tools.py`](diffhunk://#diff-694cc0318edbd6bbca34f53304934062ad59ba9f5a788252ce6c5f5452489d67L19-R22):
Updated tests to use `IS_PYDANTIC_V1` and `IS_PYDANTIC_V2` for version
checks.
[[1]](diffhunk://#diff-694cc0318edbd6bbca34f53304934062ad59ba9f5a788252ce6c5f5452489d67L19-R22)
[[2]](diffhunk://#diff-694cc0318edbd6bbca34f53304934062ad59ba9f5a788252ce6c5f5452489d67L532-R535)
[[3]](diffhunk://#diff-694cc0318edbd6bbca34f53304934062ad59ba9f5a788252ce6c5f5452489d67L567-R570)
[[4]](diffhunk://#diff-694cc0318edbd6bbca34f53304934062ad59ba9f5a788252ce6c5f5452489d67L602-R605)
*
[`libs/core/tests/unit_tests/prompts/test_chat.py`](diffhunk://#diff-3e60e744842086a4f3c4b21bc83e819c3435720eab210078e77e2430fb8c7e84R7):
Replaced version tuple checks with `PYDANTIC_VERSION` comparisons.
[[1]](diffhunk://#diff-3e60e744842086a4f3c4b21bc83e819c3435720eab210078e77e2430fb8c7e84R7)
[[2]](diffhunk://#diff-3e60e744842086a4f3c4b21bc83e819c3435720eab210078e77e2430fb8c7e84L35-R38)
[[3]](diffhunk://#diff-3e60e744842086a4f3c4b21bc83e819c3435720eab210078e77e2430fb8c7e84L924-R927)
[[4]](diffhunk://#diff-3e60e744842086a4f3c4b21bc83e819c3435720eab210078e77e2430fb8c7e84L935-R938)
*
[`libs/core/tests/unit_tests/runnables/test_graph.py`](diffhunk://#diff-99a290330ef40103d0ce02e52e21310d6fadea142bfdea13c94d23fc81c0bb5dR3):
Simplified version checks using `PYDANTIC_VERSION`.
[[1]](diffhunk://#diff-99a290330ef40103d0ce02e52e21310d6fadea142bfdea13c94d23fc81c0bb5dR3)
[[2]](diffhunk://#diff-99a290330ef40103d0ce02e52e21310d6fadea142bfdea13c94d23fc81c0bb5dL15-R18)
[[3]](diffhunk://#diff-99a290330ef40103d0ce02e52e21310d6fadea142bfdea13c94d23fc81c0bb5dL234-L239)
*
[`libs/core/tests/unit_tests/runnables/test_runnable.py`](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L18-R20):
Introduced `PYDANTIC_VERSION_AT_LEAST_29` and
`PYDANTIC_VERSION_AT_LEAST_210` for more readable version checks.
[[1]](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L18-R20)
[[2]](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L92-R99)
[[3]](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L230-R233)
[[4]](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L652-R655)
2025-04-04 13:42:30 -04:00
Christophe Bornet
43b5dc7191
core: Add ruff rules TD and FIX (#30654)
Add ruff rules:
* FIX: https://docs.astral.sh/ruff/rules/#flake8-fixme-fix
* TD: https://docs.astral.sh/ruff/rules/#flake8-todos-td

Code cleanup:

*
[`libs/core/langchain_core/outputs/chat_generation.py`](diffhunk://#diff-a1017ee46f58fa4005b110ffd4f8e1fb08f6a2a11d6ca4c78ff8be641cbb89e5L56-R56):
Removed the "HACK" prefix from a comment in the `set_text` method.

Configuration adjustments:

*
[`libs/core/pyproject.toml`](diffhunk://#diff-06baaee12b22a370fef9f170c9ed13e2727e377d3b32f5018430f4f0a39d3537R85-R93):
Added new rules `FIX002`, `TD002`, and `TD003` to the ignore list.
*
[`libs/core/pyproject.toml`](diffhunk://#diff-06baaee12b22a370fef9f170c9ed13e2727e377d3b32f5018430f4f0a39d3537L102-L108):
Removed the `FIX` and `TD` rules from the ignore list.

Test refinement:

*
[`libs/core/tests/unit_tests/runnables/test_runnable.py`](diffhunk://#diff-06bed920c0dad0cfd41d57a8d9e47a7b56832409649c10151061a791860d5bb5L3231-R3232):
Updated a TODO comment to improve clarity in the `test_map_stream`
function.
2025-04-04 13:40:42 -04:00
ccurme
a007c57285
docs: update package registry sort order (#30677) 2025-04-04 13:12:39 -04:00
Sydney Runkle
33ed7c31da
docs: fix perplexity install instructions in ChatPerplexity docstring (#30676)
* `openai` install no longer needs to be done manually
2025-04-04 12:58:18 -04:00
Dhruvajyoti Sarma
f9bb5ec5d0
feature: removed pandas dataframe dependency for similary_search when using DuckDB as vector store (#30445)
- [ ] **PR title**: "community: Removes pandas dependency for using
DuckDB for similarity search"


- [ ] **PR message**: 
- **Description:** Removes pandas dependency for using DuckDB for
similarity search. The old function still exists as
`similarity_search_pd`, while the new one is at `similarity_search` and
requires no code changes. Return format remains the same.
    - **Issue:** Issue #29933 and update on PR #30435 
    - **Dependencies:** No dependencies
2025-04-04 12:19:18 -04:00
Akshay Dongare
f79473b752
Solved issue Implement langchain-litellm #30368 (#30637)
**PR title**: 
- [x] 1. docs: docs/docs/integrations/providers/LiteLLM.md
- [x] 2. docs: docs/docs/integrations/chat/litellm.ipynb
- [x] 3. libs: libs/packages.yml

- [x] **PR message**:
    - **Description:** Implement langchain-litellm
    - **Issue:** the issue #30368 
    - **Twitter handle:** akshay_d02
    - **LinkedIn Handle** https://linkedin.com/in/akshay-dongare 

- [x] **Add tests and docs**: Done

- [x] **Lint and test**: Done

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-04-04 16:12:10 +00:00
Yiğit Bekir Kaya, PhD
87e82fe1e8
Added langchain-qwq package documentation (Alibaba Cloud) (#30628)
LangChain QwQ allows non-Tongyi users to access thinking models with
extra capabilities which serve as an extension to Alibaba Cloud.

Hi @ccurme I'm back with the updated PR this time with documentation and
a finished package.

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"



- **Description:** adds documentation of `langchain-qwq` integration
package. Also adds it to Alibaba Cloud provider
- **Issue:** #30580 #30317 #30579
- **Dependencies:** openai, json-repair
- **Twitter handle:** YigitBekir


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
2025-04-04 11:47:14 -04:00
Andrew Benton
4e7a9a7014
community: Add support for custom runtimes to Riza tools (#30664)
**Description:**
Adds support for Riza custom runtimes to the two Riza code interpreter
tools, allowing users to run LLM-generated code that depends on
libraries outside stdlib.
**Issue:** N/A
**Dependencies:** None
**Twitter handle:** @rizaio
2025-04-04 11:03:14 -04:00
diego dupin
aa37893c00
MariaDB vector store documentation addition (#30229)
### New Feature

Since version 11.7.1, MariaDB support vector. This is a super fast
implementation (see [some perf
blog](https://smalldatum.blogspot.com/2025/01/evaluating-vector-indexes-in-mariadb.html)
The goal is to support MariaDB with langchain

Implementation is done in
https://github.com/mariadb-corporation/langchain-mariadb, published in
https://pypi.org/project/langchain-mariadb/

This concerns the doc addition
 

(initial PR https://github.com/langchain-ai/langchain/pull/29989)

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Oskar Stark <oskarstark@googlemail.com>
2025-04-04 14:56:25 +00:00
Sydney Runkle
1cdea6ab07
langchain-community: release 0.3.21 (#30673) 2025-04-04 14:14:50 +00:00
Sydney Runkle
901dffe06b
langchain: release 0.3.23 (#30670)
* Bump `text-splitters` min version
* Bump `langchain-core` min version
* Bump `langchain` version 🚀
2025-04-04 10:06:29 -04:00
ccurme
0c2c8c36c1
text-splitters: release 0.3.8 (#30671) 2025-04-04 09:58:45 -04:00
ccurme
59d508a2ee
openai[patch]: make computer test more reliable (#30672) 2025-04-04 13:53:59 +00:00
Sydney Runkle
c235328b39 Revert "update langchain version and bump min core v"
This reverts commit d0f154dbaa.
2025-04-04 09:31:51 -04:00
Sydney Runkle
d0f154dbaa update langchain version and bump min core v 2025-04-04 09:27:49 -04:00
Sydney Runkle
32cd70d7d2
release: bump core to v0.3.51 (#30668) 2025-04-04 13:23:09 +00:00
Max Forsey
18cf457eec
langchain-runpod integration (#30648)
## Description:

This PR adds the necessary documentation for the `langchain-runpod`
partner package integration. It includes:

* A provider page (`docs/docs/integrations/providers/runpod.ipynb`)
explaining the overall setup.
* An LLM component page (`docs/docs/integrations/llms/runpod.ipynb`)
detailing the `RunPod` class usage.
* A Chat Model component page
(`docs/docs/integrations/chat/runpod.ipynb`) detailing the `ChatRunPod`
class usage, including a feature support table.

These documentation files reflect the latest features of the
`langchain-runpod` package (v0.2.0+) such as async support and API
polling logic.

This work also addresses the review feedback provided on the previous
attempt in PR #30246 by:
*   Removing all TODOs from documentation.
*   Adding the required links between provider and component pages.
*   Completing the feature support table in the chat documentation.
*   Linking to the source code on GitHub for API reference.

Finally, it registers the `langchain-runpod` package in
`libs/packages.yml`.

## Dependencies:

None added to the core LangChain repository by these documentation
changes. The required dependency (`langchain-runpod`) is managed as a
separate package.

## Twitter handle:

@runpod_io

---------

Co-authored-by: Max Forsey <maxpod@maxpod.local>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-04-03 23:57:06 +00:00
Sydney Runkle
af66ab098e
Adding Perplexity extra and deprecating the community version of ChatPerplexity (#30649)
Plus, some accompanying docs updates

Some compelling usage:

```py
from langchain_perplexity import ChatPerplexity


chat = ChatPerplexity(model="llama-3.1-sonar-small-128k-online")
response = chat.invoke(
    "What were the most significant newsworthy events that occurred in the US recently?",
    extra_body={"search_recency_filter": "week"},
)
print(response.content)
# > Here are the top significant newsworthy events in the US recently: ...
```

Also, some confirmation of structured outputs:

```py
from langchain_perplexity import ChatPerplexity
from pydantic import BaseModel


class AnswerFormat(BaseModel):
    first_name: str
    last_name: str
    year_of_birth: int
    num_seasons_in_nba: int


messages = [
    {"role": "system", "content": "Be precise and concise."},
    {
        "role": "user",
        "content": (
            "Tell me about Michael Jordan. "
            "Please output a JSON object containing the following fields: "
            "first_name, last_name, year_of_birth, num_seasons_in_nba. "
        ),
    },
]

llm = ChatPerplexity(model="llama-3.1-sonar-small-128k-online")
structured_llm = llm.with_structured_output(AnswerFormat)
response = structured_llm.invoke(messages)
print(repr(response))
#> AnswerFormat(first_name='Michael', last_name='Jordan', year_of_birth=1963, num_seasons_in_nba=15)
```
2025-04-03 14:29:17 -04:00
ccurme
374769e8fe
core[patch]: log information from certain errors (#30626)
Some exceptions raised by SDKs include information in httpx responses
(see for example
[OpenAI](https://github.com/openai/openai-python/blob/main/src/openai/_exceptions.py)).
Here we trace information from those exceptions.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2025-04-03 16:45:19 +00:00
Sydney Runkle
17a9cd61e9
Bump langchain-core version in perplexity's pyproject.toml (#30647)
Blocking v0.1.0 release of `langchain-perplexity`
2025-04-03 16:19:10 +00:00
Sydney Runkle
3814bd1ea7
partners: Add Perplexity Chat Integration (#30618)
Perplexity's importance in the space has been growing, so we think it's
time to add an official integration!

Note: following the release of `langchain-perplexity` to `pypi`, we
should be able to add `perplexity` as an extra in
`libs/langchain/pyproject.toml`, but we're blocked by a circular import
for now.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-04-03 16:09:14 +00:00
Alejandro Rodríguez
884125e129
community: support usage_metadata for litellm (#30625)
Support "usage_metadata" for LiteLLM. 

If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
2025-04-02 19:45:15 -04:00
Christophe Bornet
f241fd5c11
core: Add ruff rules RET (#29384)
See https://docs.astral.sh/ruff/rules/#flake8-return-ret
All auto-fixes
2025-04-02 16:59:56 -04:00
Eugene Yurtsev
9ae792f56c
core: 0.3.50 release (#30623)
0.3.50 release
2025-04-02 14:46:23 -04:00
Christophe Bornet
ccc3d32ec8
core: Add ruff rules for Pylint PLC (Convention) and PLE (Errors) (#29286)
See https://docs.astral.sh/ruff/rules/#pylint-pl
2025-04-02 10:58:03 -04:00
ccurme
fe0fd9dd70
openai[patch]: upgrade tiktoken and fix test (#30621)
Related to https://github.com/langchain-ai/langchain/issues/30344

https://github.com/langchain-ai/langchain/pull/30542 introduced an
erroneous test for token counts for o-series models. tiktoken==0.8 does
not support o-series models in
`tiktoken.encoding_for_model(model_name)`, and this is the version of
tiktoken we had in the lock file. So we would default to `cl100k_base`
for o-series, which is the wrong encoding model. The test tested against
this wrong encoding (so it passed with tiktoken 0.8).

Here we update tiktoken to 0.9 in the lock file, and fix the expected
counts in the test. Verified that we are pulling
[o200k_base](https://github.com/openai/tiktoken/blob/main/tiktoken/model.py#L8),
as expected.
2025-04-02 10:44:48 -04:00
oxy-tg
38807871ec
docs: Add Oxylabs integration (#30591)
Description:
This PR adds documentation for the langchain-oxylabs integration
package.

The documentation includes instructions for configuring Oxylabs
credentials and provides example code demonstrating how to use the
package.

Issue:
N/A

Dependencies:
No new dependencies are required.

Tests and Docs:

Added an example notebook demonstrating the usage of the
Langchain-Oxylabs package, located in docs/docs/integrations.
Added a provider page in docs/docs/providers.
Added a new package to libs/packages.yml.

Lint and Test:

Successfully ran make format, make lint, and make test.
2025-04-02 14:40:32 +00:00
ccurme
816492e1d3
openai: release 0.3.12 (#30616) 2025-04-02 13:20:15 +00:00
Bagatur
111dd90a46
openai[patch]: support structured output and tools (#30581)
Co-authored-by: ccurme <chester.curme@gmail.com>
2025-04-02 09:14:02 -04:00
Mahir Shah
9d3262c7aa
core: Propagate config_factories in RunnableBinding (#30603)
- **Description:** Propagates config_factories when calling decoration
methods for RunnableBinding--e.g. bind, with_config, with_types,
with_retry, and with_listeners. This ensures that configs attached to
the original RunnableBinding are kept when creating the new
RunnableBinding and the configs are merged during invocation. Picks up
where #30551 left off.
  - **Issue:** #30531

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-04-01 18:03:58 -04:00
ccurme
8a69de5c24
openai[patch]: ignore file blocks when counting tokens (#30601)
OpenAI does not appear to document how it transforms PDF pages to
images, which determines how tokens are counted:
https://platform.openai.com/docs/guides/pdf-files?api-mode=chat#usage-considerations

Currently these block types raise ValueError inside
`get_num_tokens_from_messages`. Here we update to generate a warning and
continue.
2025-04-01 15:29:33 -04:00
Christophe Bornet
558191198f
core: Add ruff rule FBT003 (boolean-trap) (#29424)
See
https://docs.astral.sh/ruff/rules/boolean-positional-value-in-call/#boolean-positional-value-in-call-fbt003
This PR also fixes some FBT001/002 in private methods but does not
enforce these rules globally atm.

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-04-01 17:40:12 +00:00
Christophe Bornet
4f8ea13cea
core: Add ruff rules PERF (#29375)
See https://docs.astral.sh/ruff/rules/#perflint-perf
2025-04-01 13:34:56 -04:00
Christophe Bornet
8a33402016
core: Add ruff rules PT (pytest) (#29381)
See https://docs.astral.sh/ruff/rules/#flake8-pytest-style-pt
2025-04-01 13:31:07 -04:00
Christophe Bornet
768e4f695a
core: Add ruff rules S110 and S112 (#30599) 2025-04-01 13:17:22 -04:00
Christophe Bornet
88b4233fa1
core: Add ruff rules D (docstring) (#29406)
This ensures that the code is properly documented:
https://docs.astral.sh/ruff/rules/#pydocstyle-d

Related to #21983
2025-04-01 13:15:45 -04:00
Andras L Ferenczi
64df60e690
community[minor]: Add custom sitemap URL parameter to GitbookLoader (#30549)
## Description
This PR adds a new `sitemap_url` parameter to the `GitbookLoader` class
that allows users to specify a custom sitemap URL when loading content
from a GitBook site. This is particularly useful for GitBook sites that
use non-standard sitemap file names like `sitemap-pages.xml` instead of
the default `sitemap.xml`.
The standard `GitbookLoader` assumes that the sitemap is located at
`/sitemap.xml`, but some GitBook instances (including GitBook's own
documentation) use different paths for their sitemaps. This parameter
makes the loader more flexible and helps users extract content from a
wider range of GitBook sites.
## Issue
Fixes bug
[30473](https://github.com/langchain-ai/langchain/issues/30473) where
the `GitbookLoader` would fail to find pages on GitBook sites that use
custom sitemap URLs.
## Dependencies
No new dependencies required.
*I've added*:
* Unit tests to verify the parameter works correctly
* Integration tests to confirm the parameter is properly used with real
GitBook sites
* Updated docstrings with parameter documentation
The changes are fully backward compatible, as the parameter is optional
with a sensible default.

---------

Co-authored-by: andrasfe <andrasf94@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2025-04-01 16:17:21 +00:00
Christophe Bornet
fdda1aaea1
core: Accept ALL ruff rules with exclusions (#30595)
This pull request updates the `pyproject.toml` configuration file to
modify the linting rules and ignored warnings for the project. The most
important changes include switching to a more comprehensive selection of
linting rules and updating the list of ignored rules to better align
with the project's requirements.

Linting rules update:

* Changed the `select` option to include all available linting rules by
setting it to `["ALL"]`.

Ignored rules update:

* Updated the `ignore` option to include specific rules that interfere
with the formatter, are incompatible with Pydantic, or are temporarily
excluded due to project constraints.
2025-04-01 11:17:51 -04:00
Kacper Włodarczyk
26a3256fc6
community[major]: DynamoDBChatMessageHistory bulk add messages, raise errors (#30572)
This PR addresses two key issues:

- **Prevent history errors from failing silently**: Previously, errors
in message history were only logged and not raised, which can lead to
inconsistent state and downstream failures (e.g., ValidationError from
Bedrock due to malformed message history). This change ensures that such
errors are raised explicitly, making them easier to detect and debug.
(Side note: I’m using AWS Lambda Powertools Logger but hadn’t configured
it properly with the standard Python logger—my bad. If the error had
been raised, I would’ve seen it in the logs 😄) This is a **BREAKING
CHANGE**

- **Add messages in bulk instead of iteratively**: This introduces a
custom add_messages method to add all messages at once. The previous
approach failed silently when individual messages were too large,
resulting in partial history updates and inconsistent state. With this
change, either all messages are added successfully, or none are—helping
avoid obscure history-related errors from Bedrock.

---------

Co-authored-by: Kacper Wlodarczyk <kacper.wlodarczyk@chaosgears.com>
2025-04-01 11:13:32 -04:00
Armaanjeet Singh Sandhu
4bbc249b13
community: Fix attribute access for transcript text in YoutubeLoader (Fixes #30309) (#30582)
**Description:** 
Fixes a bug in the YoutubeLoader where FetchedTranscript objects were
not properly processed. The loader was only extracting the 'text'
attribute from FetchedTranscriptSnippet objects while ignoring 'start'
and 'duration' attributes. This would cause a TypeError when the code
later tried to access these missing keys, particularly when using the
CHUNKS format or any code path that needed timestamp information.

This PR modifies the conversion of FetchedTranscriptSnippet objects to
include all necessary attributes, ensuring that the loader works
correctly with all transcript formats.

**Issue:** Fixes #30309

**Dependencies:** None

**Testing:**
- Tested the fix with multiple YouTube videos to confirm it resolves the
issue
- Verified that both regular loading and CHUNKS format work correctly
2025-04-01 07:13:06 -04:00
Ivan Brko
ecff055096
community[minor]: Improve Brave Search Tool, allow api key in env var (#30364)
- **Description:** 

- Make Brave Search Tool consistent with other tools and allow reading
its api key from `BRAVE_SEARCH_API_KEY` instead of having to pass the
api key manually (no breaking changes)
- Improve Brave Search Tool by storing api key in `SecretStr` instead of
plain `str`.
    - Add unit test for `BraveSearchWrapper`
    - Reflect the changes in the documentation
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** ivan_brko
2025-03-31 14:48:52 -04:00
ccurme
0c623045b5
core[patch]: pydantic 2.11 compat (#30554)
Release notes: https://pydantic.dev/articles/pydantic-v2-11-release

Covered here:

- We no longer access `model_fields` on class instances (that is now
deprecated);
- Update schema normalization for Pydantic version testing to reflect
changes to generated JSON schema (addition of `"additionalProperties":
True` for dict types with value Any or object).

## Considerations:

### Changes to JSON schema generation

#### Tool-calling / structured outputs

This may impact tool-calling + structured outputs for some providers,
but schema generation only changes if you have parameters of the form
`dict`, `dict[str, Any]`, `dict[str, object]`, etc. If dict parameters
are typed my understanding is there are no changes.

For OpenAI for example, untyped dicts work for structured outputs with
default settings before and after updating Pydantic, and error both
before/after if `strict=True`.

### Use of `model_fields`

There is one spot where we previously accessed `super(cls,
self).model_fields`, where `cls` is an object in the MRO. This was done
for the purpose of tracking aliases in secrets. I've updated this to
always be `type(self).model_fields`-- see comment in-line for detail.

---------

Co-authored-by: Sydney Runkle <54324534+sydney-runkle@users.noreply.github.com>
2025-03-31 14:22:57 -04:00
keshavshrikant
e8be3cca5c
fix huggingface tokenizer default length function (#30185)
#30184
2025-03-31 11:54:30 -04:00
Wenqi Li
64f97e707e
ollama[patch]: Support seed param for OllamaLLM (#30553)
**Description:** a description of the change
add the seed param for OllamaLLM client reproducibility

**Issue:** the issue # it fixes, if applicable
follow up of a similar issue
https://github.com/langchain-ai/langchain/issues/24703
see also https://github.com/langchain-ai/langchain/pull/24782

**Dependencies:** any dependencies required for this change
n/a
2025-03-31 11:28:49 -04:00
Christophe Bornet
8395abbb42
core: Fix test_stream_error_callback (#30228)
Fixes #29436

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-03-31 10:37:22 -04:00
Christophe Bornet
026de908eb
core: Add ruff rules G, FA, INP, AIR and ISC (#29334)
Fixes mostly for rules G. See
https://docs.astral.sh/ruff/rules/#flake8-logging-format-g
2025-03-31 10:05:23 -04:00
ccurme
b4fe1f1ec0
groq: release 0.3.2 (#30570) 2025-03-31 13:29:45 +00:00
ccurme
9c682af8f3
langchain: release 0.3.22 (#30557)
Closes https://github.com/langchain-ai/langchain/issues/30536
2025-03-30 14:48:22 -04:00
William FH
b075eab3e0
Include delayed inputs in langchain tracer (#30546) 2025-03-28 16:07:22 -07:00
Thommy257
372dc7f991
core[patch]: fix loss of partially initialized variables during prompt composition (#30096)
**Description:**
This PR addresses the loss of partially initialised variables when
composing different prompts. I.e. it allows the following snippet to
run:

```python
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([('system', 'Prompt {x} {y}')]).partial(x='1')
appendix = ChatPromptTemplate.from_messages([('system', 'Appendix {z}')])

(prompt + appendix).invoke({'y': '2', 'z': '3'})
```

Previously, this would have raised a `KeyError`, stating that variable
`x` remains undefined.

**Issue**
References issue #30049

**Todo**
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-03-28 20:41:57 +00:00
Koshik Debanath
e7883d5b9f
langchain-openai: Support token counting for o-series models in ChatOpenAI (#30542)
Related to #30344

Add support for token counting for o-series models in
`test_token_counts.py`.

* **Update `_MODELS` and `_CHAT_MODELS` dictionaries**
- Add "o1", "o3", and "gpt-4o" to `_MODELS` and `_CHAT_MODELS`
dictionaries.

* **Update token counts**
  - Add token counts for "o1", "o3", and "gpt-4o" models.

---

For more details, open the [Copilot Workspace
session](https://copilot-workspace.githubnext.com/langchain-ai/langchain/pull/30542?shareId=ab208bf7-80a3-4b8d-80c4-2287486fedae).
2025-03-28 16:02:09 -04:00
Eugene Yurtsev
d075ad21a0
core[patch]: specify default event loop scope in pyproject.toml (#30543)
Specify default event loop scope
2025-03-28 19:51:19 +00:00
Ahmed Tammaa
f23c3e2444
text-splitters[patch]: Refactor HTMLHeaderTextSplitter for Enhanced Maintainability and Readability (#29397)
Please see PR #27678 for context

## Overview

This pull request presents a refactor of the `HTMLHeaderTextSplitter`
class aimed at improving its maintainability and readability. The
primary enhancements include simplifying the internal structure by
consolidating multiple private helper functions into a single private
method, thereby reducing complexity and making the codebase easier to
understand and extend. Importantly, all existing functionalities and
public interfaces remain unchanged.

## PR Goals

1. **Simplify Internal Logic**:
- **Consolidation of Private Methods**: The original implementation
utilized multiple private helper functions (`_header_level`,
`_dom_depth`, `_get_elements`) to manage different aspects of HTML
parsing and document generation. This fragmentation increased cognitive
load and potential maintenance overhead.
- **Streamlined Processing**: By merging these functionalities into a
single private method (`_generate_documents`), the class now offers a
more straightforward flow, making it easier for developers to trace and
understand the processing steps. (Thanks to @eyurtsev)

2. **Enhance Readability**:
- **Clearer Method Responsibilities**: With fewer private methods, each
method now has a more focused responsibility. The primary logic resides
within `_generate_documents`, which handles both HTML traversal and
document creation in a cohesive manner.
- **Reduced Redundancy**: Eliminating redundant checks and consolidating
logic reduces the code's verbosity, making it more concise without
sacrificing clarity.

3. **Improve Maintainability**:
- **Easier Debugging and Extension**: A simplified internal structure
allows for quicker identification of issues and easier implementation of
future enhancements or feature additions.
- **Consistent Header Management**: The new implementation ensures that
headers are managed consistently within a single context, reducing the
likelihood of bugs related to header scope and hierarchy.

4. **Maintain Backward Compatibility**:
- **Unchanged Public Interface**: All public methods (`split_text`,
`split_text_from_url`, `split_text_from_file`) and their signatures
remain unchanged, ensuring that existing integrations and usage patterns
are unaffected.
- **Preserved Docstrings**: Comprehensive docstrings are retained,
providing clear documentation for users and developers alike.

## Detailed Changes

1. **Removed Redundant Private Methods**:
- **Eliminated `_header_level`, `_dom_depth`, and `_get_elements`**:
These methods were merged into the `_generate_documents` method,
centralizing the logic for HTML parsing and document generation.

2. **Consolidated Document Generation Logic**:
- **Single Private Method `_generate_documents`**: This method now
handles the entire process of parsing HTML, tracking active headers,
managing document chunks, and yielding `Document` instances. This
consolidation reduces the number of moving parts and simplifies the
overall processing flow.

3. **Simplified Header Management**:
- **Immediate Header Scope Handling**: Headers are now managed within
the traversal loop of `_generate_documents`, ensuring that headers are
added or removed from the active headers dictionary in real-time based
on their DOM depth and hierarchy.
- **Removed `chunk_dom_depth` Attribute**: The need to track chunk DOM
depth separately has been eliminated, as header scopes are now directly
managed within the traversal logic.

4. **Streamlined Chunk Finalization**:
- **Enhanced `finalize_chunk` Function**: The chunk finalization process
has been simplified to directly yield a single `Document` when needed,
without maintaining an intermediate list. This change reduces
unnecessary list operations and makes the logic more straightforward.

5. **Improved Variable Naming and Flow**:
- **Descriptive Variable Names**: Variables such as `current_chunk` and
`node_text` provide clear insights into their roles within the
processing logic.
- **Direct Header Removal Logic**: Headers that are out of scope are
removed immediately during traversal, ensuring that the active headers
dictionary remains accurate and up-to-date.

6. **Preserved Comprehensive Docstrings**:
- **Unchanged Documentation**: All existing docstrings, including
class-level and method-level documentation, remain intact. This ensures
that users and developers continue to have access to detailed usage
instructions and method explanations.

## Testing

All existing test cases from `test_html_header_text_splitter.py` have
been executed against the refactored code. The results confirm that:

- **Functionality Remains Intact**: The splitter continues to accurately
parse HTML content, respect header hierarchies, and produce the expected
`Document` objects with correct metadata.
- **Backward Compatibility is Maintained**: No changes were required in
the test cases, and all tests pass without modifications, demonstrating
that the refactor does not introduce any regressions or alter existing
behaviors.


This example remains fully operational and behaves as before, returning
a list of `Document` objects with the expected metadata and content
splits.

## Conclusion

This refactor achieves a more maintainable and readable codebase by
simplifying the internal structure of the `HTMLHeaderTextSplitter`
class. By consolidating multiple private methods into a single, cohesive
private method, the class becomes easier to understand, debug, and
extend. All existing functionalities are preserved, and comprehensive
tests confirm that the refactor maintains the expected behavior. These
changes align with LangChain’s standards for clean, maintainable, and
efficient code.

---

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-03-28 15:36:00 -04:00
omahs
6f8735592b
docs,langchain-community: Fix typos in docs and code (#30541)
Fix typos
2025-03-28 19:21:16 +00:00
Agus
47d50f49d9
docs: Add GOAT integration to docs (#30478)
This PR adds:
1. Docs for the GOAT integration 
2. An "Agentic Finance" table to the Tools page that includes GOAT

**Twitter handle**: @0xaguspunk
2025-03-28 15:19:37 -04:00
Shixian Sheng
94a7fd2497
docs: fix broken hyperlinks in fireworks integration package README (#30538)
Fix two broken hyperlinks
2025-03-28 15:18:44 -04:00
Oskar Stark
0d2cea747c
docs: streamline LangSmith teasing (#30302)
This can only be reviewed by [hiding
whitespaces](https://github.com/langchain-ai/langchain/pull/30302/files?diff=unified&w=1).

The motivation behind this PR is to get my hands on the docs and make
the LangSmith teasing short and clear.

Right now I don't know how to do it, but this could be an include in the
future.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-03-28 15:13:22 -04:00
Eugene Yurtsev
dd0faab07e fix types 2025-03-28 14:23:50 -04:00
Eugene Yurtsev
21ab1dc675 Merge branch 'master' of github.com:xzq-xu/langchain into xzq-xu/master 2025-03-28 13:56:49 -04:00
Eugene Yurtsev
22cee5d983 x 2025-03-28 13:56:10 -04:00
Eugene Yurtsev
a14d8b103b
Merge branch 'master' into master 2025-03-28 13:53:58 -04:00
Eugene Yurtsev
6d22f40a0b x 2025-03-28 13:51:06 -04:00
Philippe PRADOS
92189c8b31
community[patch]: Handle gray scale images in ImageBlobParser (Fixes 30261 and 29586) (#30493)
Fix [29586](https://github.com/langchain-ai/langchain/issues/29586) and
[30261](https://github.com/langchain-ai/langchain/pull/30261)
2025-03-28 10:15:40 -04:00
小豆豆学长
1f0686db80
community: add netmind integration (#30149)
Co-authored-by: yanrujing <rujing.yan@protagonist-ai.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2025-03-27 15:27:04 -04:00
Kyungho Byoun
e6b6c07395
community: add HANA dialect to SQLDatabase (#30475)
This PR includes support for HANA dialect in SQLDatabase, which is a
wrapper class for SQLAlchemy.

Currently, it is unable to set schema name when using HANA DB with
Langchain. And, it does not show any message to user so that it makes
hard for user to figure out why the SQL does not work as expected.

Here is the reference document for HANA DB to set schema for the
session.

- [SET SCHEMA Statement (Session
Management)](https://help.sap.com/docs/SAP_HANA_PLATFORM/4fe29514fd584807ac9f2a04f6754767/20fd550375191014b886a338afb4cd5f.html)
2025-03-27 15:19:50 -04:00
Christophe Bornet
e181d43214
core: Bump ruff version to 0.11 (#30519)
Changes are from the new TC006 rule:
https://docs.astral.sh/ruff/rules/runtime-cast-value/
TC006 is auto-fixed.
2025-03-27 13:01:49 -04:00
ccurme
59908f04d4
fireworks: release 0.2.9 (#30527) 2025-03-27 16:04:20 +00:00
ccurme
05482877be
mistralai: release 0.2.10 (#30526) 2025-03-27 16:01:40 +00:00
Andras L Ferenczi
63673b765b
Fix: Enable max_retries Parameter in ChatMistralAI Class (#30448)
**partners: Enable max_retries in ChatMistralAI**

**Description**

- This pull request reactivates the retry logic in the
completion_with_retry method of the ChatMistralAI class, restoring the
intended functionality of the previously ineffective max_retries
parameter. New unit test that mocks failed/successful retry calls and an
integration test to confirm end-to-end functionality.

**Issue**
- Closes #30362

**Dependencies**
- No additional dependencies required

Co-authored-by: andrasfe <andrasf94@gmail.com>
2025-03-27 11:53:44 -04:00
Keiichi Hirobe
956b09f468
core[patch]: stop deleting records with "scoped_full" when doc is empty (#30520)
Fix a bug that causes `scoped_full` in index to delete records when there are no input docs.
2025-03-27 11:04:34 -04:00
Christophe Bornet
b28a474e79
core[patch]: Add ruff rules for PLW (Pylint Warnings) (#29288)
See https://docs.astral.sh/ruff/rules/#warning-w_1

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-03-27 10:26:12 +00:00
xzq.xu
92dc3f7341 format test lint passed 2025-03-27 13:44:59 +08:00
xzq.xu
d0a9808148 modify test name 2025-03-27 13:34:51 +08:00
xzq.xu
ed2428f902 add a unit test 2025-03-27 12:43:16 +08:00
David Sánchez Sánchez
75823d580b
community: fix perplexity response parameters not being included in model response (#30440)
This pull request includes enhancements to the `perplexity.py` file in
the `chat_models` module, focusing on improving the handling of
additional keyword arguments (`additional_kwargs`) in message processing
methods. Additionally, new unit tests have been added to ensure the
correct inclusion of citations, images, and related questions in the
`additional_kwargs`.

Issue: resolves https://github.com/langchain-ai/langchain/issues/30439

Enhancements to `perplexity.py`:

*
[`libs/community/langchain_community/chat_models/perplexity.py`](diffhunk://#diff-d3e4d7b277608683913b53dcfdbd006f0f4a94d110d8b9ac7acf855f1f22207fL208-L212):
Modified the `_convert_delta_to_message_chunk`, `_stream`, and
`_generate` methods to handle `additional_kwargs`, which include
citations, images, and related questions.
[[1]](diffhunk://#diff-d3e4d7b277608683913b53dcfdbd006f0f4a94d110d8b9ac7acf855f1f22207fL208-L212)
[[2]](diffhunk://#diff-d3e4d7b277608683913b53dcfdbd006f0f4a94d110d8b9ac7acf855f1f22207fL277-L286)
[[3]](diffhunk://#diff-d3e4d7b277608683913b53dcfdbd006f0f4a94d110d8b9ac7acf855f1f22207fR324-R331)

New unit tests:

*
[`libs/community/tests/unit_tests/chat_models/test_perplexity.py`](diffhunk://#diff-dab956d79bd7d17a0f5dea3f38ceab0d583b43b63eb1b29138ee9b6b271ba1d9R119-R275):
Added new tests `test_perplexity_stream_includes_citations_and_images`
and `test_perplexity_stream_includes_citations_and_related_questions` to
verify that the `stream` method correctly includes citations, images,
and related questions in the `additional_kwargs`.
2025-03-26 22:28:08 -04:00
Adeel Ehsan
d7d0bca2bc
docs: add vectara to libs package yml (#30504) 2025-03-26 16:47:53 -04:00
ccurme
a9b1e1b177
openai: release 0.3.11 (#30503) 2025-03-26 19:24:37 +00:00
ccurme
8119a7bc5c
openai[patch]: support streaming token counts in AzureChatOpenAI (#30494)
When OpenAI originally released `stream_options` to enable token usage
during streaming, it was not supported in AzureOpenAI. It is now
supported.

Like the [OpenAI
SDK](f66d2e6fdc/src/openai/resources/completions.py (L68)),
ChatOpenAI does not return usage metadata during streaming by default
(which adds an extra chunk to the stream). The OpenAI SDK requires users
to pass `stream_options={"include_usage": True}`. ChatOpenAI implements
a convenience argument `stream_usage: Optional[bool]`, and an attribute
`stream_usage: bool = False`.

Here we extend this to AzureChatOpenAI by moving the `stream_usage`
attribute and `stream_usage` kwarg (on `_(a)stream`) from ChatOpenAI to
BaseChatOpenAI.

---

Additional consideration: we must be sensitive to the number of users
using BaseChatOpenAI to interact with other APIs that do not support the
`stream_options` parameter.

Suppose OpenAI in the future updates the default behavior to stream
token usage. Currently, BaseChatOpenAI only passes `stream_options` if
`stream_usage` is True, so there would be no way to disable this new
default behavior.

To address this, we could update the `stream_usage` attribute to
`Optional[bool] = None`, but this is technically a breaking change (as
currently values of False are not passed to the client). IMO: if / when
this change happens, we could accompany it with this update in a minor
bump.

--- 

Related previous PRs:
- https://github.com/langchain-ai/langchain/pull/22628
- https://github.com/langchain-ai/langchain/pull/22854
- https://github.com/langchain-ai/langchain/pull/23552

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-03-26 15:16:37 -04:00
ccurme
f68eaab44f
tests: release 0.3.17 (#30502) 2025-03-26 18:56:54 +00:00
Louis Auneau
0b532a4ed0
community: Azure Document Intelligence parser features not available fixed (#30370)
Thank you for contributing to LangChain!

- **Description:** Azure Document Intelligence OCR solution has a
*feature* parameter that enables some features such as high-resolution
document analysis, key-value pairs extraction, ... In langchain parser,
you could be provided as a `analysis_feature` parameter to the
constructor that was passed on the `DocumentIntelligenceClient`.
However, according to the `DocumentIntelligenceClient` [API
Reference](https://learn.microsoft.com/en-us/python/api/azure-ai-documentintelligence/azure.ai.documentintelligence.documentintelligenceclient?view=azure-python),
this is not a valid constructor parameter. It was therefore remove and
instead stored as a parser property that is used in the
`begin_analyze_document`'s `features` parameter (see [API
Reference](https://learn.microsoft.com/en-us/python/api/azure-ai-formrecognizer/azure.ai.formrecognizer.documentanalysisclient?view=azure-python#azure-ai-formrecognizer-documentanalysisclient-begin-analyze-document)).
I also removed the check for "Supported features" since all features are
supported out-of-the-box. Also I did not check if the provided `str`
actually corresponds to the Azure package enumeration of features, since
the `ValueError` when creating the enumeration object is pretty
explicit.
Last caveat, is that some features are not supported for some kind of
documents. This is documented inside Microsoft documentation and
exception are also explicit.
- **Issue:** N/A
- **Dependencies:** No
- **Twitter handle:** @Louis___A

---------

Co-authored-by: Louis Auneau <louis@handshakehealth.co>
2025-03-26 14:40:14 -04:00
Philippe PRADOS
8e5d2a44ce
community[patch]: update PyPDFParser to take into account filters returned as arrays (#30489)
The image parsing is generating a bug as the the extracted objects for
the /Filter returns sometimes an array, sometimes a string.

Fix [Issue
30098](https://github.com/langchain-ai/langchain/issues/30098)
2025-03-26 14:16:54 -04:00
ccurme
422ba4cde5
infra: handle flaky tests (#30501) 2025-03-26 13:28:56 -04:00
ccurme
9a80be7bb7
core[patch]: release 0.3.49 (#30500) 2025-03-26 13:26:32 -04:00
ccurme
299b222c53
mistral[patch]: check types in adding model_name to response_metadata (#30499) 2025-03-26 16:30:09 +00:00
ccurme
22d1a7d7b6
standard-tests[patch]: require model_name in response_metadata if returns_usage_metadata (#30497)
We are implementing a token-counting callback handler in
`langchain-core` that is intended to work with all chat models
supporting usage metadata. The callback will aggregate usage metadata by
model. This requires responses to include the model name in its
metadata.

To support this, if a model `returns_usage_metadata`, we check that it
includes a string model name in its `response_metadata` in the
`"model_name"` key.

More context: https://github.com/langchain-ai/langchain/pull/30487
2025-03-26 12:20:53 -04:00
Ante Javor
20f82502e5
Community: Add Memgraph integration docs (#30457)
Thank you for contributing to LangChain!

**Description:** 
Since we just implemented
[langchain-memgraph](https://pypi.org/project/langchain-memgraph/)
integration, we are adding basic docs to [your site based on this
comment](https://github.com/langchain-ai/langchain/pull/30197#pullrequestreview-2671616410)
from @ccurme .
   
 **Twitter handle:**
 [@memgraphdb](https://x.com/memgraphdb)


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-26 11:58:09 -04:00
xzq.xu
913c8b71d9 format import 2025-03-26 23:34:38 +08:00
xzq.xu
7e3dea5db8 add a new-line 2025-03-26 23:32:07 +08:00
xzq.xu
d602141ab1 remove unused e 2025-03-26 23:10:41 +08:00
xzq.xu
dd9031fc82 _prep_run_args,tool_input copy, Exception 2025-03-26 23:06:43 +08:00
xzq.xu
3382b0d8ea _prep_run_args,tool_input copy 2025-03-26 22:56:32 +08:00
xzq.xu
65ecc22606 # Fix: Prevent run_manager from being added to state object 2025-03-26 22:36:31 +08:00
ccurme
7e62e3a137
core[patch]: store model names on usage callback handler (#30487)
So we avoid mingling tokens from different models.
2025-03-25 21:26:09 -04:00
ccurme
32827765bf
core[patch]: mark usage callback handler as beta (#30486) 2025-03-25 23:25:57 +00:00
Eugene Yurtsev
9f345d64fd
core[patch]: Remove old accidental commit (#30483)
Remove commented out file that was accidentally added

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-03-25 15:37:20 -07:00
ccurme
4b9e2e51f3
core[patch]: add token counting callback handler (#30481)
Stripped-down version of
[OpenAICallbackHandler](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/callbacks/openai_info.py)
that just tracks `AIMessage.usage_metadata`.

```python
from langchain_core.callbacks import get_usage_metadata_callback
from langgraph.prebuilt import create_react_agent

def get_weather(location: str) -> str:
    """Get the weather at a location."""
    return "It's sunny."

tools = [get_weather]
agent = create_react_agent("openai:gpt-4o-mini", tools)

with get_usage_metadata_callback() as cb:
    result = await agent.ainvoke({"messages": "What's the weather in Boston?"})
    print(cb.usage_metadata)
```
2025-03-25 18:16:39 -04:00
Eugene Yurtsev
0acca6b9c8
core[patch]: Fix handling of title when tool schema is specified manually via JSONSchema (#30479)
Fix issue: https://github.com/langchain-ai/langchain/issues/30456
2025-03-25 15:15:24 -04:00
Ben Chambers
c5e42a4027
community: deprecate graph vector store (#30328)
- **Description:** mark GraphVectorStore `@deprecated`

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-25 13:52:54 +00:00
Ian Muge
a8ce63903d
community: Add edge properties to the gremlin graph schema (#30449)
Description: Extend the gremlin graph schema to include the edge
properties, grouped by its triples; i.e: `inVLabel` and `outVLabel`.
This should give more context when crafting queries to run against a
gremlin graph db
2025-03-24 19:03:01 -04:00
ccurme
b60e6f6efa
community[patch]: update API ref for AmazonTextractPDFParser (#30468) 2025-03-24 23:02:52 +00:00
David Sánchez Sánchez
3ba0d28d8e
community: update perplexity docstring (#30451)
This pull request includes extensive documentation updates for the
`ChatPerplexity` class in the
`libs/community/langchain_community/chat_models/perplexity.py` file. The
changes provide detailed setup instructions, key initialization
arguments, and usage examples for various functionalities of the
`ChatPerplexity` class.

Documentation improvements:

* Added setup instructions for installing the `openai` package and
setting the `PPLX_API_KEY` environment variable.
* Documented key initialization arguments for completion parameters and
client parameters, including `model`, `temperature`, `max_tokens`,
`streaming`, `pplx_api_key`, `request_timeout`, and `max_retries`.
* Provided examples for instantiating the `ChatPerplexity` class,
invoking it with messages, using structured output, invoking with
perplexity-specific parameters, streaming responses, and accessing token
usage and response metadata.Thank you for contributing to LangChain!
2025-03-24 15:01:02 -04:00
Vadym Barda
97dec30eea
docs[patch]: update trim_messages doc (#30462) 2025-03-24 18:50:48 +00:00
ccurme
c2dd8d84ff
infra[patch]: remove pyspark from langchain-community extended testing requirements (#30466) 2025-03-24 14:41:54 -04:00
ccurme
aa30d2d57f
standard-tests: release 0.3.16 (#30464) 2025-03-24 18:35:12 +00:00
ccurme
b09e7c125c
cli: use pytest-watcher (#30465)
pytest-watch is no longer maintained.
2025-03-24 18:06:31 +00:00
ccurme
50ec4a1a4f
openai[patch]: attempt to make test less flaky (#30463) 2025-03-24 17:36:36 +00:00
ccurme
8486e0ae80
openai[patch]: bump openai sdk (#30461)
[New required
field](https://github.com/openai/openai-python/pull/2223/files#diff-530fd17eb1cc43440c82630df0ddd9b0893cf14b04065a95e6eef6cd2f766a44R26)
for `ResponseUsage` released in 1.66.5.
2025-03-24 12:10:00 -04:00
ccurme
cbbc968903
openai: release 0.3.10 (#30460) 2025-03-24 15:37:53 +00:00
ccurme
ed5e589191
openai[patch]: support multi-turn computer use (#30410)
Here we accept ToolMessages of the form
```python
ToolMessage(
    content=<representation of screenshot> (see below),
    tool_call_id="abc123",
    additional_kwargs={"type": "computer_call_output"},
)
```
and translate them to `computer_call_output` items for the Responses
API.

We also propagate `reasoning_content` items from AIMessages.

## Example

### Load screenshots
```python
import base64

def load_png_as_base64(file_path):
    with open(file_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read())
        return encoded_string.decode('utf-8')

screenshot_1_base64 = load_png_as_base64("/path/to/screenshot/of/application.png")
screenshot_2_base64 = load_png_as_base64("/path/to/screenshot/of/desktop.png")
```

### Initial message and response
```python
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="computer-use-preview",
    model_kwargs={"truncation": "auto"},
)

tool = {
    "type": "computer_use_preview",
    "display_width": 1024,
    "display_height": 768,
    "environment": "browser"
}
llm_with_tools = llm.bind_tools([tool])

input_message = HumanMessage(
    content=[
        {
            "type": "text",
            "text": (
                "Click the red X to close and reveal my Desktop. "
                "Proceed, no confirmation needed."
            )
        },
        {
            "type": "input_image",
            "image_url": f"data:image/png;base64,{screenshot_1_base64}",
        }
    ]
)

response = llm_with_tools.invoke(
    [input_message],
    reasoning={
        "generate_summary": "concise",
    },
)
response.additional_kwargs["tool_outputs"]
```

### Construct ToolMessage
```python
tool_call_id = response.additional_kwargs["tool_outputs"][0]["call_id"]

tool_message = ToolMessage(
    content=[
        {
            "type": "input_image",
            "image_url": f"data:image/png;base64,{screenshot_2_base64}"
        }
    ],
    #  content=f"data:image/png;base64,{screenshot_2_base64}",  # <-- also acceptable
    tool_call_id=tool_call_id,
    additional_kwargs={"type": "computer_call_output"},
)
```

### Invoke again
```python
messages = [
    input_message,
    response,
    tool_message,
]

response_2 = llm_with_tools.invoke(
    messages,
    reasoning={
        "generate_summary": "concise",
    },
)
```
2025-03-24 15:25:36 +00:00
Vadym Barda
7bc50730aa
core[patch]: release 0.3.48 (#30458) 2025-03-24 09:48:03 -04:00
Mohammad Mohtashim
33f1ab1528
Youtube Loader load method Fixed (#30314)
- **Description:** Fixed the `YoutubeLoader` loading method not
returning the correct object
- **Issue:** #30309

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-03-23 14:48:03 -04:00
Simon Paredes
df4448dfac
langchain-groq: Add response metadata when streaming (#30379)
- **Description:** Add missing `model_name` and `system_fingerprint`
metadata when streaming.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-23 14:34:41 -04:00
Changyong Um
e2d9fe766f
community[tool]: Integrate a tool for the naver_search (#30392)
Hello!
I have reopened a pull request for tool integration.
Please refer to the previous
[PR](https://github.com/langchain-ai/langchain/pull/30248).

I understand that for the tool integration, a separate package should be
created, and only the documentation should be added under docs/docs/. If
there are any other procedures, please let me know.


[langchain-naver-community](https://github.com/e7217/langchain-naver-community)

cc: @ccurme

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-23 14:05:24 -04:00
ccurme
d867afff1c
docs: update package table ordering (#30437)
Update download counts (only impacts ordering, counts in rendered page
are updated automatically).
2025-03-22 18:07:08 -04:00
Matthew Farrellee
e7032901c3
langchain-tests: allow test_serdes for packages outside the default valid namespaces (#30343)
**Description:**

a third party package not listed in the default valid namespaces cannot
pass test_serdes because the load() does not allow for extending the
valid_namespaces.

test_serdes will fail with -
ValueError: Invalid namespace: {'lc': 1, 'type': 'constructor', 'id':
['langchain_other', 'chat_models', 'ChatOther'], 'kwargs':
{'model_name': '...', 'api_key': '...'}, 'name': 'ChatOther'}

this change has test_serdes automatically extend valid_namespaces based
off the ChatModel under test's namespace.
2025-03-22 17:27:39 -04:00
Jiwon Kang
699475a01d
community: uuidv1 is unsafe (#30432)
this_row_id previously used UUID v1. However, since UUID v1 can be
predicted if the MAC address and timestamp are known, it poses a
potential security risk. Therefore, it has been changed to UUID v4.
2025-03-22 15:27:49 -04:00
Dhruvajyoti Sarma
31551dab40
feature: added warning when duckdb is used as a vectorstore without pandas (#30435)
added warning when duckdb is used as a vectorstore without pandas being
installed (currently used for similarity search result processing)

Thank you for contributing to LangChain!

- [ ] **PR title**: "community: added warning when duckdb is used as a
vectorstore without pandas"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** displays a warning when using duckdb as a vector
store without pandas being installed, as it is used by the
`similarity_search` function
    - **Issue:** #29933 
    - **Dependencies:** None

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-22 19:27:21 +00:00
Cesar Sanz
5383abfeee
Fix incorrect import path for AzureAIChatCompletionsModel (#30417)
Fixes #30416

Correct the import path for `AzureAIChatCompletionsModel` in the
`_init_chat_model_helper` function.

* Update the import statement in
`libs/langchain/langchain/chat_models/base.py` to `from
langchain_azure_ai.chat_models import AzureAIChatCompletionsModel`.

---

For more details, open the [Copilot Workspace
session](https://copilot-workspace.githubnext.com/langchain-ai/langchain/pull/30417?shareId=6ff6d5de-e3d1-4972-8d24-5e74838e9945).
2025-03-22 07:44:51 -04:00
Misakar
7750ad588b
community:ChatLiteLLM support output reasoning content (#30430) 2025-03-22 07:43:33 -04:00
Adrián Panella
b75573e858
core: add tool_call exclusion in filter_message (#30289)
Extend functionallity to allow to filter pairs of tool calls (ai +
tool).

---------

Co-authored-by: vbarda <vadym@langchain.dev>
2025-03-21 23:05:29 +00:00
Vadym Barda
673ec00030
docs[patch]: add warning to token counter docstring (#30426) 2025-03-21 18:59:40 -04:00
Adrián Panella
3933a4abc3
core(mermaid): allow greater customization (#29939)
Adds greater style customization by allowing a custom frontmatter
config. This allows to set a `theme` and `look` or to adjust theme by
setting `themeVariables`

Example:

```python

node_colors = NodeStyles(
    default="fill:#e2e2e2,line-height:1.2,stroke:#616161",
    first="fill:#cfeab8,fill-opacity:0",
    last="fill:#eac3b8",
)

frontmatter_config = {
    "config": {
        "theme": "neutral",
        "look": "handDrawn"
    }
}

graph.get_graph().draw_mermaid_png(node_colors=node_colors, frontmatter_config=frontmatter_config)
```


![image](https://github.com/user-attachments/assets/11b56d30-3be2-482f-8432-3ce704a09552)

---------

Co-authored-by: vbarda <vadym@langchain.dev>
2025-03-21 18:25:26 -04:00
Vadym Barda
07823cd41c
core[patch]: optimize trim_messages (#30327)
Refactored w/ Claude

Up to 20x speedup! (with theoretical max improvement of `O(n / log n)`)
2025-03-21 17:08:26 -04:00
ccurme
b78ae7817e
openai[patch]: trace strict in structured_output_kwargs (#30425) 2025-03-21 14:37:28 -04:00
ccurme
1de7fa8f3a
Revert "deepseek: temporarily bypass tests" (#30424)
Reverts langchain-ai/langchain#30423
2025-03-21 17:14:31 +00:00
ccurme
c74dfff836
deepseek: temporarily bypass tests (#30423)
Deepseek infra is not stable enough to get through integration tests.

Previous two attempts had two tests time out, they both pass locally.
2025-03-21 17:08:35 +00:00
ccurme
7147903724
deepseek: release 0.1.3 (#30422) 2025-03-21 16:39:50 +00:00
Andras L Ferenczi
b5f49df86a
partner: ChatDeepSeek on openrouter not returning reasoning (#30240)
Deepseek model does not return reasoning when hosted on openrouter
(Issue [30067](https://github.com/langchain-ai/langchain/issues/30067))

the following code did not return reasoning:

```python
llm = ChatDeepSeek( model = 'deepseek/deepseek-r1:nitro', api_base="https://openrouter.ai/api/v1", api_key=os.getenv("OPENROUTER_API_KEY")) 
messages = [
    {"role": "system", "content": "You are an assistant."},
    {"role": "user", "content": "9.11 and 9.8, which is greater? Explain the reasoning behind this decision."}
]
response = llm.invoke(messages, extra_body={"include_reasoning": True})
print(response.content)
print(f"REASONING: {response.additional_kwargs.get('reasoning_content', '')}")
print(response)
```

The fix is to extract reasoning from
response.choices[0].message["model_extra"] and from
choices[0].delta["reasoning"]. and place in response additional_kwargs.
Change is really just the addition of a couple one-sentence if
statements.

---------

Co-authored-by: andrasfe <andrasf94@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-21 16:35:37 +00:00
Vadym Barda
4852ab8d0a
core[patch]: more tests for trim_messages (#30421) 2025-03-21 16:19:52 +00:00
ccurme
e8e3b2bfae
ollama: release 0.3.0 (#30420) 2025-03-21 15:50:08 +00:00
Bob Merkus
5700646cc5
ollama: add reasoning model support (e.g. deepseek) (#29689)
# Description
This PR adds reasoning model support for `langchain-ollama` by
extracting reasoning token blocks, like those used in deepseek. It was
inspired by
[ollama-deep-researcher](https://github.com/langchain-ai/ollama-deep-researcher),
specifically the parsing of [thinking
blocks](6d1aaf2139/src/assistant/graph.py (L91)):
```python
  # TODO: This is a hack to remove the <think> tags w/ Deepseek models 
  # It appears very challenging to prompt them out of the responses 
  while "<think>" in running_summary and "</think>" in running_summary:
      start = running_summary.find("<think>")
      end = running_summary.find("</think>") + len("</think>")
      running_summary = running_summary[:start] + running_summary[end:]
```

This notes that it is very hard to remove the reasoning block from
prompting, but we actually want the model to reason in order to increase
model performance. This implementation extracts the thinking block, so
the client can still expect a proper message to be returned by
`ChatOllama` (and use the reasoning content separately when desired).

This implementation takes the same approach as
[ChatDeepseek](5d581ba22c/libs/partners/deepseek/langchain_deepseek/chat_models.py (L215)),
which adds the reasoning content to
chunk.additional_kwargs.reasoning_content;
```python
  if hasattr(response.choices[0].message, "reasoning_content"):  # type: ignore
      rtn.generations[0].message.additional_kwargs["reasoning_content"] = (
          response.choices[0].message.reasoning_content  # type: ignore
      )
```

This should probably be handled upstream in ollama + ollama-python, but
this seems like a reasonably effective solution. This is a standalone
example of what is happening;

```python
async def deepseek_message_astream(
    llm: BaseChatModel,
    messages: list[BaseMessage],
    config: RunnableConfig | None = None,
    *,
    model_target: str = "deepseek-r1",
    **kwargs: Any,
) -> AsyncIterator[BaseMessageChunk]:
    """Stream responses from Deepseek models, filtering out <think> tags.

    Args:
        llm: The language model to stream from
        messages: The messages to send to the model

    Yields:
        Filtered chunks from the model response
    """
    # check if the model is deepseek based
    if (llm.name and model_target not in llm.name) or (hasattr(llm, "model") and model_target not in llm.model):
        async for chunk in llm.astream(messages, config=config, **kwargs):
            yield chunk
        return

    # Yield with a buffer, upon completing the <think></think> tags, move them to the reasoning content and start over
    buffer = ""
    async for chunk in llm.astream(messages, config=config, **kwargs):
        # start or append
        if not buffer:
            buffer = chunk.content
        else:
            buffer += chunk.content if hasattr(chunk, "content") else chunk

        # Process buffer to remove <think> tags
        if "<think>" in buffer or "</think>" in buffer:
            if hasattr(chunk, "tool_calls") and chunk.tool_calls:
                raise NotImplementedError("tool calls during reasoning should be removed?")
            if "<think>" in chunk.content or "</think>" in chunk.content:
                continue
            chunk.additional_kwargs["reasoning_content"] = chunk.content
            chunk.content = ""
        # upon block completion, reset the buffer
        if "<think>" in buffer and "</think>" in buffer:
            buffer = ""
        yield chunk

```

# Issue
Integrating reasoning models (e.g. deepseek-r1) into existing LangChain
based workflows is hard due to the thinking blocks that are included in
the message contents. To avoid this, we could match the `ChatOllama`
integration with `ChatDeepseek` to return the reasoning content inside
`message.additional_arguments.reasoning_content` instead.

# Dependenices
None

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-21 15:44:54 +00:00
ccurme
d8145dda95
xai: release 0.2.2 (#30403) 2025-03-20 20:25:16 +00:00
ccurme
e194902994
mistral: release 0.2.9 (#30402) 2025-03-20 20:22:24 +00:00
ccurme
49466ec9ca
groq: release 0.3.1 (#30401) 2025-03-20 20:19:49 +00:00
ccurme
db1e340387
fireworks: release 0.2.8 (#30400) 2025-03-20 16:15:51 -04:00
ccurme
785a8e7d45
tests: release 0.3.15 (#30397) 2025-03-20 15:38:40 -04:00
ccurme
5588ca4cfb
core: release 0.3.47 (#30396) 2025-03-20 18:52:53 +00:00
ccurme
de3960d285
multiple: enforce standards on tool_choice (#30372)
- Test if models support forcing tool calls via `tool_choice`. If they
do, they should support
  - `"any"` to specify any tool
  - the tool name as a string to force calling a particular tool
- Add `tool_choice` to signature of `BaseChatModel.bind_tools` in core
- Deprecate `tool_choice_value` in standard tests in favor of a boolean
`has_tool_choice`

Will follow up with PRs in external repos (tested in AWS and Google
already).
2025-03-20 17:48:59 +00:00
ccurme
b86cd8270c
multiple: support strict and method in with_structured_output (#30385) 2025-03-20 13:17:07 -04:00
Mohammad Mohtashim
1103bdfaf1
(Ollama) Fix String Value parsing in _parse_arguments_from_tool_call (#30154)
- **Description:** Fix String Value parsing in
_parse_arguments_from_tool_call
- **Issue:** #30145

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-19 21:47:18 -04:00
Tim König
b5992695ae
community: add ZoteroRetriever (#30270)
**Description** 
This contribution adds a retriever for the Zotero API.
[Zotero](https://www.zotero.org/) is an open source reference management
for bibliographic data and related research materials. A retriever will
allow langchain applications to retrieve relevant documents from
personal or shared group libraries, which I believe will be helpful for
numerous applications, such as RAG systems, personal research
assistants, etc. Tests and docs were added.

The documentation provided assumes the retriever will be part of the
langchain-community package, as this seemed customary. Please let me
know if this is not the preferred way to do it. I also uploaded the
implementation to PyPI.

**Dependencies**
The retriever requires the `pyzotero` package for API access. This
dependency is stated in the docs, and the retriever will return an error
if the package is not found. However, this dependency is not added to
the langchain package itself.

**Twitter handle**
I'm no longer using Twitter, but I'd appreciate a shoutout on
[Bluesky](https://bsky.app/profile/koenigt.bsky.social) or
[LinkedIn](https://www.linkedin.com/in/dr-tim-k%C3%B6nig-534aa2324/)!


Let me know if there are any issues, I'll gladly try and sort them out!

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-19 20:19:32 -04:00
pulvedu
4346aca5cf
Integration update (#30381)
This pull request includes a change to the following
- docs/docs/integrations/tools/tavily_search.ipynb 
- docs/docs/integrations/tools/tavily_extract.ipynb
- added docs/docs/integrations/providers/tavily.mdx

---------

Co-authored-by: pulvedu <dustin@tavily.com>
2025-03-19 17:58:25 -04:00
Daniel Rauber
9b687d7fbd
community[minor]: PlaywrightURLLoader can take stored session file (#30152)
**Description:**
Implements an additional `browser_session` parameter on
PlaywrightURLLoader which can be used to initialize the browser context
by providing a stored playwright context.
2025-03-19 16:29:07 -04:00
Vadym Barda
73c04f4707
core[patch]: release 0.3.46 (#30383) 2025-03-19 15:09:08 -04:00
William FH
ce84f8ba7e
Dereference run tree (#30377) 2025-03-19 19:05:06 +00:00
William FH
8265be4d3e
Unset context to None in var (#30380) 2025-03-19 18:53:17 +00:00
William FH
4130e6476b
Unset context after step (#30378)
While we are already careful to copy before setting the config, if other
objects hold a reference to the config or context, it wouldn't be
cleared.
2025-03-19 11:46:23 -07:00
Vadym Barda
37190881d3
core[patch]: add util for approximate token counting (#30373) 2025-03-19 17:48:38 +00:00
Matthew Farrellee
5f812f5968
langchain-tests: skip instead of passing image message tests (#30375)
**Description:** use skip for image message tests
2025-03-19 15:35:32 +00:00
ccurme
aae8306d6c
groq: release 0.3.0 (#30374) 2025-03-19 15:23:30 +00:00
Ashwin
83cfb9691f
Fix typo: change 'ben' to 'be' in comment (#30358)
**Description:**  
This PR fixes a minor typo in the comments within
`libs/partners/openai/langchain_openai/chat_models/base.py`. The word
"ben" has been corrected to "be" for clarity and professionalism.

**Issue:**  
N/A

**Dependencies:**  
None
2025-03-19 10:35:35 -04:00
Florian Chappaz
07cb41ea9e
community: aligning ChatLiteLLM default parameters with litellm (#30360)
**Description:**
Since `ChatLiteLLM` is forwarding most parameters to
`litellm.completion(...)`, there is no reason to set other default
values than the ones defined by `litellm`.

In the case of parameter 'n', it also provokes an issue when trying to
call a serverless endpoint on Azure, as it is considered an extra
parameter. So we need to keep it optional.

We can debate about backward compatibility of this change: in my
opinion, there should not be big issues since from my experience,
calling `litellm.completion()` without these parameters works fine.

**Issue:** 
- #29679 

**Dependencies:** None
2025-03-19 09:07:28 -04:00
Hodory
57ffacadd0
community: add keep_newlines parameter to process_pages method (#30365)
- **Description:** Adding keep_newlines parameter to process_pages
method with page_ids on Confluence document loader
- **Issue:** N/A (This is an enhancement rather than a bug fix)
- **Dependencies:** N/A
- **Twitter handle:** N/A
2025-03-19 08:57:59 -04:00
William FH
f5a0092551
Rm test for parent_run presence (#30356) 2025-03-18 19:44:19 -07:00
Adam Brenner
f949d9a3d3
docs: Add Dell PowerScale Document Loader (#30209)
# Description
Adds documentation on LangChain website for a Dell specific document
loader for on-prem storage devices. Additional details on what the
document loader is described in the PR as well as on our github repo:
[https://github.com/dell/powerscale-rag-connector](https://github.com/dell/powerscale-rag-connector)

This PR also creates a category on the document loader webpage as no
existing category exists for on-prem. This follows the existing pattern
already established as the website has a category for cloud providers.

# Issue:
New release, no issue.

# Dependencies:

None

# Twitter handle:

DellTech

---------

Signed-off-by: Adam Brenner <adam@aeb.io>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-18 22:39:21 -04:00
ccurme
9fb0db6937
community: release 0.3.20 (#30354) 2025-03-18 21:57:12 +00:00
ccurme
168f1dfd93
langchain[patch]: update text-splitters min bound (#30352) 2025-03-18 20:53:43 +00:00
ccurme
f6cf2ce2ad
langchain[patch]: lock with latest text-splitters (#30350) 2025-03-18 19:29:11 +00:00
ccurme
2909b49045
langchain: release 0.3.21 (#30348) 2025-03-18 19:13:20 +00:00
ccurme
958f85d541
text-splitters: release 0.3.7 (#30347) 2025-03-18 19:11:37 +00:00
Lance Martin
46d6bf0330
ollama[minor]: update default method for structured output (#30273)
From function calling to Ollama's [dedicated structured output
feature](https://ollama.com/blog/structured-outputs).

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-18 12:44:22 -04:00
Marlene
ff8ce60dcc
Core: Adding Azure AI to Supported Chat Models (#30342)
- **Description:** I was testing out `init_chat` and saw that chat
models can now be inferred. Azure OpenAI is currently only supported but
we would like to add support for Azure AI which is a different package.
This PR edits the `base.py` file to add the chat implementation.
- I don't think this adds any additional dependencies 
- Will add a test and lint, but starting an initial draft PR. 

cc @santiagxf

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-18 11:53:20 -04:00
TheSongg
251551ccf1
doc: Implement langchain-xinference (#30296)
- [ ] **PR title**: Implement langchain-xinference

- [ ] **PR message**: 
Implement a standalone package for Xinference chat models and llm
models.

https://github.com/langchain-ai/langchain/issues/30045#issue-2887214214
2025-03-18 11:50:16 -04:00
wenmeng zhou
5a6e1254a7
support return reasoning content for models like qwq in dashscope (#30317)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"

here is an example
```python
from langchain_community.chat_models.tongyi import ChatTongyi
from langchain_core.messages import HumanMessage

chatLLM = ChatTongyi(
    model="qwq-32b",   # refer to  https://help.aliyun.com/zh/model-studio/getting-started/models for more models
)
res = chatLLM.stream([HumanMessage(content="how much is 1 plus 1")])
for r in res:
    print(r)
```

```shell
content='' additional_kwargs={'reasoning_content': 'Okay, so the'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' user is asking "'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': 'how much is 1 plus'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' 1." Let me think'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' about this. Hmm'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ', 1 plus'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': " 1... That's a pretty"} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' basic math question. I'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' remember from arithmetic that when'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' you add 1 and'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' 1 together, the'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' result is 2.'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' But wait, maybe'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' I should double-check to be'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' sure. Let me visualize it'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': '. If I have one apple'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' and someone gives me another'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' apple, I have'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' two apples total. Yeah,'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' that makes sense. Or'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' on a number line'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ', starting at 1 and'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' moving 1 step forward lands'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' you at 2'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': '. \n\nIs there any'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' context where 1 +'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' 1 might not equal'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' 2? Like in different'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' number bases? Let'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': "'s see. In base"} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' 10, which'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' is standard,'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' 1+1 is'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' 2. But if'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' we were in binary'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' (base 2'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': '), 1 +'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' 1 would be 1'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': '0. But the question'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': " doesn't specify a base,"} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' so I think the'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' default is base 10'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': '. \n\nAlternatively, could'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' this be a trick'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' question? Maybe they'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': "'re referring to something else"} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ', like in Boolean'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' algebra where 1 +'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' 1 might still'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' be 1 in'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' some contexts? Wait'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ', no, in Boolean'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' addition, 1'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' + 1 is typically'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': " 1 because it's logical"} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' OR. But the'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' question just says "1'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' plus 1," which is'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' more arithmetic than Boolean.'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' \n\nOr maybe in some other'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' mathematical structure like modular arithmetic?'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' For example, modulo'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' 2,'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' 1 + 1 is'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' 0. But again'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ', unless specified, it'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': "'s probably standard addition"} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': '. \n\nThe user might be'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' testing if I know basic'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' math, or maybe'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': " they're a student just"} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' starting out. Either way,'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' the straightforward answer is'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' 2. I should also'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': " consider if there's any cultural"} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' references or jokes where'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' 1 + 1 equals'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' something else, but I can'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': "'t think of any common"} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' ones. \n\nAlternatively'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ', in some contexts like'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' in chemistry,'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' 1 + 1 could refer'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' to mixing solutions, but that'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': "'s not standard. The question"} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' is pretty simple,'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' so I think the answer'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' is 2. To'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' be thorough, maybe mention'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' that in standard arithmetic it'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': "'s 2, but if"} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': " there's a different"} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' context, the answer'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' might vary. But since'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' no context is given'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ', 2 is the safest'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ' answer.'} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='The result' additional_kwargs={'reasoning_content': ''} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content=' of 1 plus' additional_kwargs={'reasoning_content': ''} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content=' 1 is **2**.' additional_kwargs={'reasoning_content': ''} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content=' \n\nIn standard arithmetic (base' additional_kwargs={'reasoning_content': ''} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content=' 10), adding' additional_kwargs={'reasoning_content': ''} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content=' 1 and 1 together' additional_kwargs={'reasoning_content': ''} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content=' yields 2. This is' additional_kwargs={'reasoning_content': ''} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content=' a fundamental mathematical principle. If' additional_kwargs={'reasoning_content': ''} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content=' the question involves a different context' additional_kwargs={'reasoning_content': ''} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content=' (e.g., binary' additional_kwargs={'reasoning_content': ''} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content=', modular arithmetic, or a' additional_kwargs={'reasoning_content': ''} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content=' metaphorical meaning), it' additional_kwargs={'reasoning_content': ''} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content=' would need clarification,' additional_kwargs={'reasoning_content': ''} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content=' but under typical circumstances, the' additional_kwargs={'reasoning_content': ''} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content=' answer is **2**.' additional_kwargs={'reasoning_content': ''} response_metadata={} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'
content='' additional_kwargs={'reasoning_content': ''} response_metadata={'finish_reason': 'stop', 'request_id': '4738c641-6bd8-9efc-a4fe-d929d4e62bef', 'token_usage': {'input_tokens': 16, 'output_tokens': 560, 'total_tokens': 576}} id='run-bd026918-16e5-429f-aa75-3ff7701e9f8d'

```

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-03-18 11:43:10 -04:00
ccurme
b91daf06eb
groq[minor]: remove default model (#30341)
The default model for `ChatGroq`, `"mixtral-8x7b-32768"`, is being
retired on March 20, 2025. Here we remove the default, such that model
names must be explicitly specified (being explicit is a good practice
here, and avoids the need for breaking changes down the line). This
change will be released in a minor version bump to 0.3.

This follows https://github.com/langchain-ai/langchain/pull/30161
(released in version 0.2.5), where we began generating warnings to this
effect.

![Screenshot 2025-03-18 at 10 33
27 AM](https://github.com/user-attachments/assets/f1e4b302-c62a-43b0-aa86-eaf9271e86cb)
2025-03-18 10:50:34 -04:00
amuwall
f6a17fbc56
community: fix import exception too constrictive (#30218)
Fix this issue #30097
2025-03-17 22:09:02 -04:00
qonnop
036f00dc92
community: support in-memory data (Blob.from_data) in all audio parsers (#30262)
OpenAIWhisperParser, OpenAIWhisperParserLocal, YandexSTTParser do not
handle in-memory audio data (loaded via Blob.from_data) correctly. They
require Blob.path to be set and AudioSegment is always read from the
file system. In-memory data is handled correctly only for
FasterWhisperParser so far. I changed OpenAIWhisperParser,
OpenAIWhisperParserLocal, YandexSTTParser accordingly to match
FasterWhisperParser.
Thanks for reviewing the PR!

Co-authored-by: qonnop <qonnop@users.noreply.github.com>
2025-03-17 19:52:33 -04:00
Matthew Farrellee
1985aaf095
langchain-tests: allow subclasses to add addition, non-standard tests (#30204)
**description:** the ChatModel[Integration]Tests classes are powerful
and helpful, this change allows sub-classes to add additional tests.

for instance,

```
class TestChatMyServiceIntegration(ChatModelIntegrationTests):
    ...
    def test_myservice(self, model: BaseChatModel) -> None:
        ...
```

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-03-17 23:37:16 +00:00
Ben
789db7398b
text-splitters: Add JSFrameworkTextSplitter for Handling JavaScript Framework Code (#28972)
## Description
This pull request introduces a new text splitter,
`JSFrameworkTextSplitter`, to the Langchain library. The
`JSFrameworkTextSplitter` extends the `RecursiveCharacterTextSplitter`
to handle JavaScript framework code effectively, including React (JSX),
Vue, and Svelte. It identifies and utilizes framework-specific component
tags and syntax elements as splitting points, alongside standard
JavaScript syntax. This ensures that code is divided at natural
boundaries, enhancing the parsing and processing of JavaScript and
framework-specific code.

### Key Features
- Supports React (JSX), Vue, and Svelte frameworks.
- Identifies and uses framework-specific tags and syntax elements as
natural splitting points.
- Extends the existing `RecursiveCharacterTextSplitter` for seamless
integration.

## Issue
No specific issue addressed.

## Dependencies
No additional dependencies required.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-03-17 23:32:33 +00:00
ccurme
5684653775
openai[patch]: release 0.3.9 (#30325) 2025-03-17 16:08:41 +00:00
ccurme
eb9b992aa6
openai[patch]: support additional Responses API features (#30322)
- Include response headers
- Max tokens
- Reasoning effort
- Fix bug with structured output / strict
- Fix bug with simultaneous tool calling + structured output
2025-03-17 12:02:21 -04:00
Bae-ChangHyun
d8510270ee
community: add 'extract' mode to FireCrawlLoader for structured data extraction (#30242)
**Description:** 
Added an 'extract' mode to FireCrawlLoader that enables structured data
extraction from web pages. This feature allows users to Extract
structured data from a single URLs, or entire websites using Large
Language Models (LLMs).
You can show more params and usage on [firecrawl
docs](https://docs.firecrawl.dev/features/extract-beta).
You can extract from only one url now.(it depends on firecrawl's extract
method)

**Dependencies:** 
No new dependencies required. Uses existing FireCrawl API capabilities.

---------

Co-authored-by: chbae <chbae@gcsc.co.kr>
Co-authored-by: ccurme <chester.curme@gmail.com>
2025-03-17 15:15:57 +00:00
qonnop
747efa16ec
community: fix CPU support for FasterWhisperParser (implicit compute type for WhisperModel) (#30263)
FasterWhisperParser fails on a machine without an NVIDIA GPU: "Requested
float16 compute type, but the target device or backend do not support
efficient float16 computation." This problem arises because the
WhisperModel is called with compute_type="float16", which works only for
NVIDIA GPU.

According to the [CTranslate2
docs](https://opennmt.net/CTranslate2/quantization.html#bit-floating-points-float16)
float16 is supported only on NVIDIA GPUs. Removing the compute_type
parameter solves the problem for CPUs. According to the [CTranslate2
docs](https://opennmt.net/CTranslate2/quantization.html#quantize-on-model-loading)
setting compute_type to "default" (standard when omitting the parameter)
uses the original compute type of the model or performs implicit
conversion for the specific computation device (GPU or CPU). I suggest
to remove compute_type="float16".

@hulitaitai you are the original author of the FasterWhisperParser - is
there a reason for setting the parameter to float16?

Thanks for reviewing the PR!

Co-authored-by: qonnop <qonnop@users.noreply.github.com>
2025-03-14 22:22:29 -04:00
ccurme
c74e7b997d
openai[patch]: support structured output via Responses API (#30265)
Also runs all standard tests using Responses API.
2025-03-14 15:14:23 -04:00
Priyansh Agrawal
f54f14b747
community: cube document loader - do not load non-public dimensions and measures (#30286)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"

- **Description:** Do not load non-public dimensions and measures
(public: false) with Cube semantic loader

- **Issue:** Currently, non-public dimensions and measures are loaded by
the Cube document loader which leads to downstream applications using
these which is not allowed by Cube.


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
2025-03-14 15:07:56 -04:00
Stavros Kontopoulos
ac22cde130
langchain_ollama: Support keep_alive in embeddings (#30251)
- Description: Adds support for keep_alive in Ollama Embeddings see
https://github.com/ollama/ollama/issues/6401.
Builds on top of of
https://github.com/langchain-ai/langchain/pull/29296. I have this use
case where I want to keep the embeddings model in cpu forever.
- Dependencies: no deps are being introduced.
- Issue: haven't created an issue yet.
2025-03-14 14:56:50 -04:00
homeffjy
2c99f12062
community[patch]: fix bilibili loader handling of multi-page content (#30283)
Previously the loader would only extract subtitles from the first page
of multi-page videos.
2025-03-14 14:53:03 -04:00
ccurme
d5d0134e7b
anthropic: release 0.3.10 (#30287) 2025-03-14 16:23:21 +00:00
ccurme
226f29bc96
anthropic: support built-in tools, improve docs (#30274)
- Support features from recent update:
https://www.anthropic.com/news/token-saving-updates (mostly adding
support for built-in tools in `bind_tools`
- Add documentation around prompt caching, token-efficient tool use, and
built-in tools.
2025-03-14 16:18:50 +00:00
Priyansh Agrawal
f27e2d7ce7
community: cube document loader - fix logging (#30285)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"

- **Description:** Fix bad log message on line#56 and replace f-string
logs with format specifiers

- **Issue:** Log messages such as this one
`INFO:langchain_community.document_loaders.cube_semantic:Loading
dimension values for: {dimension_name}...`

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
2025-03-14 11:36:18 -04:00
ccurme
bbd4b36d76
mistralai[patch]: bump core (#30278) 2025-03-13 23:04:36 +00:00
ccurme
315bb17ef5
core: release 0.3.45 (#30277) 2025-03-13 22:44:23 +00:00
pulvedu
d0bfc7f820
community[fix] : Pass API_KEY as argument (#30272)
PR Title:
community: Fix Pass API_KEY as argument

PR Message:
Description:
This PR fixes validation error "Value error, Did not find
tavily_api_key, please add an environment variable `TAVILY_API_KEY`
which contains it, or pass `tavily_api_key` as a named parameter."

Dependencies:
No new dependencies introduced.

---------

Co-authored-by: pulvedu <dustin@tavily.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-13 22:19:31 +00:00
ccurme
733abcc884
mistral: release 0.2.8 (#30275) 2025-03-13 21:54:34 +00:00
Jacob Lee
e9c1765967
fix(core): Ignore missing secrets on deserialization (#30252) 2025-03-13 12:27:03 -07:00
ccurme
ebea5e014d
standard tests: test simple agent loop (#30268) 2025-03-13 16:34:12 +00:00
ccurme
cd1ea8e94d
openai[patch]: support Responses API (#30231)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2025-03-12 12:25:46 -04:00
Jason Zhang
49bdd3b6fe
docs: Add AgentQL provider doc, tool/toolkit doc and documentloader doc (#30144)
- **Description:** Added AgentQL docs for the provider page, tools page
and documentloader page
- **Twitter handle:** @AgentQL

Repo:
https://github.com/tinyfish-io/agentql-integrations/tree/main/langchain
PyPI: https://pypi.org/project/langchain-agentql/

If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-11 21:57:40 -04:00
Vadym Barda
23fa70f328
core[patch]: release 0.3.44 (#30236) 2025-03-11 18:59:02 -04:00
Vadym Barda
c7842730ef
core[patch]: support single-node subgraphs and put subgraph nodes under the respective subgraphs (#30234) 2025-03-11 18:55:45 -04:00
ccurme
62c570dd77
standard-tests, openai: bump core (#30202) 2025-03-10 19:22:24 +00:00
ccurme
f896e701eb
deepseek: install local langchain-tests in test deps (#30198) 2025-03-10 16:58:17 +00:00
Hugh Gao
aa6dae4a5b
community: Remove the system message count limit for ChatTongyi. (#30192)
## Description
The models in DashScope support multiple SystemMessage. Here is the
[Doc](https://bailian.console.aliyun.com/model_experience_center/text#/model-market/detail/qwen-long?tabKey=sdk),
and the example code on the document page:
```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # 如果您没有配置环境变量,请在此处替换您的API-KEY
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # 填写DashScope服务base_url
)
# 初始化messages列表
completion = client.chat.completions.create(
    model="qwen-long",
    messages=[
        {'role': 'system', 'content': 'You are a helpful assistant.'},
        # 请将 'file-fe-xxx'替换为您实际对话场景所使用的 file-id。
        {'role': 'system', 'content': 'fileid://file-fe-xxx'},
        {'role': 'user', 'content': '这篇文章讲了什么?'}
    ],
    stream=True,
    stream_options={"include_usage": True}
)

full_content = ""
for chunk in completion:
    if chunk.choices and chunk.choices[0].delta.content:
        # 拼接输出内容
        full_content += chunk.choices[0].delta.content
        print(chunk.model_dump())

print({full_content})
```
Tip: The example code is for OpenAI, but the document said that it also
supports the DataScope API, and I tested it, and it works.
```
Is the Dashscope SDK invocation method compatible?

Yes, the Dashscope SDK remains compatible for model invocation. However, file uploads and file-ID retrieval are currently only supported via the OpenAI SDK. The file-ID obtained through this method is also compatible with Dashscope for model invocation.
```
2025-03-10 08:58:40 -04:00
ccurme
67aff1648b
community: Add OpenGradient integration (Toolkit) (#30190)
Commandeering https://github.com/langchain-ai/langchain/pull/30135

---------

Co-authored-by: kylexqian <kylexqian@gmail.com>
2025-03-09 18:08:07 -04:00
ccurme
b209d46eb3
mistral[patch]: set global ssl context (#30189) 2025-03-09 21:27:41 +00:00
Vijay Selvaraj
df459d0d5e
community: add Valthera integration (#30105)
```markdown
**Description:**  
This PR integrates Valthera into LangChain, introducing an framework designed to send highly personalized nudges by an LLM agent. This is modeled after Dr. BJ Fogg's Behavior Model. This integration includes:

- Custom data connectors for HubSpot, PostHog, and Snowflake.
- A unified data aggregator that consolidates user data.
- Scoring configurations to compute motivation and ability scores.
- A reasoning engine that determines the appropriate user action.
- A trigger generator to create personalized messages for user engagement.

**Issue:**  
N/A

**Dependencies:**  
N/A

**Twitter handle:**  
- `@vselvarajijay`

**Tests and Docs:**  
- `docs/docs/integrations/tools/valthera` 
- `https://github.com/valthera/langchain-valthera/tree/main/tests`

```

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-09 21:19:08 +00:00
ccurme
3823daa0b9
cli: update integration doc template for tools (#30188)
Chain example -> langgraph agent
2025-03-09 21:14:43 +00:00
Jonathan Feng
911accf733
docs: add contextualai documentation (#30050)
Thank you for contributing to LangChain!
 
**Description:** adds ContextualAI's `langchain-contextual` package's
documentation

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-09 02:43:13 +00:00
Bharat
b9746a6910
fixes#30182: update tool names to match OpenAI function name pattern (#30183)
The OpenAI API requires function names to match the pattern
'^[a-zA-Z0-9_-]+$'. This updates the JIRA toolkit's tool names to use
underscores instead of spaces to comply with this requirement and
prevent BadRequestError when using the tools with OpenAI functions.

Error fixed:
```
File "langgraph-bug-fix/.venv/lib/python3.13/site-packages/openai/_base_client.py", line 1023, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid 'tools[0].function.name': string does not match pattern. Expected a string that matches the pattern '^[a-zA-Z0-9_-]+$'.", 'type': 'invalid_request_error', 'param': 'tools[0].function.name', 'code': 'invalid_value'}}
During task with name 'agent' and id 'aedd7537-e8d5-6678-d0c5-98129586d3ac'
```

Issue:#30182
2025-03-08 20:48:25 -05:00
ccurme
cee0fecb08
docs: update package registry counts (#30181) 2025-03-08 20:37:59 -05:00
William FH
bac3a28e70
Flush (#30157) 2025-03-07 16:32:15 -08:00
ccurme
a7ab5e8372
community[patch]: ChatPerplexity: track usage metadata (#30175) 2025-03-07 23:25:05 +00:00
ccurme
1c993b921c
core[patch]: release 0.3.43 (#30173) 2025-03-07 21:56:00 +00:00
ccurme
9893e5cb80
core[patch]: catch structured_output_format (#30172)
Change to `ls_structured_output_format` was not backward-compatible with
older versions of integration packages.
2025-03-07 16:50:06 -05:00
ccurme
33a3510243
core[patch]: export ArgsSchema (#30169)
This is needed for type hints

see: https://github.com/langchain-ai/langchain/pull/30167
2025-03-07 20:43:05 +00:00
ccurme
17507c9ba6
groq[patch]: release 0.2.5 (#30168) 2025-03-07 20:25:51 +00:00
andyzhou1982
9e863c89d2
add JiebaLinkExtractor for chinese doc extracting (#30150)
Thank you for contributing to LangChain!

- [ ] **PR title**: "community: chinese doc extracting"


- [ ] **PR message**: 
- **Description:** add jieba_link_extractor.py for chinese doc
extracting
    - **Dependencies:** jieba


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
  /doc/doc/integrations/providers/jieba.md
  /doc/doc/integrations/vectorstores/jieba_link_extractor.ipynb
  /libs/packages.yml

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-07 20:21:46 +00:00
ccurme
74e7772a5f
groq[patch]: warn if model is not specified (#30161)
Groq is retiring `mixtral-8x7b-32768`, which is currently the default
model for ChatGroq, on March 20. Here we emit a warning if the model is
not specified explicitly.

A version 0.3.0 will be released ahead of March 20 that removes the
default altogether.
2025-03-07 15:21:13 -05:00
Ioannis Bakagiannis
3444e587ee
docs: Integration Update - ADS4GPTs (#30153)
docs: New integration for LangChain - ads4gpts-langchain

Description: Tools and Toolkit for Agentic integration natively within
LangChain with ADS4GPTs, in order to help applications monetize with
advertising.

Twitter handle: @ads4gpts

Co-authored-by: knitlydevaccount <loom+github@knitly.app>
2025-03-07 14:35:44 -05:00
ccurme
3c258194ae
tests[patch]: release 0.3.14 (#30165) 2025-03-07 18:34:05 +00:00
ccurme
34638ccfae
openai[patch]: release 0.3.8 (#30164) 2025-03-07 18:26:40 +00:00
ccurme
4e5058f29c
core[patch]: release 0.3.42 (#30163) 2025-03-07 18:14:45 +00:00
Eugene Yurtsev
894fd63a61
cli: release 0.0.36 (#30159)
Bump for 0.0.36
2025-03-07 13:05:40 -05:00
ccurme
806211475a
core[patch]: update structured output tracing (#30123)
- Trace JSON schema in `options`
- Rename to `ls_structured_output_format`
2025-03-07 13:05:25 -05:00
ccurme
230876a7c5
anthropic[patch]: add PDF input example to API reference (#30156) 2025-03-07 14:19:08 +00:00
joeconstantino
022ff9eead
Tableau docs for new datasource qa tool (#30125)
- **Description: a notebook showing langchain and langraph agents using
the new langchain_tableau tool
- **Twitter handle: @joe_constantin0

---------

Co-authored-by: Joe Constantino <joe@constantino.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-06 14:58:56 +00:00
ccurme
52b0570bec
core, openai, standard-tests: improve OpenAI compatibility with Anthropic content blocks (#30128)
- Support thinking blocks in core's `convert_to_openai_messages` (pass
through instead of error)
- Ignore thinking blocks in ChatOpenAI (instead of error)
- Support Anthropic-style image blocks in ChatOpenAI

---

Standard integration tests include a `supports_anthropic_inputs`
property which is currently enabled only for tests on `ChatAnthropic`.
This test enforces compatibility with message histories of the form:
```
- system message
- human message
- AI message with tool calls specified only through `tool_use` content blocks
- human message containing `tool_result` and an additional `text` block
```
It additionally checks support for Anthropic-style image inputs if
`supports_image_inputs` is enabled.

Here we change this test, such that if you enable
`supports_anthropic_inputs`:
- You support AI messages with text and `tool_use` content blocks
- You support Anthropic-style image inputs (if `supports_image_inputs`
is enabled)
- You support thinking content blocks.

That is, we add a test case for thinking content blocks, but we also
remove the requirement of handling tool results within HumanMessages
(motivated by existing agent abstractions, which should all return
ToolMessage). We move that requirement to a ChatAnthropic-specific test.
2025-03-06 09:53:14 -05:00
Pat Patterson
b3dc66f7a3
community: fix AttributeError when creating LanceDB vectorstore (#30127)
**Description:**

This PR adds a call to `guard_import()` to fix an AttributeError raised
when creating LanceDB vectorstore instance with an existing LanceDB
table.

**Issue:**

This PR fixes issue #30124.

**Dependencies:**

No additional dependencies.

**Twitter handle:**

[@metadaddy](https://x.com/metadaddy), but I spend more time at
[@metadaddy.net](https://bsky.app/profile/metadaddy.net) these days.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-05 23:04:38 +00:00
Hugh Gao
9b7b8e4a1a
community: make DashScope models support Partial Mode for text continuation. (#30108)
## Description
make DashScope models support Partial Mode for text continuation.

For text continuation in ChatTongYi, it supports text continuation with
a prefix by adding a "partial" argument in AIMessage. The document is
[Partial Mode
](https://help.aliyun.com/zh/model-studio/user-guide/partial-mode?spm=a2c4g.11186623.help-menu-2400256.d_1_0_0_8.211e5b77KMH5Pn&scm=20140722.H_2862210._.OR_help-T_cn~zh-V_1).
The API example is:
```py
import os
import dashscope

messages = [{
    "role": "user",
    "content": "请对“春天来了,大地”这句话进行续写,来表达春天的美好和作者的喜悦之情"
},
{
    "role": "assistant",
    "content": "春天来了,大地",
    "partial": True
}]
response = dashscope.Generation.call(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model='qwen-plus',
    messages=messages,
    result_format='message',  
)

print(response.output.choices[0].message.content)
```

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-05 16:22:14 +00:00
黑牛
f0153414d5
Add request_id field to improve request tracking and debugging (for Tongyi model) (#30110)
- **Description**: Added the request_id field to the check_response
function to improve request tracking and debugging, applicable for the
Tongyi model.
- **Issue**: None
- **Dependencies**: None
- **Twitter handle**: None

- **Add tests and docs**: None

- **Lint and test**: Ran `make format`, `make lint`, and `make test` to
ensure the code meets formatting and testing requirements.
2025-03-05 11:03:47 -05:00
Manthan Surkar
1ee8aceaee
community: fix Jira API wrapper failing initialization with cloud param (#30117)
### **Description**  
Converts the boolean `jira_cloud` parameter in the Jira API Wrapper to a
string before initializing the Jira Client. Also adds tests for the
same.

### **Issue**  
[Jira API Wrapper
Bug](8abb65e138/libs/community/langchain_community/utilities/jira.py (L47))

```python
jira_cloud_str = get_from_dict_or_env(values, "jira_cloud", "JIRA_CLOUD")
jira_cloud = jira_cloud_str.lower() == "true"
```

The above code has a bug where the value of `"jira_cloud"` is a boolean.
If it is passed, calling `.lower()` on a boolean raises an error.
Additionally, `False` cannot be passed explicitly since
`get_from_dict_or_env` falls back to environment variables.

Relevant code in `langchain_core`:  

[Source](https://github.com/thesmallstar/langchain/blob/master/.venv/lib/python3.13/site-packages/langchain_core/utils/env.py#L46)

```python
if isinstance(key, str) and key in data and data[key]:  # Here, data[key] is False
```

This PR fixes both issues.

### **Twitter Handle**  
[Manthan Surkar](https://x.com/manthan_surkar)
2025-03-05 10:49:25 -05:00
Adrián Panella
c599ba47d5
core(mermaid): fix error when 3+ subgraph levels (#29970) 2025-03-04 13:27:49 -05:00
Alexander Henlein
417efa30a6
docs: add Taiga Tool integration docs (#30042)
This PR adds documentation for the langchain-taiga Tool integration,
including an example notebook at
'docs/docs/integrations/tools/taiga.ipynb' and updates to
'libs/packages.yml' to track the new package.

Issue:
N/A

Dependencies:
None

Twitter handle:
N/A

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-03-04 17:51:20 +00:00
Mathias Marciano
5f0102242a
Fixed an issue with the OpenAI Assistant's 'retrieval' tool and adding support for the 'attachments' parameter (#30006)
PR Title:
langchain: add attachments support in OpenAIAssistantRunnable

PR Description:
This PR fixes an issue with the "retrieval" tool (internally named
"file_search") in the OpenAI Assistant by adding support for the
"attachments" parameter in the invoke method. This change allows files
to be linked to messages when they are inserted into threads, which is
essential for utilizing OpenAI's Retrieval Augmented Generation (RAG)
feature.

Issue:
N/A

Dependencies:
None

Twitter handle:
N/A

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-04 17:34:11 +00:00
Philippe PRADOS
4710c1fa8c
community[minor]: Fix regular expression in visualize and outlines modules. (#30002)
Fix invalid escape characteres
2025-03-04 12:23:48 -05:00
ccurme
577c0d0715
community[patch]: release 0.3.19 (#30104) 2025-03-04 16:12:03 +00:00
ccurme
ba5ddb218f
anthropic[patch]: release 0.3.9 (#30103) 2025-03-04 10:53:55 -05:00
ccurme
9383a0536a
tests[patch]: release 0.3.13 (#30102) 2025-03-04 10:53:43 -05:00
ccurme
fb16c25920
langchain[patch]: release 0.3.20 (#30101) 2025-03-04 15:47:27 +00:00
ccurme
692a68bf1c
core[patch]: release 0.3.41 (#30100) 2025-03-04 15:08:57 +00:00
ccurme
484d945500
community[patch]: remove numpy cap for python < 3.12 (#30084) 2025-03-04 09:46:41 -05:00
ZhangShenao
8575d7491f
[Doc] Improve api doc (#30073)
- Update api_doc for `BaseMessage`
- add static method decorator for `retry_runnable`
2025-03-04 09:39:07 -05:00
Samuel Dion-Girardeau
ccb64e9f4f
docs: Fix typo in code samples for max_tokens_for_prompt (#30088)
- **Description:** Fix typo in code samples for max_tokens_for_prompt.
Code blocks had singular "token" but the method has plural "tokens".
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** N/A
2025-03-04 09:11:21 -05:00
ArrayPD
c671d54c6f
core: make with_alisteners() example workable. (#30059)
**Description:**

5 fix of example from function with_alisteners() in
libs/core/langchain_core/runnables/base.py
Replace incoherent example output with workable example's output.

1. SyntaxError: unterminated string literal
    print(f"on start callback starts at {format_t(time.time())}
    correct as
    print(f"on start callback starts at {format_t(time.time())}")

2. SyntaxError: unterminated string literal
    print(f"on end callback starts at {format_t(time.time())}
    correct as
    print(f"on end callback starts at {format_t(time.time())}")

3. NameError: name 'Runnable' is not defined
    Fix as
    from langchain_core.runnables import Runnable

4. NameError: name 'asyncio' is not defined
    Fix as
    import asyncio

5. NameError: name 'format_t' is not defined.
    Implement format_t() as
    from datetime import datetime, timezone

    def format_t(timestamp: float) -> str:
return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()
2025-03-01 15:39:02 -05:00
cold-eye
7c175e3fda
Update ascend.py (#30060)
add batch_size to fix oom when embed large amount texts

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2025-03-01 14:10:41 -05:00
ccurme
3b066dc005
anthropic[patch]: allow structured output when thinking is enabled (#30047)
Structured output will currently always raise a BadRequestError when
Claude 3.7 Sonnet's `thinking` is enabled, because we rely on forced
tool use for structured output and this feature is not supported when
`thinking` is enabled.

Here we:
- Emit a warning if `with_structured_output` is called when `thinking`
is enabled.
- Raise `OutputParserException` if no tool calls are generated.

This is arguably preferable to raising an error in all cases.

```python
from langchain_anthropic import ChatAnthropic
from pydantic import BaseModel


class Person(BaseModel):
    name: str
    age: int


llm = ChatAnthropic(
    model="claude-3-7-sonnet-latest",
    max_tokens=5_000,
    thinking={"type": "enabled", "budget_tokens": 2_000},
)
structured_llm = llm.with_structured_output(Person)  # <-- this generates a warning
```

```python
structured_llm.invoke("Alice is 30.")  # <-- works
```

```python
structured_llm.invoke("Hello!")  # <-- raises OutputParserException
```
2025-02-28 14:44:11 -05:00
ccurme
f8ed5007ea
anthropic, mistral: return model_name in response metadata (#30048)
Took a "census" of models supported by init_chat_model-- of those that
return model names in response metadata, these were the only two that
had it keyed under `"model"` instead of `"model_name"`.
2025-02-28 18:56:05 +00:00
Christophe Bornet
9e6ffd1264
core: Add ruff rules PTH (pathlib) (#29338)
See https://docs.astral.sh/ruff/rules/#flake8-use-pathlib-pth

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-02-28 13:22:20 -05:00
TheSongg
86b364de3b
Add asynchronous generate interface (#30001)
- [ ] **PR title**: [langchain_community.llms.xinference]: Add
asynchronous generate interface

- [ ] **PR message**: The asynchronous generate interface support stream
data and non-stream data.
          
        chain = prompt | llm
        async for chunk in chain.astream(input=user_input):
            yield chunk


- [ ] **Add tests and docs**:

       from langchain_community.llms import Xinference
       from langchain.prompts import PromptTemplate

       llm = Xinference(
server_url="http://0.0.0.0:9997", # replace your xinference server url
model_uid={model_uid} # replace model_uid with the model UID return from
launching the model
           stream = True
            )
prompt = PromptTemplate(input=['country'], template="Q: where can we
visit in the capital of {country}? A:")
       chain = prompt | llm
       async for chunk in chain.astream(input=user_input):
           yield chunk
2025-02-28 12:32:44 -05:00
Fakai Zhao
f07338d2bf
Implementing the MMR algorithm for OLAP vector storage (#30033)
Thank you for contributing to LangChain!

-  **Implementing the MMR algorithm for OLAP vector storage**: 
  - Support Apache Doris and StarRocks OLAP database.
- Example: "vectorstore.as_retriever(search_type="mmr",
search_kwargs={"k": 10})"


- **Implementing the MMR algorithm for OLAP vector storage**: 
    - **Apache Doris
    - **StarRocks
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- **Add tests and docs**: 
- Example: "vectorstore.as_retriever(search_type="mmr",
search_kwargs={"k": 10})"


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: fakzhao <fakzhao@cisco.com>
2025-02-28 08:50:22 -05:00
Daniel Rauber
186cd7f1a1
community: PlaywrightURLLoader should wait for page load event before attempting to extract data (#30043)
## Description

The PlaywrightURLLoader should wait for a page to be loaded before
attempting to extract data.
2025-02-28 08:45:51 -05:00
ccurme
0dbcc1d099
docs: document anthropic features (#30030)
Update integrations page with extended thinking feature.

Update API reference with extended thinking and citations.
2025-02-27 19:37:04 -05:00
ccurme
6c7c8a164f
openai[patch]: add unit test (#30022)
Test `max_completion_tokens` is propagated to payload for
AzureChatOpenAI.
2025-02-27 11:09:17 -05:00
DamonXue
156a60013a
docs: fix tavily_search code-block format. (#30012)
This pull request includes a change to the `TavilySearchResults` class
in the `tool.py` file, which updates the code block format in the
documentation.

Documentation update:

*
[`libs/community/langchain_community/tools/tavily_search/tool.py`](diffhunk://#diff-e3b6a980979268b639c6a86e9b182756b0f7c7e9e5605e613bc0a72ea6aa5301L54-R59):
Changed the code block format from Python to JSON in the example
provided in the docstring.Thank you for contributing to LangChain!
2025-02-27 10:55:15 -05:00
kawamou
8977ac5ab0
community[fix]: Handle None value in raw_content from Tavily API response (#30021)
## **Description:**

When using the Tavily retriever with include_raw_content=True, the
retriever occasionally fails with a Pydantic ValidationError because
raw_content can be None.

The Document model in langchain_core/documents/base.py requires
page_content to be a non-None value, but the Tavily API sometimes
returns None for raw_content.

This PR fixes the issue by ensuring that even when raw_content is None,
an empty string is used instead:

```python
page_content=result.get("content", "")
            if not self.include_raw_content
            else (result.get("raw_content") or ""),
2025-02-27 10:53:53 -05:00
Lakindu Boteju
f69deee1bd
community: Add cost data for aws bedrock anthropic.claude-3-7 model (#30016)
This pull request includes updates to the
`libs/community/langchain_community/callbacks/bedrock_anthropic_callback.py`
file to add a new model version to the list of supported models.

Updates to supported models:

* Added support for the `anthropic.claude-3-7-sonnet-20250219-v1:0`
model with a rate of `0.003` for 1000 input tokens.
* Added support for the `anthropic.claude-3-7-sonnet-20250219-v1:0`
model with a rate of `0.015` for 1000 output tokens.

AWS Bedrock pricing reference : https://aws.amazon.com/bedrock/pricing
2025-02-27 09:51:52 -05:00
Lakindu Boteju
e0e9e560b3
PyMuPDF4LLM integration to LangChain (#29953)
## PyMuPDF4LLM integration to LangChain for PDF content extraction in
Markdown format

### Description

[PyMuPDF4LLM](https://github.com/pymupdf/RAG) makes it easier to extract
PDF content in Markdown format, needed for LLM & RAG applications.
(License: GNU Affero General Public License v3.0)


[langchain-pymupdf4llm](https://github.com/lakinduboteju/langchain-pymupdf4llm)
integrates PyMuPDF4LLM to LangChain as a Document Loader.
(License: MIT License)

This pull request introduces the integration of
[PyMuPDF4LLM](https://pymupdf.readthedocs.io/en/latest/pymupdf4llm) into
the LangChain project as an integration package:
[`langchain-pymupdf4llm`](https://github.com/lakinduboteju/langchain-pymupdf4llm).
The most important changes include adding new Jupyter notebooks to
document the integration and updating the package configuration file to
include the new package.

### Documentation:

* `docs/docs/integrations/providers/pymupdf4llm.ipynb`: Added a new
Jupyter notebook to document the integration of `PyMuPDF4LLM` with
LangChain, including installation instructions and class imports.
* `docs/docs/integrations/document_loaders/pymupdf4llm.ipynb`: Added a
new Jupyter notebook to document the usage of `langchain-pymupdf4llm` as
a LangChain integration package in detail.

### Package registration:

* `libs/packages.yml`: Updated the package configuration file to include
the `langchain-pymupdf4llm` package.

### Additional information

* Related to: https://github.com/langchain-ai/langchain/pull/29848

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-26 15:59:12 -05:00
Dan Mirsky
d98c3f76c2
core[patch]: Fix FileCallbackHandler name resolution, Fixes #29941 (#29942)
- **Description:** Same changes as #26593 but for FileCallbackHandler
- **Issue:**  Fixes #29941
- **Dependencies:** None
- **Twitter handle:** None

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2025-02-26 14:54:24 -05:00
Christophe Bornet
b3885c124f
core: Add ruff rules TC (#29268)
See https://docs.astral.sh/ruff/rules/#flake8-type-checking-tc
Some fixes done for TC001,TC002 and TC003 but these rules are excluded
since they don't play well with Pydantic.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-26 19:39:05 +00:00
talos
9cd20080fc
community: Update SQLiteVec table trigger (#29914)
**Issue**: This trigger can only be used by the first table created.
Cannot create additional triggers for other tables.

**fixed**: Update the trigger name so that it can be used for new
tables.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-26 15:10:13 +00:00
ccurme
7562677f3f
langchain[patch]: delete erroneous lock file (#30007)
Picked up during merge.
2025-02-26 15:01:05 +00:00
Erick Friis
3c96012f5e
langchain: make numpy optional (#29182)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-26 14:35:24 +00:00
Artem Yankov
6177b9f9ab
community: add title, score and raw_content to tavily search results (#29995)
**Description:**

Tavily search results returned from API include useful information like
title, score and (optionally) raw_content that is missed in wrapper
although it's documented there properly. Add this data to the result
structure.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-25 23:27:21 +00:00
Eugene Yurtsev
b525226531
core[patch]: version 0.3.40 (#29997)
Version 0.3.40 release
2025-02-25 23:09:40 +00:00
Vadym Barda
0fc50b82a0
core[patch]: allow passing description to @tool decorator (#29976) 2025-02-25 17:45:36 -05:00
Naveen SK
21bfc95e14
docs: Correct grammatical typos in various documentation files (#29983)
**Description:**
Fixed grammatical typos in various documentation files

**Issue:**
N/A

**Dependencies:**
N/A

**Twitter handle:**
@MrNaveenSK

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-02-25 19:13:31 +00:00
ccurme
1158d3134d
langchain[patch]: remove aiohttp (#29991)
My guess is this was left over from when `community` was in langchain.
2025-02-25 11:43:00 -05:00
ccurme
afd7888392
langchain[patch]: remove explicit dependency on tenacity (#29990)
Not used anywhere in `langchain`, already a dependency of
langchain-core.
2025-02-25 11:31:55 -05:00
ccurme
32704f0ad8
langchain: update extended test (#29988) 2025-02-25 14:58:20 +00:00
Yan
47e1a384f7
Writer partners integration docs (#29961)
**Documentation of Writer provider and additional features**
* [PyPi langchain-writer
web-page](https://pypi.org/project/langchain-writer/)
* [GitHub langchain-writer
repo](https://github.com/writer/langchain-writer)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-24 19:30:09 -05:00
ccurme
79f5bbfb26
anthropic[patch]: release 0.3.8 (#29973) 2025-02-24 15:24:35 -05:00
ccurme
ded886f622
anthropic[patch]: support claude 3.7 sonnet (#29971) 2025-02-24 15:17:47 -05:00
Bagatur
d00d645829
docs[patch]: update disable_streaming docstring (#29968) 2025-02-24 18:40:31 +00:00
ccurme
b7a1705052
openai[patch]: release 0.3.7 (#29967) 2025-02-24 11:59:28 -05:00
ccurme
5437ee385b
core[patch]: release 0.3.39 (#29966) 2025-02-24 11:47:01 -05:00
ccurme
291a232fb8
openai[patch]: set global ssl context (#29932)
We set 
```python
global_ssl_context = ssl.create_default_context(cafile=certifi.where())
```
at the module-level and share it among httpx clients.
2025-02-24 11:25:16 -05:00
ccurme
9ce07980b7
core[patch]: pydantic 2.11 compat (#29963)
Resolves https://github.com/langchain-ai/langchain/issues/29951

Was able to reproduce the issue with Anthropic installing from pydantic
`main` and correct it with the fix recommended in the issue.

Thanks very much @Viicos for finding the bug and the detailed writeup!
2025-02-24 11:11:25 -05:00
ccurme
0d3a3b99fc
core[patch]: release 0.3.38 (#29962) 2025-02-24 15:04:53 +00:00
ccurme
b1a7f4e106
core, openai[patch]: support serialization of pydantic models in messages (#29940)
Resolves https://github.com/langchain-ai/langchain/issues/29003,
https://github.com/langchain-ai/langchain/issues/27264
Related: https://github.com/langchain-ai/langchain-redis/issues/52

```python
from langchain.chat_models import init_chat_model
from langchain.globals import set_llm_cache
from langchain_community.cache import SQLiteCache
from pydantic import BaseModel

cache = SQLiteCache()

set_llm_cache(cache)

class Temperature(BaseModel):
    value: int
    city: str

llm = init_chat_model("openai:gpt-4o-mini")
structured_llm = llm.with_structured_output(Temperature)
```
```python
# 681 ms
response = structured_llm.invoke("What is the average temperature of Rome in May?")
```
```python
# 6.98 ms
response = structured_llm.invoke("What is the average temperature of Rome in May?")
```
2025-02-24 09:34:27 -05:00
ccurme
927ec20b69
openai[patch]: update system role to developer for o-series models (#29785)
Some o-series models will raise a 400 error for `"role": "system"`
(`o1-mini` and `o1-preview` will raise, `o1` and `o3-mini` will not).

Here we update `ChatOpenAI` to update the role to `"developer"` for all
model names matching `^o\d`.

We only make this change on the ChatOpenAI class (not BaseChatOpenAI).
2025-02-24 08:59:46 -05:00
Ahmed Tammaa
8b511a3a78
[Exception Handling] DeepSeek JSONDecodeError (#29758)
For Context please check #29626 

The Deepseek is using langchain_openai. The error happens that it show
`json decode error`.

I added a handler for this to give a more sensible error message which
is DeepSeek API returned empty/invalid json.

Reproducing the issue is a bit challenging as it is inconsistent,
sometimes DeepSeek returns valid data and in other times it returns
invalid data which triggers the JSON Decode Error.

This PR is an exception handling, but not an ultimate fix for the issue.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-23 15:00:32 -05:00
Julien Elkaim
e586bffe51
community: Repair embeddings/llamacpp's embed_query method (#29935)
**Description:** As commented on the commit
[41b6a86](41b6a86bbe)
it introduced a bug for when we do an embedding request and the model
returns a non-nested list. Typically it's the case for model
**_nomic-embed-text_**.

- I added the unit test, and ran `make format`, `make lint` and `make
test` from the `community` package.
- No new dependency.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-23 19:32:17 +00:00
Saraswathy Kalaiselvan
5ca4933b9d
docs: updated ChatLiteLLM model_kwargs description (#29937)
- [x] **PR title**: docs: (community) update ChatLiteLLM

- [x] **PR message**:
- **Description:** updated description of model_kwargs parameter which
was wrongly describing for temperature.
    - **Issue:** #29862 
    - **Dependencies:** N/A
    
- [x] **Add tests and docs**: N/A

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-23 19:27:13 +00:00
ccurme
512eb1b764
anthropic[patch]: update models for integration tests (#29938) 2025-02-23 14:23:48 -05:00
Christophe Bornet
f6d4fec4d5
core: Add ruff rules ANN (type annotations) (#29271)
See https://docs.astral.sh/ruff/rules/#flake8-annotations-ann
The interest compared to only mypy is that ruff is very fast at
detecting missing annotations.

ANN101 and ANN102 are deprecated so we ignore them 
ANN401 (no Any type) ignored to be in sync with mypy config

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-02-22 17:46:28 -05:00
Bagatur
979a991dc2
core[patch]: dont deep copy merge_message_runs (#28454)
afaict no need to deep copy here, if we merge messages then we convert
them to chunks first anyways
2025-02-22 21:56:45 +00:00
Mohammad Mohtashim
afa94e5bf7
_wait_for_run calling fix for OpenAIAssistantRunnable (#29927)
- **Description:** Fixed the `OpenAIAssistantRunnable` call of
`_wait_for_run`
- **Issue:**  #29923
2025-02-22 00:27:24 +00:00
Vadym Barda
437fe6d216
core[patch]: return ToolMessage from tools when tool call ID is empty string (#29921) 2025-02-21 11:53:15 -05:00
Taofiq Aiyelabegan
5ee8a8f063
[Integration]: Langchain-Permit (#29867)
## Which area of LangChain is being modified?
- This PR adds a new "Permit" integration to the `docs/integrations/`
folder.
- Introduces two new Tools (`LangchainJWTValidationTool` and
`LangchainPermissionsCheckTool`)
- Introduces two new Retrievers (`PermitSelfQueryRetriever` and
`PermitEnsembleRetriever`)
- Adds demo scripts in `examples/` showcasing usage.

## Description of Changes
- Created `langchain_permit/tools.py` for JWT validation and permission
checks with Permit.
- Created `langchain_permit/retrievers.py` for custom Permit-based
retrievers.
- Added documentation in `docs/integrations/providers/permit.ipynb` (or
`.mdx`) to explain setup, usage, and examples.
- Provided sample scripts in `examples/demo_scripts/` to illustrate
usage of these tools and retrievers.
- Ensured all code is linted and tested locally.

Thank you again for reviewing!

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-21 10:59:00 -05:00
Jean-Philippe Dournel
ebe38baaf9
community/mlx_pipeline: fix crash at mlx call (#29915)
- **Description:** 
Since mlx_lm 0.20, all calls to mlx crash due to deprecation of the way
parameters are passed to methods generate and generate_step.
Parameters top_p, temp, repetition_penalty and repetition_context_size
are not passed directly to those method anymore but wrapped into
"sampler" and "logit_processor".


- **Dependencies:** mlx_lm (optional)

-  **Tests:** 
I've had a new test to existing test file:
tests/integration_tests/llms/test_mlx_pipeline.py

---------

Co-authored-by: Jean-Philippe Dournel <jp@insightkeeper.io>
2025-02-21 09:14:53 -05:00
ccurme
1fa9f6bc20
docs: build mongo in api ref (#29908) 2025-02-20 19:58:35 -05:00
Chaunte W. Lacewell
d972c6d6ea
partners: add langchain-vdms (#29857)
**Description:** Deprecate vdms in community, add integration
langchain-vdms, and update any related files
**Issue:** n/a
**Dependencies:** langchain-vdms
**Twitter handle:** n/a

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-20 19:48:46 -05:00
Mohammad Mohtashim
8293142fa0
mistral[patch]: support model_kwargs (#29838)
- **Description:** Frequency_penalty added as a client parameter
- **Issue:** #29803

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-20 18:47:39 -05:00
ccurme
924d9b1b33
cli[patch]: fix retriever template (#29907)
Chat model tabs don't render correctly in .ipynb template.
2025-02-20 17:51:19 +00:00
Brayden Zhong
a70f31de5f
Community: RankLLMRerank AttributeError (Handle list-based rerank results) (#29840)
# community: Fix AttributeError in RankLLMRerank (`list` object has no
attribute `candidates`)

## **Description**
This PR fixes an issue in `RankLLMRerank` where reranking fails with the
following error:

```
AttributeError: 'list' object has no attribute 'candidates'
```

The issue arises because `rerank_batch()` returns a `List[Result]`
instead of an object containing `.candidates`.

### **Changes Introduced**
- Adjusted `compress_documents()` to support both:
  - Old API format: `rerank_results.candidates`
  - New API format: `rerank_results` as a list
  - Also fix wrong .txt location parsing while I was at it.

---

## **Issue**
Fixes **AttributeError** in `RankLLMRerank` when using
`compression_retriever.invoke()`. The issue is observed when
`rerank_batch()` returns a list instead of an object with `.candidates`.

**Relevant log:**
```
AttributeError: 'list' object has no attribute 'candidates'
```

## **Dependencies**
- No additional dependencies introduced.

---

## **Checklist**
- [x] **Backward compatible** with previous API versions
- [x] **Tested** locally with different RankLLM models
- [x] **No new dependencies introduced**
- [x] **Linted** with `make format && make lint`
- [x] **Ready for review**

---

## **Testing**
- Ran `compression_retriever.invoke(query)`

## **Reviewers**
If no review within a few days, please **@mention** one of:
- @baskaryan
- @efriis
- @eyurtsev
- @ccurme
- @vbarda
- @hwchase17
2025-02-20 12:38:31 -05:00
Levon Ghukasyan
ec403c442a
Separate deepale vector store (#29902)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-20 17:37:19 +00:00
Jorge Piedrahita Ortiz
3acf842e35
core: add sambanova chat models to load module mapping (#29855)
- **Description:** add sambanova integration package chat models to load
module mapping, to allow serialization and deserialization
2025-02-20 12:30:50 -05:00
ccurme
d227e4a08e
mistralai[patch]: release 0.2.7 (#29906) 2025-02-20 17:27:12 +00:00
Hande
d8bab89e6e
community: add cognee retriever (#29878)
This PR adds a new cognee integration, knowledge graph based retrieval
enabling developers to ingest documents into cognee’s knowledge graph,
process them, and then retrieve context via CogneeRetriever.
It includes:
- langchain_cognee package with a CogneeRetriever class
- a test for the integration, demonstrating how to create, process, and
retrieve with cognee
- an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


Followed additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

Thank you for the review!

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-20 17:15:23 +00:00
dokato
92b415a9f6
community: Made some Jira fields optional for agent to work correctly (#29876)
**Description:** Two small changes have been proposed here:
(1)
Previous code assumes that every issue has a priority field. If an issue
lacks this field, the code will raise a KeyError.
Now, the code checks if priority exists before accessing it. If priority
is missing, it assigns None instead of crashing. This prevents runtime
errors when processing issues without a priority.

(2)

Also If the "style" field is missing, the code throws a KeyError.
`.get("style", None)` safely retrieves the value if present.

**Issue:** #29875 

**Dependencies:** N/A
2025-02-20 12:10:11 -05:00
am-kinetica
ca7eccba1f
Handled a bug around empty query results differently (#29877)
Thank you for contributing to LangChain!

- [ ] **Handled query records properly**: "community:
vectorstores/kinetica"

- [ ] **Bugfix for empty query results handling**: 
- **Description:** checked for the number of records returned by a query
before processing further
- **Issue:** resulted in an `AttributeError` earlier which has now been
fixed

@efriis
2025-02-20 12:07:49 -05:00
Antonio Pisani
2c403a3ea9
docs: Add langchain-prolog documentation (#29788)
I want to add documentation for a new integration with SWI-Prolog.

@hwchase17 check this out:

https://github.com/apisani1/langchain-prolog/tree/main/examples/travel_agent

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-20 11:50:28 -05:00
Marlene
be7fa920fa
Partner: Azure AI Langchain Docs and Package Registry (#29879)
This PR adds documentation for the Azure AI package in Langchain to the
main mono-repo

No issue connected or updated dependencies.

Utilises existing tests and makes updates to the docs

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-20 14:35:26 +00:00
Hankyeol Kyung
2dd0ce3077
openai: Update reasoning_effort arg documentation (#29897)
**Description:** Update docstring for `reasoning_effort` argument to
specify that it applies to reasoning models only (e.g., OpenAI o1 and
o3-mini), clarifying its supported models.
**Issue:** None
**Dependencies:** None
2025-02-20 09:03:42 -05:00
ccurme
ed3c2bd557
core[patch]: set version="v2" as default in astream_events (#29894) 2025-02-19 23:21:37 +00:00
Fabian Blatz
a2d05a376c
community: ConfluenceLoader: add a filter method for attachments (#29882)
Adds a `attachment_filter_func` parameter to the ConfluenceLoader class
which can be used to determine which files are indexed. This is useful
if you are interested in excluding files based on their media type or
other metadata.
2025-02-19 18:20:45 -05:00
ccurme
9ed47a4d63
community[patch]: release 0.3.18 (#29896) 2025-02-19 20:13:00 +00:00
ccurme
92889edafd
core[patch]: release 0.3.37 (#29895) 2025-02-19 20:04:35 +00:00
ccurme
ffd6194060
core[patch]: de-beta rate limiters (#29891) 2025-02-19 19:19:59 +00:00
ccurme
fb4c8423f0
docs: fix builds (#29890)
Missed in https://github.com/langchain-ai/langchain/pull/29889
2025-02-19 13:35:59 -05:00
ccurme
68b13e5172
pinecone: delete from monorepo (#29889)
This now lives in https://github.com/langchain-ai/langchain-pinecone
2025-02-19 12:55:15 -05:00
Erick Friis
6c1e21d128
core: basemessage.text() (#29078) 2025-02-18 17:45:44 -08:00
Eugene Yurtsev
8e5074d82d
core: release 0.3.36 (#29869)
Release 0.3.36
2025-02-18 19:51:43 +00:00
Vadym Barda
d04fa1ae50
core[patch]: allow passing JSON schema as args_schema to tools (#29812) 2025-02-18 14:44:31 -05:00
ccurme
5034a8dc5c
xai[patch]: release 0.2.1 (#29854) 2025-02-17 14:30:41 -05:00
ccurme
83dcef234d
xai[patch]: support dedicated structured output feature (#29853)
https://docs.x.ai/docs/guides/structured-outputs

Interface appears identical to OpenAI's.
```python
from langchain.chat_models import init_chat_model
from pydantic import BaseModel

class Joke(BaseModel):
    setup: str
    punchline: str

llm = init_chat_model("xai:grok-2").with_structured_output(
    Joke, method="json_schema"
)
llm.invoke("Tell me a joke about cats.")
```
2025-02-17 14:19:51 -05:00
ccurme
9d6fcd0bfb
infra: add xai to scheduled testing (#29852) 2025-02-17 18:59:45 +00:00
ccurme
8a3b05ae69
langchain[patch]: release 0.3.19 (#29851) 2025-02-17 13:36:23 -05:00
ccurme
c9061162a1
langchain[patch]: add xai to extras (#29850) 2025-02-17 17:49:34 +00:00
Bagatur
1acf57e9bd
langchain[patch]: init_chat_model xai support (#29849) 2025-02-17 09:45:39 -08:00
hsm207
037b129b86
weaviate: Add-deprecation-warning (#29757)
- **Description:** add deprecation warning when using weaviate from
langchain_community
  - **Issue:** NA
  - **Dependencies:** NA
  - **Twitter handle:** NA

---------

Signed-off-by: hsm207 <hsm207@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-16 21:42:18 -05:00
Đỗ Quang Minh
cd198ac9ed
community: add custom model for OpenAIWhisperParser (#29831)
Add `model` properties for OpenAIWhisperParser. Defaulted to `whisper-1`
(previous value).
Please help me update the docs and other related components of this
repo.
2025-02-16 21:26:07 -05:00
Cole McIntosh
6874c9c1d0
docs: add notebook for langchain-salesforce package (#29800)
**Description:**  
This PR adds a Jupyter notebook that explains the features,
installation, and usage of the
[`langchain-salesforce`](https://github.com/colesmcintosh/langchain-salesforce)
package. The notebook includes:
- Setup instructions for configuring Salesforce credentials  
- Example code demonstrating common operations such as querying,
describing objects, creating, updating, and deleting records

**Issue:**  
N/A

**Dependencies:**  
No new dependencies are required.

**Tests and Docs:**  
- Added an example notebook demonstrating the usage of the
`langchain-salesforce` package, located in `docs/docs/integrations`.

**Lint and Test:**  
- Ran `make format`, `make lint`, and `make test` successfully.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-16 08:34:57 -05:00
Jan Heimes
60f58df5b3
community: add top_k as param to Needle Retriever (#29821)
Thank you for contributing to LangChain!

- [X] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: 
This PR adds top_k as a param to the Needle Retriever. By default we use
top 10.



- [X] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2025-02-16 08:30:52 -05:00
Jesus Fernandez Bes
1dfac909d8
community: Adding IN Operator to AzureCosmosDBNoSQLVectorStore (#29805)
- ** Description**: I have added a new operator in the operator map with
key `$in` and value `IN`, so that you can define filters using lists as
values. This was already contemplated but as IN operator was not in the
map they cannot be used.
- **Issue**: Fixes #29804.
- **Dependencies**: No extra.
2025-02-15 21:44:54 -05:00
Wahed Hemati
8901b113c3
docs: add Discord integration docs (#29822)
This PR adds documentation for the `langchain-discord-shikenso`
integration, including an example notebook at
`docs/docs/integrations/tools/discord.ipynb` and updates to
`libs/packages.yml` to track the new package.

  **Issue:**  
  N/A

  **Dependencies:**  
  None

  **Twitter handle:**  
  N/A

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-15 21:43:45 -05:00
Krishna Kulkarni
a98c5f1c4b
langchain_community: add image support to DuckDuckGoSearchAPIWrapper (#29816)
- [ ] **PR title**: langchain_community: add image support to
DuckDuckGoSearchAPIWrapper

- **Description:** This PR enhances the DuckDuckGoSearchAPIWrapper
within the langchain_community package by introducing support for image
searches. The enhancement includes:
  - Adding a new method _ddgs_images to handle image search queries.
- Updating the run and results methods to process and return image
search results appropriately.
- Modifying the source parameter to accept "images" as a valid option,
alongside "text" and "news".
- **Dependencies:** No additional dependencies are required for this
change.
2025-02-15 21:32:14 -05:00
Iris Liu
0d9f0b4215
docs: updates Chroma integration API ref docs (#29826)
- Description: updates Chroma integration API ref docs
- Issue: #29817
- Dependencies: N/A
- Twitter handle: @irieliu

Co-authored-by: “Iris <“liuirisny@gmail.com”>
2025-02-15 21:05:21 -05:00
ccurme
3fe7c07394
openai[patch]: release 0.3.6 (#29824) 2025-02-15 13:53:35 -05:00
ccurme
65a6dce428
openai[patch]: enable streaming for o1 (#29823)
Verified streaming works for the `o1-2024-12-17` snapshot as well.
2025-02-15 12:42:05 -05:00
Christophe Bornet
3dffee3d0b
all: Bump blockbuster version to 1.5.18 (#29806)
Has fixes for running on Windows and non-CPython runtimes.
2025-02-14 07:55:38 -08:00
ccurme
d9a069c414
tests[patch]: release 0.3.12 (#29797) 2025-02-13 23:57:44 +00:00
ccurme
e4f106ea62
groq[patch]: remove xfails (#29794)
These appear to pass.
2025-02-13 15:49:50 -08:00
Erick Friis
f34e62ef42
packages: add langchain-xai (#29795)
wasn't registered per the contribution guide:
https://python.langchain.com/docs/contributing/how_to/integrations/
2025-02-13 15:36:41 -08:00
ccurme
49cc6106f7
tests[patch]: fix query for test_tool_calling_with_no_arguments (#29793) 2025-02-13 23:15:52 +00:00
Erick Friis
1a225fad03
multiple: fix uv path deps (#29790)
file:// format wasn't working with updates - it doesn't install as an
editable dep

move to tool.uv.sources with path= instead
2025-02-13 21:32:34 +00:00
Erick Friis
ff13384eb6
packages: update counts, add command (#29789) 2025-02-13 20:45:25 +00:00
HackHuang
76d32754ff
core : update the class docs of InMemoryVectorStore in in_memory.py (#29781)
- **Description:** Add the new introduction about checking `store` in
in_memory.py, It’s necessary and useful for beginners.
```python
Check Documents:
    .. code-block:: python
    
        for doc in vector_store.store.values():
            print(doc)
```

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-13 16:41:47 +00:00
Mohammad Mohtashim
96ad09fa2d
(Community): Added API Key for Jina Search API Wrapper (#29622)
- **Description:** Simple change for adding the API Key for Jina Search
API Wrapper
- **Issue:** #29596
2025-02-12 20:12:07 -08:00
ccurme
f1c66a3040
docs: minor fix to provider table (#29771)
Langfair renders as LangfAIr
2025-02-13 04:06:58 +00:00
Jakub Kopecký
c8cb7c25bf
docs: update apify integration (#29553)
**Description:** Fixed and updated Apify integration documentation to
use the new [langchain-apify](https://github.com/apify/langchain-apify)
package.
**Twitter handle:** @apify
2025-02-12 20:02:55 -08:00
ccurme
16fb1f5371
chroma[patch]: release 0.2.2 (#29769)
Resolves https://github.com/langchain-ai/langchain/issues/29765
2025-02-13 02:39:16 +00:00
Mohammad Mohtashim
2310847c0f
(Chroma): Small Fix in add_texts when checking for embeddings (#29766)
- **Description:** Small fix in `add_texts` to make embedding
nullability is checked properly.
- **Issue:** #29765

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-13 02:26:13 +00:00
Eric Pinzur
716fd89d8e
docs: contributed Graph RAG Retriever integration (#29744)
**Description:** 

This adds the `Graph RAG` Retriever integration documentation, per
https://python.langchain.com/docs/contributing/how_to/integrations/.

* The integration exists in this public repository:
https://github.com/datastax/graph-rag
* We've implemented the standard langchain tests for retrievers:
https://github.com/datastax/graph-rag/blob/main/packages/langchain-graph-retriever/tests/test_langchain.py
* Our integration is published to PyPi:
https://pypi.org/project/langchain-graph-retriever/
2025-02-12 18:25:48 -08:00
Sunish Sheth
f42dafa809
Deprecating sql_database access for creating UC functions for agent tools (#29745)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-02-13 02:24:44 +00:00
Thor 雷神 Schaeff
a0970d8d7e
[WIP] chore: update ElevenLabs tool. (#29722)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-13 01:54:34 +00:00
Chaymae El Aattabi
4b08a7e8e8
Fix #29759: Use local chunk_size_ for looping in embed_documents (#29761)
This fix ensures that the chunk size is correctly determined when
processing text embeddings. Previously, the code did not properly handle
cases where chunk_size was None, potentially leading to incorrect
chunking behavior.

Now, chunk_size_ is explicitly set to either the provided chunk_size or
the default self.chunk_size, ensuring consistent chunking. This update
improves reliability when processing large text inputs in batches and
prevents unintended behavior when chunk_size is not specified.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-13 01:28:26 +00:00
Sunish Sheth
043d78d85d
Deprecate langhchain community ucfunctiontoolkit in favor for databricks_langchain (#29746)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2025-02-12 15:50:35 -08:00
Hugues Chocart
e4eec9e9aa
community: add langchain-abso documentation (#29739)
Add the documentation for the community package `langchain-abso`. It
provides a new Chat Model class, that uses https://abso.ai

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2025-02-12 19:57:33 +00:00
ccurme
e61f463745
core[patch]: release 0.3.35 (#29764) 2025-02-12 18:13:10 +00:00
Nuno Campos
fe59f2cc88
core: Fix output of convert_messages when called with BaseMessage.model_dump() (#29763)
- additional_kwargs was being nested twice
- example, response_metadata was placed inside additional_kwargs
2025-02-12 10:05:33 -08:00
Jacob Lee
f4e3e86fbb
feat(langchain): Infer o3 modelstrings passed to init_chat_model as OpenAI (#29743) 2025-02-11 16:51:41 -08:00
Mohammad Mohtashim
9f3bcee30a
(Community): Adding Structured Support for ChatPerplexity (#29361)
- **Description:** Adding Structured Support for ChatPerplexity
- **Issue:** #29357
- This is implemented as per the Perplexity official docs:
https://docs.perplexity.ai/guides/structured-outputs

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-02-11 15:51:18 -08:00
Jawahar S
994c5465e0
feat: add support for IBM WatsonX AI chat models (#29688)
**Description:** Updated init_chat_model to support Granite models
deployed on IBM WatsonX
**Dependencies:**
[langchain-ibm](https://github.com/langchain-ai/langchain-ibm)

Tagging @baskaryan @efriis for review when you get a chance.
2025-02-11 15:34:29 -08:00
Shailendra Mishra
c7d74eb7a3
Oraclevs integration (#29723)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"
  community: langchain_community/vectorstore/oraclevs.py


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Refactored code to allow a connection or a connection
pool.
- **Issue:** Normally an idel connection is terminated by the server
side listener at timeout. A user thus has to re-instantiate the vector
store. The timeout in case of connection is not configurable. The
solution is to use a connection pool where a user can specify a user
defined timeout and the connections are managed by the pool.
    - **Dependencies:** None
    - **Twitter handle:** 


- [ ] **Add tests and docs**: This is not a new integration. A user can
pass either a connection or a connection pool. The determination of what
is passed is made at run time. Everything should work as before.

- [ ] **Lint and test**:  Already done.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-11 14:56:55 -08:00
ccurme
42ebf6ae0c
deepseek[patch]: release 0.1.2 (#29742) 2025-02-11 11:53:43 -08:00
ccurme
ec55553807
pinecone[patch]: release 0.2.3 (#29741) 2025-02-11 19:27:39 +00:00
ccurme
001cf99253
pinecone[patch]: add support for python 3.13 (#29737) 2025-02-11 11:20:21 -08:00
ccurme
ba8f752bf5
openai[patch]: release 0.3.5 (#29740) 2025-02-11 19:20:11 +00:00
ccurme
9477f49409
openai, deepseek: make _convert_chunk_to_generation_chunk an instance method (#29731)
1. Make `_convert_chunk_to_generation_chunk` an instance method on
BaseChatOpenAI
2. Override on ChatDeepSeek to add `"reasoning_content"` to message
additional_kwargs.

Resolves https://github.com/langchain-ai/langchain/issues/29513
2025-02-11 11:13:23 -08:00
ccurme
d0c2dc06d5
mongodb[patch]: fix link in readme (#29738) 2025-02-11 18:19:59 +00:00
zzaebok
3b3d52206f
community: change wikidata rest api version from v0 to v1 (#29708)
**Description:**

According to the [wikidata
documentation](https://www.wikidata.org/wiki/Wikidata_talk:REST_API),
Wikibase REST API version 1 (stable) is released from November 11, 2024.
Their guide is to use the new v1 API and, it just requires replacing v0
in the routes with v1 in almost all cases.
So I replaced WIKIDATA_REST_API_URL from v0 to v1 for stable usage.

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-02-10 17:12:38 -08:00
ccurme
4a389ef4c6
community: fix extended testing (#29715)
v0.3.100 of premai sdk appears to break on import:
89d9276cbf/premai/api/__init__.py (L230)
2025-02-10 16:57:34 -08:00
Bhav Sardana
624216aa64
community:Fix for Pydantic model validator of GoogleApiYoutubeLoader (#29694)
- **Description:** Community: bugfix for pedantic model validator for
GoogleApiYoutubeLoader
- **Issue:** #29165, #27432 
Fix is similar to #29346
2025-02-10 08:57:58 -05:00
Changyong Um
60740c44c5
community: Add configurable text key for indexing and the retriever in Pinecone Hybrid Search (#29697)
**issue**

In Langchain, the original content is generally stored under the `text`
key. However, the `PineconeHybridSearchRetriever` searches the `context`
field in the metadata and cannot change this key. To address this, I
have modified the code to allow changing the key to something other than
context.

In my opinion, following Langchain's conventions, the `text` key seems
more appropriate than `context`. However, since I wasn't sure about the
author's intent, I have left the default value as `context`.
2025-02-10 08:56:37 -05:00
manukychen
3de445d521
using getattr and default value to prevent 'OpenSearchVectorSearch' has no attribute 'bulk_size' (#29682)
- Description: Adding getattr methods and set default value 500 to
cls.bulk_size, it can prevent the error below:
Error: type object 'OpenSearchVectorSearch' has no attribute 'bulk_size'

- Issue: https://github.com/langchain-ai/langchain/issues/29071
2025-02-08 14:39:57 -05:00
Yao Tianjia
5d581ba22c
langchain: support the situation when action_input is null in json output_parser (#29680)
Description:
This PR fixes handling of null action_input in
[langchain.agents.output_parser]. Previously, passing null to
action_input could cause OutputParserException with unclear error
message which cause LLM don't know how to modify the action. The changes
include:

Added null-check validation before processing action_input
Implemented proper fallback behavior with default values
Maintained backward compatibility with existing implementations

Error Examples:
```
{
  "action":"some action",
  "action_input":null
}
```

Issue:
None

Dependencies:
None
2025-02-07 22:01:01 -05:00
Philippe PRADOS
beb75b2150
community[minor]: 05 - Refactoring PyPDFium2 parser (#29625)
This is one part of a larger Pull Request (PR) that is too large to be
submitted all at once. This specific part focuses on updating the
PyPDFium2 parser.

For more details, see
https://github.com/langchain-ai/langchain/pull/28970.
2025-02-07 21:31:12 -05:00
Christophe Bornet
723031d548
community: Bump ruff version to 0.9 (#29206)
Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-08 01:21:10 +00:00
Christophe Bornet
30f6c9f5c8
community: Use Blockbuster to detect blocking calls in asyncio during tests (#29609)
Same as https://github.com/langchain-ai/langchain/pull/29043 for
langchain-community.

**Dependencies:**
- blockbuster (test)

**Twitter handle:** cbornet_

Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-08 01:10:39 +00:00
Christophe Bornet
3a57a28daa
langchain: Use Blockbuster to detect blocking calls in asyncio during tests (#29616)
Same as https://github.com/langchain-ai/langchain/pull/29043 for the
langchain package.

**Dependencies:**
- blockbuster (test)

**Twitter handle:** cbornet_

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-08 01:08:15 +00:00
Keenan Pepper
c67d473397
core: Make abatch_as_completed respect max_concurrency (#29426)
- **Description:** Add tests for respecting max_concurrency and
implement it for abatch_as_completed so that test passes
- **Issue:** #29425
- **Dependencies:** none
- **Twitter handle:** keenanpepper
2025-02-07 16:51:22 -08:00
Aaron V
dcfaae85d2
Core: Fix __add__ for concatting two BaseMessageChunk's (#29531)
Description:

The change allows you to use the overloaded `+` operator correctly when
`+`ing two BaseMessageChunk subclasses. Without this you *must*
instantiate a subclass for it to work.

Which feels... wrong. Base classes should be decoupled from sub classes
and should have in no way a dependency on them.

Issue:

You can't `+` a BaseMessageChunk with a BaseMessageChunk

e.g. this will explode

```py
from langchain_core.outputs import (
    ChatGenerationChunk,
)
from langchain_core.messages import BaseMessageChunk


chunk1 = ChatGenerationChunk(
    message=BaseMessageChunk(
        type="customChunk",
        content="HI",
    ),
)

chunk2 = ChatGenerationChunk(
    message=BaseMessageChunk(
        type="customChunk",
        content="HI",
    ),
)

# this will throw
new_chunk = chunk1 + chunk2
```

In case anyone ran into this issue themselves, it's probably best to use
the AIMessageChunk:

a la 

```py
from langchain_core.outputs import (
    ChatGenerationChunk,
)
from langchain_core.messages import AIMessageChunk


chunk1 = ChatGenerationChunk(
    message=AIMessageChunk(
        content="HI",
    ),
)

chunk2 = ChatGenerationChunk(
    message=AIMessageChunk(
        content="HI",
    ),
)

# No explosion!
new_chunk = chunk1 + chunk2
```

Dependencies:

None!

Twitter handle: 
`aaron_vogler`

Keeping these for later if need be:
```
baskaryan
efriis 
eyurtsev
ccurme 
vbarda
hwchase17
baskaryan
efriis
```

Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-08 00:43:36 +00:00
Marlene
4fa3ef0d55
Community/Partner: Adding Azure community and partner user agent to better track usage in Python (#29561)
- This pull request includes various changes to add a `user_agent`
parameter to Azure OpenAI, Azure Search and Whisper in the Community and
Partner packages. This helps in identifying the source of API requests
so we can better track usage and help support the community better. I
will also be adding the user_agent to the new `langchain-azure` repo as
well.

- No issue connected or  updated dependencies. 
- Utilises existing tests and docs

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-07 23:28:30 +00:00
Ella Charlaix
c401254770
huggingface: Add ipex support to HuggingFaceEmbeddings (#29386)
ONNX and OpenVINO models are available by specifying the `backend`
argument (the model is loaded using `optimum`
https://github.com/huggingface/optimum)

```python
from langchain_huggingface import HuggingFaceEmbeddings

embedding = HuggingFaceEmbeddings(
    model_name=model_id,
    model_kwargs={"backend": "onnx"},
)
```

With this PR we also enable the IPEX backend 



```python
from langchain_huggingface import HuggingFaceEmbeddings

embedding = HuggingFaceEmbeddings(
    model_name=model_id,
    model_kwargs={"backend": "ipex"},
)
```
2025-02-07 15:21:09 -08:00
Bruno Alvisio
3eaf561561
core: Handle unterminated escape character when parsing partial JSON (#29065)
**Description**
Currently, when parsing a partial JSON, if a string ends with the escape
character, the whole key/value is removed. For example:

```
>>> from langchain_core.utils.json import parse_partial_json
>>> my_str = '{"foo": "bar", "baz": "qux\\'
>>> 
>>> parse_partial_json(my_str)
{'foo': 'bar'}
```

My expectation (and with this fix) would be for `parse_partial_json()`
to return:
```
>>> from langchain_core.utils.json import parse_partial_json
>>> 
>>> my_str = '{"foo": "bar", "baz": "qux\\'
>>> parse_partial_json(my_str)
{'foo': 'bar', 'baz': 'qux'}
```

Notes:
1. It could be argued that current behavior is still desired.
2. I have experienced this issue when the streaming output from an LLM
and the chunk happens to end with `\\`
3. I haven't included tests. Will do if change is accepted.
4. This is specially troublesome when this function is used by

187131c55c/libs/core/langchain_core/output_parsers/transform.py (L111)

since what happens is that, for example, if the received sequence of
chunks are: `{"foo": "b` , `ar\\` :

Then, the result of calling `self.parse_result()` is:
```
{"foo": "b"}
```
and the second time:
```
{}
```

Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-07 23:18:21 +00:00
Viren
252cf0af10
docs: add LangFair as a provider (#29390)
**Description:**
- Add `docs/docs/providers/langfair.mdx`
- Register langfair in `libs/packages.yml`

**Twitter handle:** @LangFair

**Tests and docs**
1. Integration tests not needed as this PR only adds a .mdx file to
docs.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Dylan Bouchard <dylan.bouchard@cvshealth.com>
Co-authored-by: Dylan Bouchard <109233938+dylanbouchard@users.noreply.github.com>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-07 21:27:37 +00:00
Erick Friis
eb9eddae0c
docs: use init_chat_model (#29623) 2025-02-07 12:39:27 -08:00
ccurme
bff25b552c
community: release 0.3.17 (#29676) 2025-02-07 19:41:44 +00:00
ccurme
01314c51fa
langchain: release 0.3.18 (#29654) 2025-02-07 13:40:26 -05:00
ccurme
92e2239414
openai[patch]: make parallel_tool_calls explicit kwarg of bind_tools (#29669)
Improves discoverability and documentation.

cc @vbarda
2025-02-07 13:34:32 -05:00
Marc Ammann
5690575f13
openai: Removed tool_calls from completion chunk after other chunks have already been sent. (#29649)
- **Description:** Before sending a completion chunk at the end of an
OpenAI stream, removing the tool_calls as those have already been sent
as chunks.
- **Issue:** -
- **Dependencies:** -
- **Twitter handle:** -

@ccurme as mentioned in another PR

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-07 10:15:52 -05:00
Ikko Eltociear Ashimine
0d45ad57c1
community: update base_o365.py (#29657)
extention -> extension
2025-02-07 08:43:29 -05:00
Vincent Emonet
3645181d0e
qdrant: Add similarity_search_with_score_by_vector() function to the QdrantVectorStore (#29641)
Added `similarity_search_with_score_by_vector()` function to the
`QdrantVectorStore` class.

It is required when we want to query multiple time with the same
embeddings. It was present in the now deprecated original `Qdrant`
vectorstore implementation, but was absent from the new one. It is also
implemented in a number of others `VectorStore` implementations

I have added tests for this new function

Note that I also argued in this discussion that it should be part of the
general `VectorStore`
https://github.com/langchain-ai/langchain/discussions/29638

Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-07 00:55:58 +00:00
ccurme
488cb4a739
anthropic: release 0.3.7 (#29653) 2025-02-06 17:05:57 -05:00
ccurme
ab09490c20
openai: release 0.3.4 (#29652) 2025-02-06 17:02:21 -05:00
ccurme
29a0c38cc3
openai[patch]: add test for message.name (#29651) 2025-02-06 16:49:28 -05:00
ccurme
91cca827c0
tests: release 0.3.11 (#29648) 2025-02-06 21:48:09 +00:00
Sunish Sheth
25ce1e211a
docs: Updating the imports for langchain-databricks to databricks-langchain (#29646)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2025-02-06 13:28:07 -08:00
ccurme
e1b593ae77
text-splitters[patch]: release 0.3.6 (#29647) 2025-02-06 16:16:05 -05:00
ccurme
a91e58bc10
core: release 0.3.34 (#29644) 2025-02-06 15:53:56 -05:00
Vincent Emonet
08b9eaaa6f
community: improve FastEmbedEmbeddings support for ONNX execution provider (e.g. GPU) (#29645)
I made a change to how was implemented the support for GPU in
`FastEmbedEmbeddings` to be more consistent with the existing
implementation `langchain-qdrant` sparse embeddings implementation

It is directly enabling to provide the list of ONNX execution providers:
https://github.com/langchain-ai/langchain/blob/master/libs/partners/qdrant/langchain_qdrant/fastembed_sparse.py#L15

It is a bit less clear to a user that just wants to enable GPU, but
gives more capabilities to work with other execution providers that are
not the `CUDAExecutionProvider`, and is more future proof

Sorry for the disturbance @ccurme

> Nice to see you just moved to `uv`! It is so much nicer to run
format/lint/test! No need to manually rerun the `poetry install` with
all required extras now
2025-02-06 15:31:23 -05:00
ccurme
3450bfc806
infra: add UV_FROZEN to makefiles (#29642)
These are set in Github workflows, but forgot to add them to most
makefiles for convenience when developing locally.

`uv run` will automatically sync the lock file. Because many of our
development dependencies are local installs, it will pick up version
changes and update the lock file. Passing `--frozen` or setting this
environment variable disables the behavior.
2025-02-06 14:36:54 -05:00
ccurme
d172984c91
infra: migrate to uv (#29566) 2025-02-06 13:36:26 -05:00
ccurme
9da06e6e94
standard-tests[patch]: use has_structured_output property to engage structured output tests (#29635)
Motivation: dedicated structured output features are becoming more
common, such that integrations can support structured output without
supporting tool calling.

Here we make two changes:

1. Update the `has_structured_output` method to default to True if a
model supports tool calling (in addition to defaulting to True if
`with_structured_output` is overridden).
2. Update structured output tests to engage if `has_structured_output`
is True.
2025-02-06 10:09:06 -08:00
Vincent Emonet
db8201d4da
community: fix typo in the module imported when using GPU with FastEmbedEmbeddings (#29631)
Made a mistake in the module to import (the module stay the same only
the installed package changes), fixed it and tested it

https://github.com/langchain-ai/langchain/pull/29627

@ccurme
2025-02-06 10:26:08 -05:00
Mohammed Abbadi
f8fd65dea2
community: Update deeplake.py (#29633)
Deep Lake recently released version 4, which introduces significant
architectural changes, including a new on-disk storage format, enhanced
indexing mechanisms, and improved concurrency. However, LangChain's
vector store integration currently does not support Deep Lake v4 due to
breaking API changes.

Previously, the installation command was:
`pip install deeplake[enterprise]`
This installs the latest available version, which now defaults to Deep
Lake v4. Since LangChain's vector store integration is still dependent
on v3, this can lead to compatibility issues when using Deep Lake as a
vector database within LangChain.

To ensure compatibility, the installation command has been updated to:
`pip install deeplake[enterprise]<4.0.0`
This constraint ensures that pip installs the latest available version
of Deep Lake within the v3 series while avoiding the incompatible v4
update.
2025-02-06 10:25:13 -05:00
Vincent Emonet
0ac5536f04
community: add support for using GPUs with FastEmbedEmbeddings (#29627)
- **Description:** add a `gpu: bool = False` field to the
`FastEmbedEmbeddings` class which enables to use GPU (through ONNX CUDA
provider) when generating embeddings with any fastembed model. It just
requires the user to install a different dependency and we use a
different provider when instantiating `fastembed.TextEmbedding`
- **Issue:** when generating embeddings for a really large amount of
documents this drastically increase performance (honestly that is a must
have in some situations, you can't just use CPU it is way too slow)
- **Dependencies:** no direct change to dependencies, but internally the
users will need to install `fastembed-gpu` instead of `fastembed`, I
made all the changes to the init function to properly let the user know
which dependency they should install depending on if they enabled `gpu`
or not
 
cf. fastembed docs about GPU for more details:
https://qdrant.github.io/fastembed/examples/FastEmbed_GPU/

I did not added test because it would require access to a GPU in the
testing environment
2025-02-06 08:04:19 -05:00
Dmitrii Rashchenko
0ceda557aa
add o1 and o3-mini to pricing (#29628)
### PR Title:  
**community: add latest OpenAI models pricing**  

### Description:  
This PR updates the OpenAI model cost calculation mapping by adding the
latest OpenAI models, **o1 (non-preview)** and **o3-mini**, based on the
pricing listed on the [OpenAI pricing
page](https://platform.openai.com/docs/pricing).

### Changes:  
- Added pricing for `o1`, `o1-2024-12-17`, `o1-cached`, and
`o1-2024-12-17-cached` for input tokens.
- Added pricing for `o1-completion` and `o1-2024-12-17-completion` for
output tokens.
- Added pricing for `o3-mini`, `o3-mini-2025-01-31`, `o3-mini-cached`,
and `o3-mini-2025-01-31-cached` for input tokens.
- Added pricing for `o3-mini-completion` and
`o3-mini-2025-01-31-completion` for output tokens.

### Issue:  
N/A  

### Dependencies:  
None  

### Testing & Validation:  
- No functional changes outside of updating the cost mapping.  
- No tests were added or modified.
2025-02-06 08:02:20 -05:00
ZhangShenao
ac53977dbc
[MistralAI] Improve MistralAIEmbeddings (#29242)
- Add static method decorator for method.
- Add expected exception for retry decorator

#29125
2025-02-05 21:31:54 -05:00
Andrew Wason
22aa5e07ed
standard-tests: Fix ToolsIntegrationTests to correctly handle "content_and_artifact" tools (#29391)
**Description:**

The response from `tool.invoke()` is always a ToolMessage, with content
and artifact fields, not a tuple.
The tuple is converted to a ToolMessage here

b6ae7ca91d/libs/core/langchain_core/tools/base.py (L726)

**Issue:**

Currently `ToolsIntegrationTests` requires `invoke()` to return a tuple
and so standard tests fail for "content_and_artifact" tools. This fixes
that to check the returned ToolMessage.

This PR also adds a test that now passes.
2025-02-05 21:27:09 -05:00
Mohammad Anash
f849305a56
fixed Bug in PreFilter of AzureCosmosDBNoSqlVectorSearch (#29613)
Description: Fixes PreFilter value handling in Azure Cosmos DB NoSQL
vectorstore. The current implementation fails to handle numeric values
in filter conditions, causing an undefined value variable error. This PR
adds support for numeric, boolean, and NULL values while maintaining the
existing string and list handling.

Changes:
Added handling for numeric types (int/float)
Added boolean value support
Added NULL value handling
Added type validation for unsupported values
Fixed scope of value variable initialization

Issue: 
Fixes #29610

Implementation Notes:
No changes to public API
Backwards compatible
Maintains consistent behavior with existing MongoDB-style filtering
Preserves SQL injection prevention through proper value handling

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-06 02:20:26 +00:00
Philippe PRADOS
6ff0d5c807
community[minor]: 04 - Refactoring PDFMiner parser (#29526)
This is one part of a larger Pull Request (PR) that is too large to be
submitted all at once. This specific part focuses on updating the XXX
parser.

For more details, see [PR
28970](https://github.com/langchain-ai/langchain/pull/28970).

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-02-05 21:08:27 -05:00
Isaac Francisco
91ffd7caad
core: allow passing message dicts into ChatPromptTemplate (#29363)
Co-authored-by: Erick Friis <erick@langchain.dev>
2025-02-05 09:45:52 -08:00
ccurme
69595b0914
docs: fix builds (#29607)
Failing with:
> ValueError: Provider page not found for databricks-langchain. Please
add one at docs/integrations/providers/databricks-langchain.{mdx,ipynb}
2025-02-05 14:24:53 +00:00
ccurme
91a33a9211
anthropic[patch]: release 0.3.6 (#29606) 2025-02-05 14:18:02 +00:00
ccurme
5cbe6aba8f
anthropic[patch]: support citations in streaming (#29591) 2025-02-05 09:12:07 -05:00
William FH
5ae4ed791d
Drop duplicate inputs (#29589) 2025-02-04 18:06:10 -08:00
Erick Friis
65f0deb81a
packages: databricks-langchain (#29593) 2025-02-05 01:53:34 +00:00
Yoav Levy
621bba7e26
docs: add nimble as a provider (#29579)
## Description:

- Add docs/docs/providers/nimbleway.ipynb
- Add docs/docs/integrations/retrievers/nimbleway.ipynb
- Register nimbleway in libs/packages.yml

- X (twitter) handle: @urielkn / @LevyNorbit8
2025-02-04 16:47:03 -08:00
Erick Friis
50d61eafa2
partners/deepseek: release 0.1.1 (#29592) 2025-02-04 23:46:38 +00:00
Erick Friis
7edfcbb090
docs: rename to langchain-deepseek in docs (#29587) 2025-02-04 14:22:17 -08:00
Erick Friis
df8fa882b2
deepseek: bump core (#29584) 2025-02-04 10:25:46 -08:00
Erick Friis
455f65947a
deepseek: rename to langchain-deepseek from langchain-deepseek-official (#29583) 2025-02-04 17:57:25 +00:00
Philippe PRADOS
5771e561fb
[Bugfix langchain_community] Fix PyMuPDFLoader (#29550)
- **Description:**  add legacy properties
    - **Issue:** #29470
    - **Twitter handle:** pprados
2025-02-04 09:24:40 -05:00
Ashutosh Kumar
65b404a2d1
[oci_generative_ai] Option to pass auth_file_location (#29481)
**PR title**: "community: Option to pass auth_file_location for
oci_generative_ai"

**Description:** Option to pass auth_file_location, to overwrite config
file default location "~/.oci/config" where profile name configs
present. This is not fixing any issues. Just added optional parameter
called "auth_file_location", which internally supported by any OCI
client including GenerativeAiInferenceClient.
2025-02-03 21:44:13 -05:00
Teruaki Ishizaki
aeb42dc900
partners: Fixed the procedure of initializing pad_token_id (#29500)
- **Description:** Add to check pad_token_id and eos_token_id of model
config. It seems that this is the same bug as the HuggingFace TGI bug.
It's same bug as #29434
- **Issue:** #29431
- **Dependencies:** none
- **Twitter handle:** tell14

Example code is followings:
```python
from langchain_huggingface.llms import HuggingFacePipeline

hf = HuggingFacePipeline.from_model_id(
    model_id="meta-llama/Llama-3.2-3B-Instruct",
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 10},
)

from langchain_core.prompts import PromptTemplate

template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate.from_template(template)

chain = prompt | hf

question = "What is electroencephalography?"

print(chain.invoke({"question": question}))
```
2025-02-03 21:40:33 -05:00
AmirPoursaberi
a6efd22ba1
Fix a tiny typo in create_retrieval_chain docstring (#29552)
Hi there!

To fix a tiny typo in `create_retrieval_chain` docstring.
2025-02-03 10:54:49 -05:00
Hemant Rawat
db1693aa70
community: fix issue #29429 in age_graph.py (#29506)
## Description:

This PR addresses issue #29429 by fixing the _wrap_query method in
langchain_community/graphs/age_graph.py. The method now correctly
handles Cypher queries with UNION and EXCEPT operators, ensuring that
the fields in the SQL query are ordered as they appear in the Cypher
query. Additionally, the method now properly handles cases where RETURN
* is not supported.

### Issue: #29429

### Dependencies: None


### Add tests and docs:

Added unit tests in tests/unit_tests/graphs/test_age_graph.py to
validate the changes.
No new integrations were added, so no example notebook is necessary.
Lint and test:

Ran make format, make lint, and make test to ensure code quality and
functionality.
2025-02-01 21:24:45 -05:00
Keenan Pepper
2f97916dea
docs: Add goodfire notebook and add to packages.yml (#29512)
- **Description:** Add Goodfire ipynb notebook and add
langchain-goodfire package to packages.yml
- **Issue:** n/a
- **Dependencies:** docs only
- **Twitter handle:** keenanpepper

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-01 19:43:20 -05:00
ccurme
a3c5e4d070
deepseek[patch]: bump langchain-openai and add to scheduled testing (#29535) 2025-02-01 18:40:59 -05:00
ccurme
16a422f3fa
community: add standard tests for Perplexity (#29534) 2025-02-01 17:02:57 -05:00
Amit Ghadge
0c405245c4
[Integrations][Tool] Added Jenkins tools support (#29516)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-01-31 12:50:10 -05:00
Christophe Bornet
aab2e42169
core[patch]: Use Blockbuster to detect blocking calls in asyncio during tests (#29043)
This PR uses the [blockbuster](https://github.com/cbornet/blockbuster)
library in langchain-core to detect blocking calls made in the asyncio
event loop during unit tests.
Avoiding blocking calls is hard as these can be deeply buried in the
code or made in 3rd party libraries.
Blockbuster makes it easier to detect them by raising an exception when
a call is made to a known blocking function (eg: `time.sleep`).

Adding blockbuster allowed to find a blocking call in
`aconfig_with_context` (it ends up calling `get_function_nonlocals`
which loads function code).

**Dependencies:**
- blockbuster (test)

**Twitter handle:** cbornet_
2025-01-31 10:06:34 -05:00
Philippe PRADOS
ceda8bc050
community[minor]: 03 - Refactoring PyPDF parser (#29330)
This is one part of a larger Pull Request (PR) that is too large to be
submitted all at once.
This specific part focuses on updating the PyPDF parser.

For more details, see [PR
28970](https://github.com/langchain-ai/langchain/pull/28970).
2025-01-31 10:05:07 -05:00
Julian Castro Pulgarin
b7e3e337b1
community: Fix YahooFinanceNewsTool to handle updated yfinance data structure (#29498)
*Description:**
Updates the YahooFinanceNewsTool to handle the current yfinance news
data structure. The tool was failing with a KeyError due to changes in
the yfinance API's response format. This PR updates the code to
correctly extract news URLs from the new structure.

**Issue:** #29495

**Dependencies:** 
No new dependencies required. Works with existing yfinance package.

The changes maintain backwards compatibility while fixing the KeyError
that users were experiencing.

The modified code properly handles the new data structure where:
- News type is now at `content.contentType`
- News URL is now at `content.canonicalUrl.url`

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-01-31 02:31:44 +00:00
Erick Friis
332e303858
partners/mistralai: release 0.2.6 (#29491) 2025-01-29 22:23:14 +00:00
Erick Friis
2c795f5628
partners/openai: release 0.3.3 (#29490) 2025-01-29 22:23:03 +00:00
Erick Friis
f307b3cc5f
langchain: release 0.3.17 (#29485) 2025-01-29 22:22:49 +00:00
Erick Friis
5cad3683b4
partners/groq: release 0.2.4 (#29488) 2025-01-29 22:22:30 +00:00
Erick Friis
e074c26a6b
partners/fireworks: release 0.2.7 (#29487) 2025-01-29 22:22:18 +00:00
Erick Friis
685609e1ef
partners/anthropic: release 0.3.5 (#29486) 2025-01-29 22:22:11 +00:00
Erick Friis
ed3a5e664c
standard-tests: release 0.3.10 (#29484) 2025-01-29 22:21:05 +00:00