Commit Graph

7401 Commits

Author SHA1 Message Date
Mason Daugherty
b2099e15c6
Merge branch 'master' into eugene/expose_messages 2025-08-11 18:22:21 -04:00
Anderson
166c027434
docs: add scrapeless integration documentation (#32081)
Thank you for contributing to LangChain! 
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
  - Example: "core: add foobar LLM"

- **Description:** Integrated the Scrapeless package to enable Langchain
users to seamlessly incorporate Scrapeless into their agents.
- **Dependencies:** None
- **Twitter handle:** [Scrapelessteam](https://x.com/Scrapelessteam)

- [x] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.

Additional guidelines:

- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-08-11 22:16:15 +00:00
GDanksAnchor
4a2a3fcd43
docs: add anchorbrowser (#32494)
# Description

This PR updates the docs for the
[langchain-anchorbrowser](https://pypi.org/project/langchain-anchorbrowser/)
package. It adds a few tools

[Anchor Browser](https://anchorbrowser.io/?utm=langchain) is the
platform for AI Agentic browser automation, which solves the challenge
of automating workflows for web applications that lack APIs or have
limited API coverage. It simplifies the creation, deployment, and
management of browser-based automations, transforming complex web
interactions into simple API endpoints.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-08-11 21:48:10 +00:00
Anubhav Dhawan
1e38fd2ce3
docs: add integration guide for MCP Toolbox (#32344)
This PR introduces a new integration guide for MCP Toolbox. The primary
goal of this new documentation is to enhance the discoverability of MCP
Toolbox for developers working within the LangChain ecosystem, providing
them with a clear and direct path to using our tools.

This approach was chosen to provide users with a practical, hands-on
example that they can easily follow.

> [!NOTE]
> The page added in this PR is linked to from a section in Google
partners page added in #32356.

---------

Co-authored-by: Lauren Hirata Singh <lauren@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-11 21:03:38 +00:00
Mason Daugherty
5ccdcd7b7b
feat(ollama): docs updates (#32507) 2025-08-11 15:39:44 -04:00
Mason Daugherty
ee4c2510eb
feat: port various nit changes from wip-v0.4 (#32506)
Lots of work that wasn't directly related to core
improvements/messages/testing functionality
2025-08-11 15:09:08 -04:00
Mason Daugherty
e5d0a4e4d6
feat(standard-tests): formatting (#32504)
Not touching `pyproject.toml` or chat model related items as to not
interfere with work in wip0.4 branch
2025-08-11 13:30:30 -04:00
Mason Daugherty
457ce9c4b0
feat(text-splitters): ruff fixes and rules (#32502) 2025-08-11 13:28:22 -04:00
Mason Daugherty
27b6b53f20
feat(xai): ruff fixes and rules (#32501) 2025-08-11 13:03:07 -04:00
Christophe Bornet
f55186b38f
fix(core): fix beta decorator for properties (#32497) 2025-08-11 12:43:53 -04:00
Mason Daugherty
374f414c91
feat(qdrant): ruff fixes and rules (#32500) 2025-08-11 12:43:41 -04:00
ccurme
9259eea846
fix(docs): use pepy for integration package download badges (#32491)
pypi stats has been down for some time.
2025-08-10 18:41:36 -04:00
ccurme
afcb097ef5
fix(docs): DigitalOcean Gradient: link to correct provider page and update page title (#32490) 2025-08-10 17:29:44 -04:00
ccurme
088095b663
release(openai): release 0.3.29 (#32463) 2025-08-08 11:04:33 -04:00
Mason Daugherty
c31236264e
chore: formatting across codebase (#32466) 2025-08-08 10:20:10 -04:00
ccurme
02001212b0
fix(openai): revert some changes (#32462)
Keep coverage on `output_version="v0"` (increasing coverage is being
managed in v0.4 branch).
2025-08-08 08:51:18 -04:00
Mason Daugherty
00244122bd
feat(openai): minimal and verbosity (#32455) 2025-08-08 02:24:21 +00:00
ccurme
6727d6e8c8
release(core): 0.3.74 (#32454) 2025-08-07 16:39:01 -04:00
Michael Matloka
5036bd7adb
fix(openai): don't crash get_num_tokens_from_messages on gpt-5 (#32451) 2025-08-07 16:33:19 -04:00
ccurme
ec2b34a02d
feat(openai): custom tools (#32449) 2025-08-07 16:30:01 -04:00
Mason Daugherty
145d38f7dd
test(openai): add tests for prompt_cache_key parameter and update docs (#32363)
Introduce tests to validate the behavior and inclusion of the
`prompt_cache_key` parameter in request payloads for the `ChatOpenAI`
model.
2025-08-07 15:29:47 -04:00
ccurme
68c70da33e
fix(openai): add in output_text (#32450)
This property was deleted in `openai==1.99.2`.
2025-08-07 15:23:56 -04:00
Eugene Yurtsev
754528d23f
feat(langchain): add stuff and map reduce chains (#32333)
* Add stuff and map reduce chains
* We'll need to rename and add unit tests to the chains prior to
official release
2025-08-07 15:20:05 -04:00
Christophe Bornet
a647073b26
feat(standard-tests): add a property to set the name of the parameter for the number of results to return (#32443)
Not all retrievers use `k` as param name to set the number of results to
return. Even in LangChain itself. Eg:
bc4251b9e0/libs/core/langchain_core/indexing/in_memory.py (L31)

So it's helpful to be able to change it for a given retriever.
The change also adds hints to disable the tests if the retriever doesn't
support setting the param in the constructor or in the invoke method
(for instance, the `InMemoryDocumentIndex` in the link supports in the
constructor but not in the invoke method).

This change is backward compatible.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-07 11:22:24 -04:00
ccurme
06d8754b0b
release(core): 0.3.73 (#32446) 2025-08-07 09:03:53 -04:00
ccurme
6e108c1cb4
feat(core): zero-out token costs for cache hits (#32437) 2025-08-07 08:49:34 -04:00
John Bledsoe
bc4251b9e0
fix(core): fix index checking when merging lists (#32431)
**Description:** fix an issue I discovered when attempting to merge
messages in which one message has an `index` key in its content
dictionary and another does not.
2025-08-06 12:47:33 -04:00
Mason Daugherty
ba83f58141
release(groq): 0.3.7 (#32417) 2025-08-05 15:13:08 -04:00
Mason Daugherty
fb490b0c39
feat(groq): losen restrictions on reasoning_effort, inject effort in meta, update tests (#32415) 2025-08-05 15:03:38 -04:00
Mason Daugherty
419c173225
feat(groq): openai-oss (#32411)
use new openai-oss for integration tests, set module-level testing model
names and improve robustness of tool tests
2025-08-05 14:18:56 -04:00
Narasimha Badrinath
dd9f5d7cde
feat(docs): add langchain-gradientai as provider (#32202)
langchain-gradientai is Digitalocean's integration with Langchain. It
will help users to build langchain applications using Digitalocean's
GradientAI platform.

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-04 14:57:59 +00:00
ccurme
a9e52ca605
chore(openai): bump openai sdk (#32322) 2025-07-30 10:58:18 -04:00
Mason Daugherty
fbd5a238d8
fix(core): revert "fix: tool call streaming bug with inconsistent indices from Qwen3" (#32307)
Reverts langchain-ai/langchain#32160

Original issue stems from using `ChatOpenAI` to interact with a `qwen`
model. Recommended to use
[langchain-qwq](https://python.langchain.com/docs/integrations/chat/qwq/)
which is built for Qwen
2025-07-29 10:26:38 -04:00
Mason Daugherty
0e287763cd
fix: lint 2025-07-28 18:49:43 -04:00
Copilot
0b56c1bc4b
fix: tool call streaming bug with inconsistent indices from Qwen3 (#32160)
Fixes a streaming bug where models like Qwen3 (using OpenAI interface)
send tool call chunks with inconsistent indices, resulting in
duplicate/erroneous tool calls instead of a single merged tool call.

## Problem

When Qwen3 streams tool calls, it sends chunks with inconsistent `index`
values:
- First chunk: `index=1` with tool name and partial arguments  
- Subsequent chunks: `index=0` with `name=None`, `id=None` and argument
continuation

The existing `merge_lists` function only merges chunks when their
`index` values match exactly, causing these logically related chunks to
remain separate, resulting in multiple incomplete tool calls instead of
one complete tool call.

```python
# Before fix: Results in 1 valid + 1 invalid tool call
chunk1 = AIMessageChunk(tool_call_chunks=[
    {"name": "search", "args": '{"query":', "id": "call_123", "index": 1}
])
chunk2 = AIMessageChunk(tool_call_chunks=[
    {"name": None, "args": ' "test"}', "id": None, "index": 0}  
])
merged = chunk1 + chunk2  # Creates 2 separate tool calls

# After fix: Results in 1 complete tool call
merged = chunk1 + chunk2  # Creates 1 merged tool call: search({"query": "test"})
```

## Solution

Enhanced the `merge_lists` function in `langchain_core/utils/_merge.py`
with intelligent tool call chunk merging:

1. **Preserves existing behavior**: Same-index chunks still merge as
before
2. **Adds special handling**: Tool call chunks with
`name=None`/`id=None` that don't match any existing index are now merged
with the most recent complete tool call chunk
3. **Maintains backward compatibility**: All existing functionality
works unchanged
4. **Targeted fix**: Only affects tool call chunks, doesn't change
behavior for other list items

The fix specifically handles the pattern where:
- A continuation chunk has `name=None` and `id=None` (indicating it's
part of an ongoing tool call)
- No matching index is found in existing chunks
- There exists a recent tool call chunk with a valid name or ID to merge
with

## Testing

Added comprehensive test coverage including:
-  Qwen3-style chunks with different indices now merge correctly
-  Existing same-index behavior preserved  
-  Multiple distinct tool calls remain separate
-  Edge cases handled (empty chunks, orphaned continuations)
-  Backward compatibility maintained

Fixes #31511.

<!-- START COPILOT CODING AGENT TIPS -->
---

💬 Share your feedback on Copilot coding agent for the chance to win a
$200 gift card! Click
[here](https://survey.alchemer.com/s3/8343779/Copilot-Coding-agent) to
start the survey.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-07-28 22:31:41 +00:00
Copilot
ad88e5aaec
fix(core): resolve cache validation error by safely converting Generation to ChatGeneration objects (#32156)
## Problem

ChatLiteLLM encounters a `ValidationError` when using cache on
subsequent calls, causing the following error:

```
ValidationError(model='ChatResult', errors=[{'loc': ('generations', 0, 'type'), 'msg': "unexpected value; permitted: 'ChatGeneration'", 'type': 'value_error.const', 'ctx': {'given': 'Generation', 'permitted': ('ChatGeneration',)}}])
```

This occurs because:
1. The cache stores `Generation` objects (with `type="Generation"`)
2. But `ChatResult` expects `ChatGeneration` objects (with
`type="ChatGeneration"` and a required `message` field)
3. When cached values are retrieved, validation fails due to the type
mismatch

## Solution

Added graceful handling in both sync (`_generate_with_cache`) and async
(`_agenerate_with_cache`) cache methods to:

1. **Detect** when cached values contain `Generation` objects instead of
expected `ChatGeneration` objects
2. **Convert** them to `ChatGeneration` objects by wrapping the text
content in an `AIMessage`
3. **Preserve** all original metadata (`generation_info`)
4. **Allow** `ChatResult` creation to succeed without validation errors

## Example

```python
# Before: This would fail with ValidationError
from langchain_community.chat_models import ChatLiteLLM
from langchain_community.cache import SQLiteCache
from langchain.globals import set_llm_cache

set_llm_cache(SQLiteCache(database_path="cache.db"))
llm = ChatLiteLLM(model_name="openai/gpt-4o", cache=True, temperature=0)

print(llm.predict("test"))  # Works fine (cache empty)
print(llm.predict("test"))  # Now works instead of ValidationError

# After: Seamlessly handles both Generation and ChatGeneration objects
```

## Changes

- **`libs/core/langchain_core/language_models/chat_models.py`**: 
  - Added `Generation` import from `langchain_core.outputs`
- Enhanced cache retrieval logic in `_generate_with_cache` and
`_agenerate_with_cache` methods
- Added conversion from `Generation` to `ChatGeneration` objects when
needed

-
**`libs/core/tests/unit_tests/language_models/chat_models/test_cache.py`**:
- Added test case to validate the conversion logic handles mixed object
types

## Impact

- **Backward Compatible**: Existing code continues to work unchanged
- **Minimal Change**: Only affects cache retrieval path, no API changes
- **Robust**: Handles both legacy cached `Generation` objects and new
`ChatGeneration` objects
- **Preserves Data**: All original content and metadata is maintained
during conversion

Fixes #22389.

<!-- START COPILOT CODING AGENT TIPS -->
---

💡 You can make Copilot smarter by setting up custom instructions,
customizing its development environment and configuring Model Context
Protocol (MCP) servers. Learn more [Copilot coding agent
tips](https://gh.io/copilot-coding-agent-tips) in the docs.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-28 22:28:16 +00:00
Mason Daugherty
b7e4797e8b
release(anthropic): 0.3.18 (#32292) 2025-07-28 17:07:11 -04:00
Mason Daugherty
3a487bf720
refactor(anthropic): AnthropicLLM to use Messages API (#32290)
re: #32189
2025-07-28 16:22:58 -04:00
Mason Daugherty
8db16b5633
fix: use new Google model names in examples (#32288) 2025-07-28 19:03:42 +00:00
Mason Daugherty
e79e0bd6b4
fix(openai): add max_retries parameter to ChatOpenAI for handling 503 capacity errors (#32286)
Some integration tests were failing
2025-07-28 13:58:23 -04:00
ccurme
c55294ecb0
chore(core): add test for nested pydantic fields in schemas (#32285) 2025-07-28 17:27:24 +00:00
Mason Daugherty
7a26c3d233
fix: update bar_model to use the correct model version claude-3-7-sonnet-20250219 (#32284) 2025-07-28 12:57:40 -04:00
Mason Daugherty
a07d2c5016
refactor: remove references to unsupported model claude-3-sonnet-20240229 (#32281)
Addresses some (but not all) test issues brought about in #32280
2025-07-28 11:57:43 -04:00
Aleksandr Filippov
f0b6baa0ef
fix(core): track within-batch deduplication in indexing num_skipped count (#32273)
**Description:** Fixes incorrect `num_skipped` count in the LangChain
indexing API. The current implementation only counts documents that
already exist in RecordManager (cross-batch duplicates) but fails to
count documents removed during within-batch deduplication via
`_deduplicate_in_order()`.

This PR adds tracking of the original batch size before deduplication
and includes the difference in `num_skipped`, ensuring that `num_added +
num_skipped` equals the total number of input documents.

**Issue:** Fixes incorrect document count reporting in indexing
statistics

**Dependencies:** None

Fixes #32272

---------

Co-authored-by: Alex Feel <afilippov@spotware.com>
2025-07-28 09:58:51 -04:00
Mason Daugherty
12c0e9b7d8
fix(docs): local API reference documentation build (#32271)
ensure all relevant packages are correctly processed - cli wasn't
included, also fix ValueError
2025-07-28 00:50:20 -04:00
Mason Daugherty
96cbd90cba
fix: formatting issues in docstrings (#32265)
Ensures proper reStructuredText formatting by adding the required blank
line before closing docstring quotes, which resolves the "Block quote
ends without a blank line; unexpected unindent" warning.
2025-07-27 23:37:47 -04:00
Mason Daugherty
c6cb1fae61
fix: devcontainer (#32260) 2025-07-27 20:24:16 -04:00
Christophe Bornet
efdfa00d10
chore(langchain): add ruff rules ARG (#32110)
See https://docs.astral.sh/ruff/rules/#flake8-unused-arguments-arg

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-07-26 18:32:34 -04:00
Christophe Bornet
a2ad5aca41
chore(langchain): add ruff rules TC (#31921)
See https://docs.astral.sh/ruff/rules/#flake8-type-checking-tc
2025-07-26 18:27:26 -04:00
Mason Daugherty
f624ad489a
feat(docs): improve devx, fix Makefile targets (#32237)
**TL;DR much of the provided `Makefile` targets were broken, and any
time I wanted to preview changes locally I either had to refer to a
command Chester gave me or try waiting on a Vercel preview deployment.
With this PR, everything should behave like normal.**

Significant updates to the `Makefile` and documentation files, focusing
on improving usability, adding clear messaging, and fixing/enhancing
documentation workflows.

### Updates to `Makefile`:

#### Enhanced build and cleaning processes:
- Added informative messages (e.g., "📚 Building LangChain
documentation...") to makefile targets like `docs_build`, `docs_clean`,
and `api_docs_build` for better user feedback during execution.
- Introduced a `clean-cache` target to the `docs` `Makefile` to clear
cached dependencies and ensure clean builds.

#### Improved dependency handling:
- Modified `install-py-deps` to create a `.venv/deps_installed` marker,
preventing redundant/duplicate dependency installations and improving
efficiency.

#### Streamlined file generation and infrastructure setup:
- Added caching for the LangServe README download and parallelized
feature table generation
- Added user-friendly completion messages for targets like `copy-infra`
and `render`.

#### Documentation server updates:
- Enhanced the `start` target with messages indicating server start and
URL for local documentation viewing.

---

### Documentation Improvements:

#### Content clarity and consistency:
- Standardized section titles for consistency across documentation
files.
[[1]](diffhunk://#diff-9b1a85ea8a9dcf79f58246c88692cd7a36316665d7e05a69141cfdc50794c82aL1-R1)
[[2]](diffhunk://#diff-944008ad3a79d8a312183618401fcfa71da0e69c75803eff09b779fc8e03183dL1-R1)
- Refined phrasing and formatting in sections like "Dependency
management" and "Formatting and linting" for better readability.
[[1]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L6-R6)
[[2]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L84-R82)

#### Enhanced workflows:
- Updated instructions for building and viewing documentation locally,
including tips for specifying server ports and handling API reference
previews.
[[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L60-R94)
[[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L82-R126)
- Expanded guidance on cleaning documentation artifacts and using
linting tools effectively.
[[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L82-R126)
[[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L107-R142)

#### API reference documentation:
- Improved instructions for generating and formatting in-code
documentation, highlighting best practices for docstring writing.
[[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L107-R142)
[[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L144-R186)

---

### Minor Changes:
- Added support for a new package name (`langchain_v1`) in the API
documentation generation script.
- Fixed minor capitalization and formatting issues in documentation
files.
[[1]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L40-R40)
[[2]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L166-R160)

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-25 14:49:03 -04:00