Commit Graph

14789 Commits

Author SHA1 Message Date
Sydney Runkle
6ba1177f4f Merge branch 'master' into sr/looser-api-for-summarization 2025-11-17 08:55:42 -05:00
Mason Daugherty
52b1516d44 style(langchain): fix some middleware ref syntax (#33988) 2025-11-16 00:33:17 -05:00
Mason Daugherty
8a3bb73c05 release(openai): 1.0.3 (#33981)
- Respect 300k token limit for embeddings API requests #33668
- fix create_agent / response_format for Responses API #33939
- fix `response.incomplete` event not being handled when using
`stream_mode=['messages']` #33871
langchain-openai==1.0.3
2025-11-14 19:18:50 -05:00
Mason Daugherty
099c042395 refactor(openai): embedding utils and calculations (#33982)
Now returns (`_iter`, `tokens`, `indices`, `token_counts`). The
`token_counts` are calculated directly during tokenization, which is
more accurate and efficient than splitting strings later.
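A minimal sketch of the idea (illustrative names, assuming `tiktoken`'s
`cl100k_base` encoding; not the actual diff):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def chunk_with_counts(texts: list[str], ctx_length: int = 8191):
    """Chunk texts by token budget, recording counts during tokenization."""
    tokens, indices, token_counts = [], [], []
    for i, text in enumerate(texts):
        ids = enc.encode(text)
        for start in range(0, len(ids), ctx_length):
            chunk = ids[start : start + ctx_length]
            tokens.append(chunk)
            indices.append(i)  # which input text this chunk came from
            token_counts.append(len(chunk))  # counted here, no re-splitting later
    return tokens, indices, token_counts
```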
2025-11-14 19:18:37 -05:00
Kaparthy Reddy
2d4f00a451 fix(openai): Respect 300k token limit for embeddings API requests (#33668)
## Description

Fixes #31227 - Resolves the issue where `OpenAIEmbeddings` exceeds
OpenAI's 300,000 token per request limit, causing 400 BadRequest errors.

## Problem

When embedding large document sets, LangChain would send batches
containing more than 300,000 tokens in a single API request, causing
this error:
```
openai.BadRequestError: Error code: 400 - {'error': {'message': 'Requested 673477 tokens, max 300000 tokens per request'}}
```

The issue occurred because:
- The code chunks texts by `embedding_ctx_length` (8191 tokens per
chunk)
- Then batches chunks by `chunk_size` (default 1000 chunks per request)
- **But didn't check**: total tokens per batch against OpenAI's 300k
limit
- Result: `1000 chunks × 8191 tokens = 8,191,000 tokens` → Exceeds
limit!

## Solution

This PR implements dynamic batching that respects the 300k token limit:

1. **Added constant**: `MAX_TOKENS_PER_REQUEST = 300000`
2. **Track token counts**: Calculate actual tokens for each chunk
3. **Dynamic batching**: Instead of fixed `chunk_size` batches,
accumulate chunks until approaching the 300k limit (sketched below)
4. **Applied to both sync and async**: Fixed both
`_get_len_safe_embeddings` and `_aget_len_safe_embeddings`
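
A minimal sketch of the batching logic under those constraints (names
mirror the PR description, but the body is illustrative rather than the
actual diff):

```python
MAX_TOKENS_PER_REQUEST = 300_000  # OpenAI's per-request embedding cap

def batch_by_token_budget(token_counts: list[int], chunk_size: int = 1000):
    """Yield (start, end) slices whose token totals stay under the cap."""
    start, running = 0, 0
    for i, count in enumerate(token_counts):
        over_budget = running + count > MAX_TOKENS_PER_REQUEST
        over_size = i - start >= chunk_size
        if i > start and (over_budget or over_size):
            yield start, i  # flush the current batch before adding this chunk
            start, running = i, 0
        running += count
    if start < len(token_counts):
        yield start, len(token_counts)
```

With 1000 chunks of 8191 tokens each, this yields batches of 36 chunks
(36 × 8191 = 294,876 ≤ 300,000) instead of one 8.2M-token request.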

## Changes

- Modified `langchain_openai/embeddings/base.py`:
  - Added `MAX_TOKENS_PER_REQUEST` constant
  - Replaced fixed-size batching with token-aware dynamic batching
  - Applied to both sync (line ~478) and async (line ~527) methods
- Added test in `tests/unit_tests/embeddings/test_base.py`:
  - `test_embeddings_respects_token_limit()` - Verifies large document
    sets are properly batched

## Testing

All existing tests pass (280 passed, 4 xfailed, 1 xpassed).

New test verifies:
- Large document sets (500 texts × 1000 tokens = 500k tokens) are split
into multiple API calls
- Each API call respects the 300k token limit

## Usage

After this fix, users can embed large document sets without errors:
```python
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_text_splitters import CharacterTextSplitter

# This will now work without exceeding token limits
embeddings = OpenAIEmbeddings()
documents = CharacterTextSplitter().split_documents(large_documents)
Chroma.from_documents(documents, embeddings)
```

Resolves #31227

---------

Co-authored-by: Kaparthy Reddy <kaparthyreddy@Kaparthys-MacBook-Air.local>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-11-14 18:12:07 -05:00
Sydney Runkle
9bd401a6d4 fix: resumable shell, works w/ interrupts (#33978)
fixes https://github.com/langchain-ai/langchain/issues/33684

Now able to run this minimal snippet successfully

```py
import os

from langchain.agents import create_agent
from langchain.agents.middleware import (
    HostExecutionPolicy,
    HumanInTheLoopMiddleware,
    ShellToolMiddleware,
)
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.types import Command


shell_middleware = ShellToolMiddleware(
    workspace_root=os.getcwd(),
    env=os.environ,  # danger
    execution_policy=HostExecutionPolicy()
)

hil_middleware = HumanInTheLoopMiddleware(interrupt_on={"shell": True})

checkpointer = InMemorySaver()

agent = create_agent(
    "openai:gpt-4.1-mini",
    middleware=[shell_middleware, hil_middleware],
    checkpointer=checkpointer,
)

input_message = {"role": "user", "content": "run `which python`"}

config = {"configurable": {"thread_id": "1"}}

result = agent.invoke(
    {"messages": [input_message]},
    config=config,
    durability="exit",
)
```
2025-11-14 15:32:25 -05:00
Sydney Runkle
62c05e09c1 readable tests 2025-11-14 14:34:24 -05:00
Sydney Runkle
83b9d9f810 parametrize 2025-11-14 14:31:28 -05:00
ccurme
6aa3794b74 feat(langchain): reference model profiles for provider strategy (#33974) 2025-11-14 19:24:18 +00:00
Sydney Runkle
6deee23d8d Revert "cleanup"
This reverts commit 690aabe8d4.
2025-11-14 14:14:32 -05:00
Sydney Runkle
690aabe8d4 cleanup 2025-11-14 14:12:14 -05:00
Sydney Runkle
80554df1e6 docs 2025-11-14 14:00:15 -05:00
Sydney Runkle
72c45e65e8 summarization edits 2025-11-14 13:56:26 -05:00
Sydney Runkle
189dcf7295 chore: increase coverage for shell, filesystem, and summarization middleware (#33928)
cc generated; just a start here, but wanted to bump things up from ~70%
2025-11-14 13:30:36 -05:00
Sydney Runkle
1bc88028e6 fix(anthropic): execute bash + file tools via tool node (#33960)
* use `override` instead of directly patching things on `ModelRequest`
* rely on `ToolNode` for execution of tools related to said middleware,
using `wrap_model_call` to inject the relevant claude tool specs +
allowing tool node to forward them along to corresponding langchain tool
implementations
* making the same change for the native shell tool middleware
* allowing shell tool middleware to specify a name for the shell tool
(negative diff then for claude bash middleware)


Long term, I think the solution might be to attach metadata to a tool
that maps the provider spec to a LangChain implementation; we could also
take some lessons from the MCP front here.
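
A rough sketch of the `wrap_model_call` shape described above (the spec
dict and class name are hypothetical; `ModelRequest.override` and
`ToolNode` routing are the pieces this commit actually relies on):

```python
from langchain.agents.middleware import AgentMiddleware

# Hypothetical provider-native spec; real Claude specs may differ.
CLAUDE_BASH_SPEC = {"type": "bash_20250124", "name": "bash"}

class NativeBashMiddleware(AgentMiddleware):
    def wrap_model_call(self, request, handler):
        # Advertise the provider spec via override (no direct patching of
        # ModelRequest); ToolNode later forwards the resulting tool call
        # to the matching LangChain tool implementation.
        return handler(request.override(tools=[*request.tools, CLAUDE_BASH_SPEC]))
```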
2025-11-14 13:17:01 -05:00
Mason Daugherty
d2942351ce release(core): 1.0.5 (#33973) langchain-core==1.0.5 2025-11-14 11:51:27 -05:00
Sydney Runkle
83c078f363 fix: adding missing async hooks (#33957)
* filling in missing async gaps
* using recommended tool runtime injection instead of injected state
(rough sketch below)
  * updating tests to use helper function as well
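
A rough sketch of the runtime-injection pattern (assuming langchain v1's
`ToolRuntime`; the tool itself is hypothetical):

```python
from langchain.tools import tool, ToolRuntime

@tool
def count_messages(runtime: ToolRuntime) -> str:
    """Report how many messages are on the current thread."""
    # runtime is injected by the framework, rather than threading
    # injected state through the tool signature.
    return f"{len(runtime.state['messages'])} messages so far"
```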
2025-11-14 09:13:39 -05:00
ZhangShenao
26d39ffc4a docs: Fix doc links (#33964) 2025-11-14 09:07:32 -05:00
Mason Daugherty
421e2ceeee fix(core): don't mask exceptions (#33959) 2025-11-14 09:05:29 -05:00
Mason Daugherty
275dcbf69f docs(core): add clarity to base token counting methods (#33958)
It wasn't immediately obvious that `get_num_tokens_from_messages` adds
extra prefixes to represent user roles in the conversation, which adds
to the overall token count.

```python
from langchain_google_genai import GoogleGenerativeAI

llm = GoogleGenerativeAI(model="gemini-2.5-flash")
num_tokens = llm.get_num_tokens("Hello, world!")
print(f"Number of tokens: {num_tokens}")
# Number of tokens: 4
```

```python
from langchain.messages import HumanMessage

messages = [HumanMessage(content="Hello, world!")]

num_tokens = llm.get_num_tokens_from_messages(messages)
print(f"Number of tokens: {num_tokens}")
# Number of tokens: 6
```
2025-11-13 17:15:47 -05:00
Sydney Runkle
9f87b27a5b fix: add filesystem middleware in init (#33955) 2025-11-13 15:07:33 -05:00
Mason Daugherty
b2e1196e29 chore(core,infra): nits (#33954) 2025-11-13 14:50:54 -05:00
Sydney Runkle
2dc1396380 chore(langchain): update deps (#33951) 2025-11-13 14:21:25 -05:00
Mason Daugherty
77941ab3ce feat(infra): add automatic issue labeling (#33952) 2025-11-13 14:13:52 -05:00
Mason Daugherty
ee19a30dde fix(groq): bump min ver for core dep (#33949)
Due to an issue with unit tests and the docs URL for exceptions
langchain-groq==1.0.1
2025-11-13 11:46:54 -05:00
Mason Daugherty
5d799b3174 release(nomic): 1.0.1 (#33948)
support Python 3.14 #33655
langchain-nomic==1.0.1
2025-11-13 11:25:39 -05:00
Mason Daugherty
8f33a985a2 release(groq): 1.0.1 (#33947)
- fix: handle tool calls with no args #33896
- add prompt caching token usage details #33708
2025-11-13 11:25:00 -05:00
Mason Daugherty
78eeccef0e release(deepseek): 1.0.1 (#33946)
- support strict beta structured output #32727
langchain-deepseek==1.0.1
2025-11-13 11:24:39 -05:00
ccurme
3d415441e8 fix(langchain, openai): backward compat for response_format (#33945) 2025-11-13 11:11:35 -05:00
ccurme
74385e0ebd fix(langchain, openai): fix create_agent / response_format for Responses API (#33939) 2025-11-13 10:18:15 -05:00
Christophe Bornet
2bfbc29ccc chore(core): fix some ruff TC rules (#33929)
Fix some ruff TC rules, but still don't enforce them, since Pydantic
model fields use type annotations at runtime.
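
A minimal repro of the constraint (hypothetical model; the TC rules want
typing-only imports moved behind `TYPE_CHECKING`, which breaks models
that resolve annotations at runtime):

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from decimal import Decimal  # where a TC rule wants this import...

from pydantic import BaseModel

class Price(BaseModel):
    # ...but Pydantic evaluates the annotation at runtime, so with the
    # import gated behind TYPE_CHECKING this raises an
    # undefined-annotation error at class definition.
    amount: "Decimal"
```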
2025-11-12 14:07:19 -05:00
Christophe Bornet
ef79c26f18 chore(cli,standard-tests,text-splitters): fix some ruff TC rules (#33934)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-11-12 14:06:31 -05:00
ccurme
fbe32c8e89 release(anthropic): 1.0.3 (#33935) langchain-anthropic==1.0.3 2025-11-12 10:55:28 -05:00
Mohammad Mohtashim
2511c28f92 feat(anthropic): support code_execution_20250825 (#33925) 2025-11-12 10:44:51 -05:00
Sydney Runkle
637bb1cbbc feat: refactor tests coverage (#33927)
Middleware tests have gotten quite unwieldy; this is a major
restructuring that sets the stage for a coverage increase.

This is super hard to review -- as proof that we've retained important
tests, I ran coverage on `master` and this branch and confirmed
identical coverage.

* moving all middleware related tests to `agents/middleware` folder
* consolidating related test files
* adding coverage utility to makefile
2025-11-11 10:40:12 -05:00
Mason Daugherty
3dfea96ec1 chore: update README.md files (#33919) 2025-11-10 22:51:35 -05:00
ccurme
68643153e5 feat(langchain): support async summarization in SummarizationMiddleware (#33918) 2025-11-10 15:48:51 -05:00
Abbas Syed
462762f75b test(core): add comprehensive tests for groq block translator (#33906) 2025-11-10 15:45:36 -05:00
ccurme
4f3729c004 release(model-profiles): 0.0.4 (#33917) langchain-model-profiles==0.0.4 2025-11-10 12:06:32 -05:00
Mason Daugherty
ba428cdf54 chore(infra): add note to pr linting workflow (#33916) 2025-11-10 11:49:31 -05:00
Mason Daugherty
69c7d1b01b test(groq,openai): add retries for flaky tests (#33914) 2025-11-10 10:36:11 -05:00
Mason Daugherty
733299ec13 revert(core): "applied secrets_map in load to plain string values" (#33913)
Reverts langchain-ai/langchain#33678

Breaking API change
2025-11-10 10:29:30 -05:00
ccurme
e1adf781c6 feat(langchain): (SummarizationMiddleware) support use of model context windows when triggering summarization (#33825) 2025-11-10 10:08:52 -05:00
Shahroz Ahmad
31b5e4810c feat(deepseek): support strict beta structured output (#32727)
**Description:** This PR adds support for DeepSeek's beta strict mode
feature for structured
outputs and tool calling. It overrides `bind_tools()` and
`with_structured_output()` to automatically use
DeepSeek's beta endpoint (https://api.deepseek.com/beta) when
`strict=True`. Both methods need overriding because they're independent
entry points and a user can call either directly. When DeepSeek's strict
mode graduates from beta, we can just remove both overridden methods. You
can read more about the beta feature here:
https://api-docs.deepseek.com/guides/function_calling#strict-mode-beta
  
**Issue:** Implements #32670 


**Dependencies:** None


**Sample Code**

```python
from langchain_deepseek import ChatDeepSeek
from pydantic import BaseModel, Field
from typing import Optional
import os


# Enter your DeepSeek API Key here
API_KEY = "YOUR_API_KEY"


# location, temperature, condition are required fields
# humidity is an optional field with a default value
class WeatherInfo(BaseModel):
    location: str = Field(description="City name")
    temperature: int = Field(description="Temperature in Celsius")
    condition: str = Field(description="Weather condition (sunny, cloudy, rainy)")
    humidity: Optional[int] = Field(default=None, description="Humidity percentage")


llm = ChatDeepSeek(
    model="deepseek-chat",
    api_key=API_KEY,
)

# just to confirm that a new instance will use the default base url (instead of beta)
print(f"Default API base: {llm.api_base}")



# Test 1: bind_tools with strict=True should list all the tool calls
print("\nTest 1: bind_tools with strict=True")
llm_with_tools = llm.bind_tools([WeatherInfo], strict=True)
response = llm_with_tools.invoke("Tell me the weather in New York. It's 22 degrees, sunny.")
print(response.tool_calls)



# Test 2: with_structured_output with strict=True
print("\nTest 2: with_structured_output with strict=True")
structured_llm = llm.with_structured_output(WeatherInfo, strict=True)
result = structured_llm.invoke("Tell me the weather in New York.")
print(f"  Result: {result}")
assert isinstance(result, WeatherInfo), "Result should be a WeatherInfo instance"
```

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-11-09 22:24:33 -05:00
Mason Daugherty
c6801fe159 chore: fix URL underlining in README.md (#33905) 2025-11-09 22:22:56 -05:00
AmazingcatAndrew
1b563067f8 fix(chroma): resolve OpenCLIP + Chroma image embedding test regression (#33899)
**Description:**  
Fixes the OpenCLIP × Chroma regression that caused nested embedding
errors when adding or searching image data.
The test case `test_openclip_chroma_embed_no_nesting_error` has been
restored and verified to work correctly with the current LangChain core
dependencies.
Functional validation confirms that `similarity_search_by_image` now
returns correct, metadata-preserving results.

**Issue:**  
Fixes #33851

**Dependencies:**  
No new dependencies introduced.  

**Testing:**  
All tests under  
```bash
uv run --group test pytest tests/unit_tests
```  
result:
```
30 passed in 91.26s (0:01:31)
```
have passed successfully using Python 3.13.9 and a uv-managed environment.
This confirms that the regression has been fixed.  

Running  
```bash
make test
```  
still produces a cleanup-time `AttributeError: 'ProactorEventLoop' object
has no attribute '_ssock'` on Windows (Python 3.13+).
This is a benign asyncio teardown message rather than a functional
failure.
`uv run pytest` closes event loops immediately after tests, while `make
test` invokes pytest through a secondary process layer that leaves a
background loop alive at interpreter shutdown.
This difference in teardown behavior explains the extra messages seen
only when using `make test`.

**Summary:**  
- Verified the OpenCLIP + Chroma image pipeline works correctly.  
- `uv run --group test pytest` fully passes; the fix is complete.  
- The residual `_ssock` warnings occur only during
Windows asyncio cleanup and are not related to this code change.

This is my first time contributing code; please contact me with any
questions.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-11-09 21:24:33 -05:00
Mason Daugherty
1996d81d72 chore(langchain): pass on reference docstrings (middleware) (#33904) 2025-11-09 21:18:28 -05:00
Mason Daugherty
ab0677c6f1 fix(groq): handle tool calls with no args (#33896)
When Groq returns tool calls with no arguments, it sends `arguments:
'null'` (JSON null). LangChain's core parsing decodes that null to
Python `None`, which fails the `isinstance(args_, dict)` check and
incorrectly marks the tool call as invalid.

Related to #32017
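
An illustrative normalization (not the actual patch) of what the fix
guards against:

```python
import json

def parse_tool_args(raw_arguments: str | None) -> dict:
    # Groq may send arguments='null' for no-arg tool calls; json.loads
    # turns that into Python None, which would fail the
    # isinstance(args_, dict) check. Treat it as an empty dict instead.
    args_ = json.loads(raw_arguments or "{}")
    return args_ if isinstance(args_, dict) else {}
```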
2025-11-08 22:30:44 -05:00
artreimus
bdb53c93cc docs(langchain): correct IBM provider link in chat_models docstring (#33897)

**Description**
Fix broken link in the `chat_models` docstring. The **ibm** bullet
incorrectly linked to the DeepSeek provider page; update it to the
canonical IBM provider docs.

This only affects generated API reference content on
`reference.langchain.com`. No runtime behavior changes.

**Issue**
N/A (documentation-only).

**Dependencies**
None.

**Testing & quality**

* Ran `make format`, `make lint`, and `make test` in the package (no
code changes expected to affect tests).
2025-11-08 07:02:33 -06:00
Alazar Genene
94d5271cb5 fix(standard-tests): fix semantic typo in if statement (#33890) langchain==1.0.5 2025-11-07 18:01:59 -05:00