langchain

mirror of https://github.com/hwchase17/langchain.git synced 2026-03-18 11:07:36 +00:00

Author	SHA1	Message	Date
Kaparthy Reddy	2d4f00a451	fix(openai): Respect 300k token limit for embeddings API requests (#33668)

## Description

Fixes #31227 - Resolves the issue where `OpenAIEmbeddings` exceeds OpenAI's 300,000 token per request limit, causing 400 BadRequest errors.

## Problem

When embedding large document sets, LangChain would send batches containing more than 300,000 tokens in a single API request, causing this error:

```
openai.BadRequestError: Error code: 400 - {'error': {'message': 'Requested 673477 tokens, max 300000 tokens per request'}}
```

The issue occurred because:

- The code chunks texts by `embedding_ctx_length` (8191 tokens per chunk)
- It then batches chunks by `chunk_size` (default 1000 chunks per request)
- It did not check the total tokens per batch against OpenAI's 300k limit
- Result: in the worst case, `1000 chunks × 8191 tokens = 8,191,000 tokens`, which exceeds the limit

## Solution

This PR implements dynamic batching that respects the 300k token limit:

1. Added constant: `MAX_TOKENS_PER_REQUEST = 300000`
2. Track token counts: calculate the actual tokens for each chunk
3. Dynamic batching: instead of fixed `chunk_size` batches, accumulate chunks until approaching the 300k limit
4. Applied to both sync and async: fixed both `_get_len_safe_embeddings` and `_aget_len_safe_embeddings`

## Changes

- Modified `langchain_openai/embeddings/base.py`:
  - Added `MAX_TOKENS_PER_REQUEST` constant
  - Replaced fixed-size batching with token-aware dynamic batching
  - Applied to both sync (line ~478) and async (line ~527) methods
- Added test in `tests/unit_tests/embeddings/test_base.py`:
  - `test_embeddings_respects_token_limit()` - verifies large document sets are properly batched

## Testing

All existing tests pass (280 passed, 4 xfailed, 1 xpassed).

New test verifies:

- Large document sets (500 texts × 1000 tokens = 500k tokens) are split into multiple API calls
- Each API call respects the 300k token limit

## Usage

After this fix, users can embed large document sets without errors:

```python
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_text_splitters import CharacterTextSplitter

# This will now work without exceeding token limits
embeddings = OpenAIEmbeddings()
documents = CharacterTextSplitter().split_documents(large_documents)
Chroma.from_documents(documents, embeddings)
```

Resolves #31227

---------

Co-authored-by: Kaparthy Reddy <kaparthyreddy@Kaparthys-MacBook-Air.local>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-11-14 18:12:07 -05:00
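The token-aware batching described in commit 2d4f00a451 can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in `langchain_openai`: the helper name `batch_by_token_budget` and its parameters are hypothetical, and token counts are passed in directly rather than computed with a tokenizer.

```python
from typing import List

MAX_TOKENS_PER_REQUEST = 300_000  # limit quoted in the commit message


def batch_by_token_budget(
    token_counts: List[int],
    max_tokens: int = MAX_TOKENS_PER_REQUEST,
    max_items: int = 1000,  # mirrors the default `chunk_size` mentioned above
) -> List[List[int]]:
    """Group chunk indices so no batch exceeds max_tokens or max_items.

    Hypothetical helper: a single chunk larger than max_tokens still gets
    a batch of its own, since it cannot be split here.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    current_tokens = 0
    for index, count in enumerate(token_counts):
        # Close the current batch if adding this chunk would break a limit.
        over_tokens = current_tokens + count > max_tokens
        over_items = len(current) >= max_items
        if current and (over_tokens or over_items):
            batches.append(current)
            current, current_tokens = [], 0
        current.append(index)
        current_tokens += count
    if current:
        batches.append(current)
    return batches


# Scenario from the PR's test description: 500 texts of 1000 tokens each.
token_counts = [1000] * 500
batches = batch_by_token_budget(token_counts)
batch_sizes = [len(batch) for batch in batches]
```

Each inner list holds the indices of the chunks that would go into one API request, so results can be mapped back to the original chunk order.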
Sydney Runkle	9bd401a6d4	fix: resumable shell, works w/ interrupts (#33978)

Fixes https://github.com/langchain-ai/langchain/issues/33684

This minimal snippet now runs successfully:

```py
import os

from langchain.agents import create_agent
from langchain.agents.middleware import (
    HostExecutionPolicy,
    HumanInTheLoopMiddleware,
    ShellToolMiddleware,
)
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.types import Command

shell_middleware = ShellToolMiddleware(
    workspace_root=os.getcwd(),
    env=os.environ,  # danger
    execution_policy=HostExecutionPolicy()
)
hil_middleware = HumanInTheLoopMiddleware(interrupt_on={"shell": True})
checkpointer = InMemorySaver()

agent = create_agent(
    "openai:gpt-4.1-mini",
    middleware=[shell_middleware, hil_middleware],
    checkpointer=checkpointer,
)

input_message = {"role": "user", "content": "run `which python`"}
config = {"configurable": {"thread_id": "1"}}

result = agent.invoke(
    {"messages": [input_message]},
    config=config,
    durability="exit",
)
```	2025-11-14 15:32:25 -05:00
ccurme	6aa3794b74	feat(langchain): reference model profiles for provider strategy (#33974 )	2025-11-14 19:24:18 +00:00
Sydney Runkle	189dcf7295	chore: increase coverage for shell, filesystem, and summarization middleware (#33928 ) cc generated, just a start here but wanted to bump things up from 70% ish	2025-11-14 13:30:36 -05:00
Sydney Runkle	1bc88028e6	fix(anthropic): execute bash + file tools via tool node (#33960)

* use `override` instead of directly patching things on `ModelRequest`
* rely on `ToolNode` to execute the tools related to this middleware, using `wrap_model_call` to inject the relevant Claude tool specs and letting the tool node forward them to the corresponding LangChain tool implementations
* make the same change for the native shell tool middleware
* allow the shell tool middleware to specify a name for the shell tool (a negative diff, then, for the Claude bash middleware)

Long term, I think the solution might be to attach metadata to a tool that maps the provider spec to a LangChain implementation, an area where we could also take some lessons from the MCP front.	2025-11-14 13:17:01 -05:00
Mason Daugherty	d2942351ce	release(core): 1.0.5 (#33973 )	2025-11-14 11:51:27 -05:00
Sydney Runkle	83c078f363	fix: adding missing async hooks (#33957)

* filling in missing async gaps
* using recommended tool runtime injection instead of injected state
* updating tests to use helper function as well	2025-11-14 09:13:39 -05:00
ZhangShenao	26d39ffc4a	docs: Fix doc links (#33964 )	2025-11-14 09:07:32 -05:00
Mason Daugherty	421e2ceeee	fix(core): don't mask exceptions (#33959 )	2025-11-14 09:05:29 -05:00
Mason Daugherty	275dcbf69f	docs(core): add clarity to base token counting methods (#33958)

It wasn't immediately obvious that `get_num_tokens_from_messages` adds additional prefixes to represent user roles in the conversation, which increases the overall token count.

```python
from langchain_google_genai import GoogleGenerativeAI

llm = GoogleGenerativeAI(model="gemini-2.5-flash")
num_tokens = llm.get_num_tokens("Hello, world!")
print(f"Number of tokens: {num_tokens}")
# Number of tokens: 4
```

```python
from langchain.messages import HumanMessage

messages = [HumanMessage(content="Hello, world!")]

num_tokens = llm.get_num_tokens_from_messages(messages)
print(f"Number of tokens: {num_tokens}")
# Number of tokens: 6
```	2025-11-13 17:15:47 -05:00
Sydney Runkle	9f87b27a5b	fix: add filesystem middleware in init (#33955 )	2025-11-13 15:07:33 -05:00
Mason Daugherty	b2e1196e29	chore(core,infra): nits (#33954 )	2025-11-13 14:50:54 -05:00
Sydney Runkle	2dc1396380	chore(langchain): update deps (#33951 )	2025-11-13 14:21:25 -05:00
Mason Daugherty	ee19a30dde	fix(groq): bump min ver for `core` dep (#33949 ) Due to issue with unit tests and docs URL for exceptions	2025-11-13 11:46:54 -05:00
Mason Daugherty	5d799b3174	release(nomic): 1.0.1 (#33948 ) support Python 3.14 #33655	2025-11-13 11:25:39 -05:00
Mason Daugherty	8f33a985a2	release(groq): 1.0.1 (#33947 ) - fix: handle tool calls with no args #33896 - add prompt caching token usage details #33708	2025-11-13 11:25:00 -05:00
Mason Daugherty	78eeccef0e	release(deepseek): 1.0.1 (#33946 ) - support strict beta structured output #32727	2025-11-13 11:24:39 -05:00
ccurme	3d415441e8	fix(langchain, openai): backward compat for response_format (#33945 )	2025-11-13 11:11:35 -05:00
ccurme	74385e0ebd	fix(langchain, openai): fix create_agent / response_format for Responses API (#33939 )	2025-11-13 10:18:15 -05:00
Christophe Bornet	2bfbc29ccc	chore(core): fix some ruff TC rules (#33929 ) fix some ruff TC rules but still don't enforce them as Pydantic model fields use type annotations at runtime.	2025-11-12 14:07:19 -05:00
Christophe Bornet	ef79c26f18	chore(cli,standard-tests,text-splitters): fix some ruff TC rules (#33934 ) Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-11-12 14:06:31 -05:00
ccurme	fbe32c8e89	release(anthropic): 1.0.3 (#33935 )	2025-11-12 10:55:28 -05:00
Mohammad Mohtashim	2511c28f92	feat(anthropic): support code_execution_20250825 (#33925 )	2025-11-12 10:44:51 -05:00
Sydney Runkle	637bb1cbbc	feat: refactor tests coverage (#33927)

The middleware tests have gotten quite unwieldy. This is a major restructuring that sets the stage for a coverage increase.

This is super hard to review. As proof that we've retained the important tests, I ran coverage on `master` and on this branch and confirmed identical coverage.

* moving all middleware related tests to `agents/middleware` folder
* consolidating related test files
* adding coverage utility to makefile	2025-11-11 10:40:12 -05:00
Mason Daugherty	3dfea96ec1	chore: update `README.md` files (#33919 )	2025-11-10 22:51:35 -05:00
ccurme	68643153e5	feat(langchain): support async summarization in SummarizationMiddleware (#33918 )	2025-11-10 15:48:51 -05:00
Abbas Syed	462762f75b	test(core): add comprehensive tests for groq block translator (#33906 )	2025-11-10 15:45:36 -05:00
ccurme	4f3729c004	release(model-profiles): 0.0.4 (#33917 )	2025-11-10 12:06:32 -05:00
Mason Daugherty	69c7d1b01b	test(groq,openai): add retries for flaky tests (#33914 )	2025-11-10 10:36:11 -05:00
Mason Daugherty	733299ec13	revert(core): "applied `secrets_map` in `load` to plain string values" (#33913 ) Reverts langchain-ai/langchain#33678 Breaking API change	2025-11-10 10:29:30 -05:00
ccurme	e1adf781c6	feat(langchain): (SummarizationMiddleware) support use of model context windows when triggering summarization (#33825 )	2025-11-10 10:08:52 -05:00
Shahroz Ahmad	31b5e4810c	feat(deepseek): support `strict` beta structured output (#32727)

Description: This PR adds support for DeepSeek's beta strict mode feature for structured outputs and tool calling. It overrides `bind_tools()` and `with_structured_output()` to automatically use DeepSeek's beta endpoint (https://api.deepseek.com/beta) when `strict=True`. Both methods need overriding because they are independent entry points and a user can call either directly. When DeepSeek's strict mode graduates from beta, we can simply remove both overridden methods.

You can read more about the beta feature here: https://api-docs.deepseek.com/guides/function_calling#strict-mode-beta

Issue: Implements #32670

Dependencies: None

Sample Code

```python
from langchain_deepseek import ChatDeepSeek
from pydantic import BaseModel, Field
from typing import Optional
import os

# Enter your DeepSeek API Key here
API_KEY = "YOUR_API_KEY"


# location, temperature, condition are required fields
# humidity is an optional field with a default value
class WeatherInfo(BaseModel):
    location: str = Field(description="City name")
    temperature: int = Field(description="Temperature in Celsius")
    condition: str = Field(description="Weather condition (sunny, cloudy, rainy)")
    humidity: Optional[int] = Field(default=None, description="Humidity percentage")


llm = ChatDeepSeek(
    model="deepseek-chat",
    api_key=API_KEY,
)

# just to confirm that a new instance will use the default base url (instead of beta)
print(f"Default API base: {llm.api_base}")

# Test 1: bind_tools with strict=True should list all the tool calls
print("\nTest 1: bind_tools with strict=True")
llm_with_tools = llm.bind_tools([WeatherInfo], strict=True)
response = llm_with_tools.invoke("Tell me the weather in New York. It's 22 degrees, sunny.")
print(response.tool_calls)

# Test 2: with_structured_output with strict=True
print("\nTest 2: with_structured_output with strict=True")
structured_llm = llm.with_structured_output(WeatherInfo, strict=True)
result = structured_llm.invoke("Tell me the weather in New York.")
print(f"  Result: {result}")
assert isinstance(result, WeatherInfo), "Result should be a WeatherInfo instance"
```

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-11-09 22:24:33 -05:00
AmazingcatAndrew	1b563067f8	fix(chroma): resolve OpenCLIP + Chroma image embedding test regression (#33899)

Description: Fixes the OpenCLIP × Chroma regression that caused nested embedding errors when adding or searching image data. The test case `test_openclip_chroma_embed_no_nesting_error` has been restored and verified to work correctly with the current LangChain core dependencies. Functional validation confirms that `similarity_search_by_image` now returns correct, metadata-preserving results.

Issue: Fixes #33851

Dependencies: No new dependencies introduced.

Testing: All tests run with

```bash
uv run --group test pytest tests/unit_tests
```

pass on Python 3.13.9 in a uv-managed environment:

```
30 passed in 91.26s (0:01:31)
```

This confirms that the regression has been fixed.

Running

```bash
make test
```

still produces a cleanup-time `AttributeError: 'ProactorEventLoop' object has no attribute '_ssock'` on Windows (Python 3.13+). This is a benign asyncio teardown message rather than a functional failure. `uv run pytest` closes event loops immediately after tests, while `make test` invokes pytest through a secondary process layer that leaves a background loop alive at interpreter shutdown. This difference in teardown behavior explains the extra messages seen only when using `make test`.

Summary:

- Verified the OpenCLIP + Chroma image pipeline works correctly.
- `uv run --group test pytest` fully passes; the fix is complete.
- The residual `_ssock` warnings occur only during Windows asyncio cleanup and are not related to this code change.

This is my first time contributing code; please contact me with any questions.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-11-09 21:24:33 -05:00
Mason Daugherty	1996d81d72	chore(langchain): pass on reference docstrings (middleware) (#33904 )	2025-11-09 21:18:28 -05:00
Mason Daugherty	ab0677c6f1	fix(groq): handle tool calls with no args (#33896) When Groq returns a tool call with no arguments, it sends `arguments: 'null'` (JSON null). LangChain's core parsing expects a dict, but the JSON null is converted to Python `None`, which fails the `isinstance(args_, dict)` check, so the tool call is incorrectly marked as invalid. Related to #32017	2025-11-08 22:30:44 -05:00
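The failure mode described in commit ab0677c6f1 can be reproduced with the standard library alone. The sketch below is a hypothetical helper, not LangChain's parser: `parse_tool_args` and its fallback to an empty dict are assumptions chosen to illustrate one way of treating JSON `null` as "no arguments".

```python
import json
from typing import Any, Dict, Optional


def parse_tool_args(raw_arguments: Optional[str]) -> Dict[str, Any]:
    """Parse a tool call's JSON arguments, treating null or empty as no args.

    Hypothetical helper for illustration only.
    """
    if raw_arguments is None or raw_arguments.strip() == "":
        return {}
    parsed = json.loads(raw_arguments)
    if parsed is None:
        # JSON null decodes to Python None, which would fail an
        # isinstance(..., dict) check; normalise it to an empty dict.
        return {}
    if not isinstance(parsed, dict):
        raise ValueError(f"expected a JSON object, got {type(parsed).__name__}")
    return parsed


no_args = parse_tool_args("null")
with_args = parse_tool_args('{"city": "Paris"}')
```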
artreimus	bdb53c93cc	docs(langchain): correct IBM provider link in `chat_models` docstring (#33897)

Description: Fix a broken link in the `chat_models` docstring. The `ibm` bullet incorrectly linked to the DeepSeek provider page; update it to the canonical IBM provider docs. This only affects generated API reference content on `reference.langchain.com`. No runtime behavior changes.

Issue: N/A (documentation-only).

Dependencies: None.

Testing & quality:

* Ran `make format`, `make lint`, and `make test` in the package (no code changes expected to affect tests).	2025-11-08 07:02:33 -06:00
ccurme	e499db4266	release(langchain): 1.0.5 (#33893 )	2025-11-07 17:54:43 -05:00
npage902	cc3af82b47	fix(core): applied `secrets_map` in `load` to plain string values (#33678)

Replaces #33618

Description: Fixes the bug in the `load()` function where secret placeholders in plain dicts were not replaced, even if they match a key in `secrets_map`, and adds a test case.

Example:

```py
obj = {"api_key": "__SECRET_API_KEY__"}
secret_key = "secret_key_1234"
secrets_map = {"__SECRET_API_KEY__": secret_key}
result = load(obj, secrets_map=secrets_map)
```

Before this change, printing `api_key` in `result` would output `"__SECRET_API_KEY__"`. Now, it will properly output `"secret_key_1234"`.

Issue: Fixes #31804

Dependencies: None

`make format`, `make lint`, and `make test` have all passed on my machine.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-11-07 17:14:13 -05:00
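The placeholder substitution that commit cc3af82b47 describes (later reverted in 733299ec13 as a breaking API change) can be illustrated without LangChain. The function below is a hypothetical stand-in, not the `load()` implementation; it assumes a placeholder is replaced only when a string value exactly matches a key in `secrets_map`.

```python
from typing import Any, Dict


def replace_secrets(obj: Any, secrets_map: Dict[str, str]) -> Any:
    """Recursively swap string values that exactly match a secrets_map key.

    Hypothetical helper for illustration; dict keys are left untouched.
    """
    if isinstance(obj, str):
        return secrets_map.get(obj, obj)
    if isinstance(obj, dict):
        return {key: replace_secrets(value, secrets_map) for key, value in obj.items()}
    if isinstance(obj, list):
        return [replace_secrets(value, secrets_map) for value in obj]
    return obj


obj = {"api_key": "__SECRET_API_KEY__", "nested": ["__SECRET_API_KEY__", "plain"]}
secrets_map = {"__SECRET_API_KEY__": "secret_key_1234"}
result = replace_secrets(obj, secrets_map)
```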
Mshari	9383b78be1	feat(groq): add prompt caching token usage details (#33708)

Description: Adds support for prompt caching usage metadata in ChatGroq. The integration now captures cached token information from the Groq API response and includes it in the `input_token_details` field of the `usage_metadata`.

Changes:

- Created new `_create_usage_metadata()` helper function to centralize usage metadata creation logic
- Extracts `cached_tokens` from `prompt_tokens_details` in API responses and maps it to `input_token_details.cache_read`
- Integrated the helper function in both streaming (`_convert_chunk_to_message_chunk`) and non-streaming (`_create_chat_result`) code paths
- Added comprehensive unit tests to verify caching metadata handling and backward compatibility

This enables users to monitor prompt caching effectiveness when using Groq models with prompt caching enabled.

Issue: N/A

Dependencies: None

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-11-07 17:05:22 -05:00
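The mapping described in commit 9383b78be1 (`cached_tokens` under `prompt_tokens_details` becoming `input_token_details.cache_read`) can be sketched with plain dictionaries. This is not the `_create_usage_metadata()` helper from `langchain_groq`: the function below is hypothetical, and the `prompt_tokens`, `completion_tokens`, and `total_tokens` field names are assumptions about the response payload.

```python
from typing import Any, Dict


def create_usage_metadata(token_usage: Dict[str, Any]) -> Dict[str, Any]:
    """Build a usage-metadata dict from a raw token-usage payload.

    Hypothetical sketch; only the cached_tokens -> cache_read mapping is
    taken from the commit message.
    """
    input_tokens = token_usage.get("prompt_tokens", 0)
    output_tokens = token_usage.get("completion_tokens", 0)
    usage: Dict[str, Any] = {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": token_usage.get("total_tokens", input_tokens + output_tokens),
    }
    details = token_usage.get("prompt_tokens_details") or {}
    cached = details.get("cached_tokens")
    if cached is not None:
        # Only attach the details field when the API reported cached tokens,
        # so responses without caching info keep their previous shape.
        usage["input_token_details"] = {"cache_read": cached}
    return usage


with_cache = create_usage_metadata(
    {
        "prompt_tokens": 120,
        "completion_tokens": 30,
        "total_tokens": 150,
        "prompt_tokens_details": {"cached_tokens": 100},
    }
)
without_cache = create_usage_metadata({"prompt_tokens": 10, "completion_tokens": 5})
```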
ccurme	3c492571ab	release(anthropic): 1.0.2 (#33888 )	2025-11-07 16:47:25 -05:00
ccurme	f2410f7ea7	revert: Support for SystemMessage in create_agent (#33889 ) Reverts langchain-ai/langchain#33640 Introduces lint errors into langchain-anthropic Should incorporate into 1.1 instead of patch release.	2025-11-07 16:44:11 -05:00
ccurme	b1dd448233	release(core): 1.0.4 (#33886 )	2025-11-07 16:26:44 -05:00
dy93	904daf6f40	feat(core): support draw subgraph using pygraphviz (#32966 ) The `draw_png()` method currently does not support drawing subgraphs. This PR adds the ability to render subgraph outlines, improving visualization clarity when working with nested structures.	2025-11-07 15:58:35 -05:00
Mohammad Mohtashim	8e31a5d7bd	fix(core): Fix tool name check in name_dict for PydanticToolsParser (#33479)

- Description: When a user defines `model_config` in a `BaseModel`, the `{"type": <tool_name>}` value is derived from the title specified in `model_config` when the results are parsed [here](https://vscode.dev/github/keenborder786/langchain/blob/fix/tool_name_dict/libs/core/langchain_core/output_parsers/openai_tools.py#L199). However, [tool.__name__](https://vscode.dev/github/keenborder786/langchain/blob/fix/tool_name_dict/libs/core/langchain_core/output_parsers/openai_tools.py#L331) uses the class name (in uppercase) of the `BaseModel`, resulting in a `KeyError` when a custom title is provided in `model_config`. The best solution is to use the title from `model_config` when one is provided, since that is what `type` will be parsed to, and to fall back to `tool.__name__` otherwise. This must apply only to Pydantic V2.
- Issue: #27260

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-11-07 15:39:47 -05:00
Sydney Runkle	ee630b4539	fix: bump up default recursion limit (#33881 ) Fixes https://github.com/langchain-ai/langchain/issues/33740 We don't want to depend on recursion limit here, model call limit middleware is more appropriate	2025-11-07 13:49:12 -06:00
Jacob Lee	46971447df	fix(core): Filter empty content blocks from formatted prompts (#32519 ) Co-authored-by: Mason Daugherty <mason@langchain.dev> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-11-07 14:39:25 -05:00
Azibek	d8b94007c1	fix(huggingface): pass llm params to `ChatHuggingFace` (#32368)

This PR fixes #32234 and improves HuggingFace chat model integration by:

- Ensuring `ChatHuggingFace` inherits key parameters (temperature, max_tokens, top_p, streaming, etc.) from the underlying LLM when not explicitly set.
- Adding and updating unit tests to verify property inheritance.

No breaking changes; these updates enhance reliability and maintainability.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-11-07 14:29:15 -05:00
Mohammad Mohtashim	cf595dcc38	chore(langchain): Support for SystemMessage in create_agent (#33640 ) - Description: Updated Function Signature of `create_agent`, the system prompt can be both a list and string. I see no harm in doing this, since SystemMessage accepts both. - Issue: #33630 --------- Co-authored-by: Sydney Runkle <54324534+sydney-runkle@users.noreply.github.com>	2025-11-07 13:00:38 -06:00
Copilot	d27211cfa7	fix(core): context preservation in shielded async callbacks (#32163)

The `@shielded` decorator in async callback managers was not preserving context variables, breaking OpenTelemetry instrumentation and other context-dependent functionality.

## Problem

When using async callbacks with the `@shielded` decorator (applied to methods like `on_llm_end`, `on_chain_end`, etc.), context variables were not being preserved across the shield boundary. This caused issues with:

- OpenTelemetry span context propagation
- Other instrumentation that relies on context variables
- Inconsistent context behavior between sync and async execution

The issue was reproducible with:

```python
from contextvars import copy_context
import asyncio
from langgraph.graph import StateGraph

# Sync case: context remains consistent
print("SYNC")
print(copy_context())  # Same object
graph.invoke({"result": "init"})
print(copy_context())  # Same object

# Async case: context was inconsistent (before fix)
print("ASYNC")
asyncio.run(graph.ainvoke({"result": "init"}))
print(copy_context())  # Different object than expected
```

## Root Cause

The original `shielded` decorator implementation:

```python
async def wrapped(*args: Any, **kwargs: Any) -> Any:
    return await asyncio.shield(func(*args, **kwargs))
```

used `asyncio.shield()` directly without preserving the current execution context, causing context variables to be lost.

## Solution

Modified the `shielded` decorator to:

1. Capture the current context using `copy_context()`
2. Create a task with explicit context using `asyncio.create_task(coro, context=ctx)` for Python 3.11+
3. Shield the context-aware task
4. Fall back to regular task creation for Python < 3.11

```python
async def wrapped(*args: Any, **kwargs: Any) -> Any:
    # Capture the current context to preserve context variables
    ctx = copy_context()
    coro = func(*args, **kwargs)
    try:
        # Create a task with the captured context to preserve context variables
        task = asyncio.create_task(coro, context=ctx)
        return await asyncio.shield(task)
    except TypeError:
        # Python < 3.11 fallback
        task = asyncio.create_task(coro)
        return await asyncio.shield(task)
```

## Testing

- Added comprehensive test `test_shielded_callback_context_preservation()` that validates context variables are preserved across shielded callback boundaries
- Verified the fix resolves the original LangGraph context consistency issue
- Confirmed all existing callback manager tests still pass
- Validated OpenTelemetry-like instrumentation scenarios work correctly

The fix is minimal, maintains backward compatibility, and ensures proper context preservation for both modern Python versions and older ones.

Fixes #31398.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-11-07 13:09:47 -05:00
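The approach described in commit d27211cfa7 can be tried in isolation with a small, self-contained script. This is a sketch under stated assumptions rather than the `langchain_core` code: the `request_id` variable, the `on_end` callback, and `main` are hypothetical names, and the `try` block is narrowed to the `create_task` call so that a `TypeError` raised inside the callback itself is not mistaken for a missing `context` parameter.

```python
import asyncio
import functools
from contextvars import ContextVar, copy_context
from typing import Any, Awaitable, Callable

# Hypothetical context variable standing in for e.g. a tracing span.
request_id: ContextVar[str] = ContextVar("request_id", default="unset")


def shielded(func: Callable[..., Awaitable[Any]]) -> Callable[..., Awaitable[Any]]:
    """Shield an async callable from cancellation while keeping its context."""

    @functools.wraps(func)
    async def wrapped(*args: Any, **kwargs: Any) -> Any:
        ctx = copy_context()  # snapshot of the caller's context variables
        coro = func(*args, **kwargs)
        try:
            # Python 3.11+: run the task inside the captured context.
            task = asyncio.create_task(coro, context=ctx)
        except TypeError:
            # Older Pythons do not accept the `context` keyword.
            task = asyncio.create_task(coro)
        return await asyncio.shield(task)

    return wrapped


@shielded
async def on_end() -> str:
    # Reads the value that the caller set before invoking the callback.
    return request_id.get()


async def main() -> str:
    request_id.set("req-42")
    return await on_end()


result = asyncio.run(main())
```

Because `asyncio.shield` wraps a task rather than a bare coroutine here, cancelling the caller does not cancel the callback, and the callback still sees the caller's context variables.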
Swastik-Swarup-Dash	ca1a3fbe88	fix(core): `RunnablePick` may not return a dict if keys is a string (#31321)

Change made

From:

```python
class RunnablePick(RunnableSerializable[dict[str, Any], dict[str, Any]]):
```

To:

```python
class RunnablePick(RunnableSerializable[dict[str, Any], Any]):
```

As suggested by @cbornet

Fixes #31309

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-11-07 13:04:20 -05:00
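The reason for widening the output type in commit ca1a3fbe88 is easier to see with a simplified stand-in. The `pick` function below is hypothetical and is not the `RunnablePick` implementation; it only assumes, as the commit title states, that a single string key yields the bare value while a list of keys yields a dict.

```python
from typing import Any, Dict, List, Union


def pick(data: Dict[str, Any], keys: Union[str, List[str]]) -> Any:
    """Return one value for a string key, or a sub-dict for a list of keys.

    Simplified illustration: the return type must be `Any`, because the
    string-key case does not return a dict.
    """
    if isinstance(keys, str):
        return data.get(keys)
    return {key: data[key] for key in keys if key in data}


data = {"name": "Ada", "age": 36, "city": "London"}
single = pick(data, "name")
several = pick(data, ["name", "city"])
```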


8034 Commits