Kaparthy Reddy 2d4f00a451 fix(openai): Respect 300k token limit for embeddings API requests (#33668)
## Description

Fixes #31227 - Resolves the issue where `OpenAIEmbeddings` exceeds
OpenAI's 300,000-token-per-request limit, causing 400 BadRequest errors.

## Problem

When embedding large document sets, LangChain would send batches
containing more than 300,000 tokens in a single API request, causing
this error:
```
openai.BadRequestError: Error code: 400 - {'error': {'message': 'Requested 673477 tokens, max 300000 tokens per request'}}
```

The issue occurred because:
- The code chunks texts by `embedding_ctx_length` (8,191 tokens per
chunk)
- It then batches chunks by `chunk_size` (default 1,000 chunks per request)
- **But it never checked** the total tokens per batch against OpenAI's 300k
limit
- Result: `1,000 chunks × 8,191 tokens = 8,191,000 tokens` per request in
the worst case, far over the limit (see the arithmetic sketch after this list)
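
A minimal sketch of that worst-case arithmetic (the constants are from this
PR; the variable names are illustrative only):

```python
# Worst-case batch size under the old fixed-size batching.
EMBEDDING_CTX_LENGTH = 8191        # max tokens per chunk
CHUNK_SIZE = 1000                  # default chunks per API request
MAX_TOKENS_PER_REQUEST = 300_000   # OpenAI's per-request cap

worst_case_tokens = CHUNK_SIZE * EMBEDDING_CTX_LENGTH  # 8,191,000 tokens
assert worst_case_tokens > MAX_TOKENS_PER_REQUEST      # ~27x over the cap
```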

## Solution

This PR implements dynamic batching that respects the 300k token limit:

1. **Added constant**: `MAX_TOKENS_PER_REQUEST = 300000`
2. **Track token counts**: Calculate actual tokens for each chunk
3. **Dynamic batching**: Instead of fixed `chunk_size` batches,
accumulate chunks until approaching the 300k limit (see the sketch after
this list)
4. **Applied to both sync and async**: Fixed both
`_get_len_safe_embeddings` and `_aget_len_safe_embeddings`
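
A rough sketch of the batching idea (an illustration of the approach, not the
exact diff; `_iter_token_batches` is a hypothetical helper, and `tokens`
stands for the per-chunk token lists the existing tiktoken step already
produces):

```python
MAX_TOKENS_PER_REQUEST = 300_000


def _iter_token_batches(tokens: list[list[int]]):
    """Yield batches of chunks whose combined token count stays under the cap.

    Hypothetical helper; in the PR the logic lives inline in
    `_get_len_safe_embeddings` / `_aget_len_safe_embeddings`.
    """
    batch: list[list[int]] = []
    batch_tokens = 0
    for chunk in tokens:
        # Flush the current batch if adding this chunk would cross the limit.
        if batch and batch_tokens + len(chunk) > MAX_TOKENS_PER_REQUEST:
            yield batch
            batch, batch_tokens = [], 0
        batch.append(chunk)
        batch_tokens += len(chunk)
    if batch:
        yield batch  # final partial batch
```

Accumulating until the next chunk would cross the limit, rather than cutting
at a fixed count, keeps each request as full as possible while never
exceeding the cap.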

## Changes

- Modified `langchain_openai/embeddings/base.py`:
  - Added `MAX_TOKENS_PER_REQUEST` constant
  - Replaced fixed-size batching with token-aware dynamic batching
  - Applied to both sync (line ~478) and async (line ~527) methods
- Added test in `tests/unit_tests/embeddings/test_base.py`:
  - `test_embeddings_respects_token_limit()` - Verifies large document
sets are properly batched

## Testing

All existing tests pass (280 passed, 4 xfailed, 1 xpassed).

The new test verifies that:
- Large document sets (500 texts × 1,000 tokens = 500k tokens) are split
into multiple API calls
- Each API call respects the 300k token limit (see the sketch below)
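
A self-contained check of the batching sketch above captures the same
invariant (the PR's actual test patches the OpenAI client instead; the
numbers match the description above):

```python
def test_batches_stay_under_limit():
    # 500 chunks of 1,000 tokens each = 500k tokens total.
    tokens = [[0] * 1000 for _ in range(500)]
    batches = list(_iter_token_batches(tokens))
    assert len(batches) > 1  # split across multiple API calls
    for batch in batches:
        assert sum(len(chunk) for chunk in batch) <= 300_000
```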

## Usage

After this fix, users can embed large document sets without errors:
```python
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_text_splitters import CharacterTextSplitter

# `large_documents` is a list of Document objects loaded elsewhere
# This will now work without exceeding the per-request token limit
embeddings = OpenAIEmbeddings()
documents = CharacterTextSplitter().split_documents(large_documents)
Chroma.from_documents(documents, embeddings)
```

Resolves #31227

---------

Co-authored-by: Kaparthy Reddy <kaparthyreddy@Kaparthys-MacBook-Air.local>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>

The platform for reliable agents.


LangChain is a framework for building agents and LLM-powered applications. It helps you chain together interoperable components and third-party integrations to simplify AI application development, all while future-proofing decisions as the underlying technology evolves.

```bash
pip install langchain
```

If you're looking for more advanced customization or agent orchestration, check out LangGraph, our framework for building controllable agent workflows.


Documentation:

Discussions: Visit the LangChain Forum to connect with the community and share all of your technical questions, ideas, and feedback.

**Note:** Looking for the JS/TS library? Check out LangChain.js.

## Why use LangChain?

LangChain helps developers build applications powered by LLMs through a standard interface for models, embeddings, vector stores, and more.

Use LangChain for:

  • Real-time data augmentation. Easily connect LLMs to diverse data sources and external/internal systems, drawing from LangChain's vast library of integrations with model providers, tools, vector stores, retrievers, and more.
  • Model interoperability. Swap models in and out as your engineering team experiments to find the best choice for your application's needs. As the industry frontier evolves, adapt quickly: LangChain's abstractions keep you moving without losing momentum.
  • Rapid prototyping. Quickly build and iterate on LLM applications with LangChain's modular, component-based architecture. Test different approaches and workflows without rebuilding from scratch, accelerating your development cycle.
  • Production-ready features. Deploy reliable applications with built-in support for monitoring, evaluation, and debugging through integrations like LangSmith. Scale with confidence using battle-tested patterns and best practices.
  • Vibrant community and ecosystem. Leverage a rich ecosystem of integrations, templates, and community-contributed components. Benefit from continuous improvements and stay up-to-date with the latest AI developments through an active open-source community.
  • Flexible abstraction layers. Work at the level of abstraction that suits your needs - from high-level chains for quick starts to low-level components for fine-grained control. LangChain grows with your application's complexity.

## LangChain ecosystem

While the LangChain framework can be used standalone, it also integrates seamlessly with any LangChain product, giving developers a full suite of tools when building LLM applications.

To improve your LLM application development, pair LangChain with:

  • LangGraph: Build agents that can reliably handle complex tasks with LangGraph, our low-level agent orchestration framework. LangGraph offers customizable architecture, long-term memory, and human-in-the-loop workflows, and is trusted in production by companies like LinkedIn, Uber, Klarna, and GitLab.
  • Integrations: A list of LangChain integrations, including chat & embedding models, tools & toolkits, and more.
  • LangSmith: Helpful for agent evals and observability. Debug poor-performing LLM app runs, evaluate agent trajectories, gain visibility in production, and improve performance over time.
  • LangSmith Deployment: Deploy and scale agents effortlessly with a purpose-built deployment platform for long-running, stateful workflows. Discover, reuse, configure, and share agents across teams, and iterate quickly with visual prototyping in LangSmith Studio.
  • Deep Agents (new!): Build agents that can plan, use subagents, and leverage file systems for complex tasks.

## Additional resources

  • API Reference: Detailed reference on navigating base packages and integrations for LangChain.
  • Contributing Guide: Learn how to contribute to LangChain projects and find good first issues.
  • Code of Conduct: Our community guidelines and standards for participation.