- **Description:** Changing the key from `response` to
`structured_response` for middleware agent to keep it sync with agent
without middleware. This a breaking change.
- **Issue:** #33154
Porting the [planning
middleware](39c0138d0f/src/deepagents/middleware.py (L21))
over from deepagents.
Also adding the ability to configure:
* System prompt
* Tool description
```py
from langchain.agents.middleware.planning import PlanningMiddleware
from langchain.agents import create_agent
agent = create_agent("openai:gpt-4o", middleware=[PlanningMiddleware()])
result = await agent.invoke({"messages": [HumanMessage("Help me refactor my codebase")]})
print(result["todos"]) # Array of todo items with status tracking
```
Multiple improvements to HITL flow:
* On a `response` type resume, we should still append the tool call to
the last AIMessage (otherwise we have a ToolResult without a
corresponding ToolCall)
* When all interrupts have `response` types (so there's no pending tool
calls), we should jump back to the first node (instead of end) as we
enforced in the previous `post_model_hook_router`
* Added comments to `model_to_tools` router so clarify all of the
potential exit conditions
Additionally:
* Lockfile update to use latest LG alpha release
* Added test for `jump_to` behaving ephemerally, this was fixed in LG
but surfaced as a bug w/ `jump_to`.
* Bump version to v1.0.0a10 to prep for alpha release
---------
Co-authored-by: Sydney Runkle <sydneymarierunkle@gmail.com>
Co-authored-by: Sydney Runkle <54324534+sydney-runkle@users.noreply.github.com>
Remove redundant/outdated `@pytest.mark.requires("jinja2")` decorator
Pytest marks (like `@pytest.mark.requires(...)`) applied directly to
fixtures have no effect and are deprecated.
Excluded pydantic_v1 module from import testing
Acceptable since this pydantic_v1 is explicitly deprecated. Testing its
importability at this stage serves little purpose since users should
migrate away from it.
## Summary
Adds test coverage for the `stringify_value` utility function to handle
complex nested data structures that weren't previously tested.
## Changes
- Added `test_stringify_value_nested_structures()` to `test_strings.py`
- Tests nested dictionaries within lists
- Tests mixed-type lists with various data types
- Verifies proper stringification of complex nested structures
## Why This Matters
- Fills a gap in test coverage for edge cases
- Ensures `stringify_value` handles complex data structures correctly
- Improves confidence in string utility functions used throughout the
codebase
- Low risk addition that strengthens existing test suite
## Testing
```bash
uv run --group test pytest libs/core/tests/unit_tests/utils/test_strings.py::test_stringify_value_nested_structures -v
```
This test addition follows the project's testing patterns and adds
meaningful coverage without introducing any breaking changes.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Enhance the pull request workflows by updating the `pull_request_target`
types and ensuring safety by avoiding checkout of the PR's head. Update
the action to use a specific commit from the archived repository.
**Description:** Right now, we interrupt even if the provided ToolConfig
has all false values. We should ignore ToolConfigs which do not have at
least one value marked as true (just as we would if tool_name: False was
passed into the dict).
# Main Changes
1. Adding decorator utilities for dynamically defining middleware with
single hook functions (see an example below for dynamic system prompt)
2. Adding better conditional edge drawing with jump configuration
attached to middleware. Can be registered w/ the decorator new
decorator!
## Decorator Utilities
```py
from langchain.agents.middleware_agent import create_agent, AgentState, ModelRequest
from langchain.agents.middleware.types import modify_model_request
from langchain_core.messages import HumanMessage
from langgraph.checkpoint.memory import InMemorySaver
@modify_model_request
def modify_system_prompt(request: ModelRequest, state: AgentState) -> ModelRequest:
request.system_prompt = (
"You are a helpful assistant."
f"Please record the number of previous messages in your response: {len(state['messages'])}"
)
return request
agent = create_agent(
model="openai:gpt-4o-mini",
middleware=[modify_system_prompt]
).compile(checkpointer=InMemorySaver())
```
## Visualization and Routing improvements
We now require that middlewares define the valid jumps for each hook.
If using the new decorator syntax, this can be done with:
```py
@before_model(jump_to=["__end__"])
@after_model(jump_to=["tools", "__end__"])
```
If using the subclassing syntax, you can use these two class vars:
```py
class MyMiddlewareAgentMiddleware):
before_model_jump_to = ["__end__"]
after_model_jump_to = ["tools", "__end__"]
```
Open for debate if we want to bundle these in a single jump map / config
for a middleware. Easy to migrate later if we decide to add more hooks.
We will need to **really clearly document** that these must be
explicitly set in order to enable conditional edges.
Notice for the below case, `Middleware2` does actually enable jumps.
<table>
<thead>
<tr>
<th>Before (broken), adding conditional edges unconditionally</th>
<th>After (fixed), adding conditional edges sparingly</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<img width="619" height="508" alt="Screenshot 2025-09-23 at 10 23 23 AM"
src="https://github.com/user-attachments/assets/bba2d098-a839-4335-8e8c-b50dd8090959"
/>
</td>
<td>
<img width="469" height="490" alt="Screenshot 2025-09-23 at 10 23 13 AM"
src="https://github.com/user-attachments/assets/717abf0b-fc73-4d5f-9313-b81247d8fe26"
/>
</td>
</tr>
</tbody>
</table>
<details>
<summary>Snippet for the above</summary>
```py
from typing import Any
from langchain.agents.tool_node import InjectedState
from langgraph.runtime import Runtime
from langchain.agents.middleware.types import AgentMiddleware, AgentState
from langchain.agents.middleware_agent import create_agent
from langchain_core.tools import tool
from typing import Annotated
from langchain_core.messages import HumanMessage
from typing_extensions import NotRequired
@tool
def simple_tool(input: str) -> str:
"""A simple tool."""
return "successful tool call"
class Middleware1(AgentMiddleware):
"""Custom middleware that adds a simple tool."""
tools = [simple_tool]
def before_model(self, state: AgentState, runtime: Runtime) -> None:
return None
def after_model(self, state: AgentState, runtime: Runtime) -> None:
return None
class Middleware2(AgentMiddleware):
before_model_jump_to = ["tools", "__end__"]
def before_model(self, state: AgentState, runtime: Runtime) -> None:
return None
def after_model(self, state: AgentState, runtime: Runtime) -> None:
return None
class Middleware3(AgentMiddleware):
def before_model(self, state: AgentState, runtime: Runtime) -> None:
return None
def after_model(self, state: AgentState, runtime: Runtime) -> None:
return None
builder = create_agent(
model="openai:gpt-4o-mini",
middleware=[Middleware1(), Middleware2(), Middleware3()],
system_prompt="You are a helpful assistant.",
)
agent = builder.compile()
```
</details>
## More Examples
### Guardrails `after_model`
<img width="379" height="335" alt="Screenshot 2025-09-23 at 10 40 09 AM"
src="https://github.com/user-attachments/assets/45bac7dd-398e-45d1-ae58-6ecfa27dfc87"
/>
<details>
<summary>Code</summary>
```py
from langchain.agents.middleware_agent import create_agent, AgentState, ModelRequest
from langchain.agents.middleware.types import after_model
from langchain_core.messages import HumanMessage, AIMessage
from langgraph.checkpoint.memory import InMemorySaver
from typing import cast, Any
@after_model(jump_to=["model", "__end__"])
def after_model_hook(state: AgentState) -> dict[str, Any]:
"""Check the last AI message for safety violations."""
last_message_content = cast(AIMessage, state["messages"][-1]).content.lower()
print(last_message_content)
unsafe_keywords = ["pineapple"]
if any(keyword in last_message_content for keyword in unsafe_keywords):
# Jump back to model to regenerate response
return {"jump_to": "model", "messages": [HumanMessage("Please regenerate your response, and don't talk about pineapples. You can talk about apples instead.")]}
return {"jump_to": "__end__"}
# Create agent with guardrails middleware
agent = create_agent(
model="openai:gpt-4o-mini",
middleware=[after_model_hook],
system_prompt="Keep your responses to one sentence please!"
).compile()
# Test with potentially unsafe input
result = agent.invoke(
{"messages": [HumanMessage("Tell me something about pineapples")]},
)
for msg in result["messages"]:
print(msg.pretty_print())
"""
================================ Human Message =================================
Tell me something about pineapples
None
================================== Ai Message ==================================
Pineapples are tropical fruits known for their sweet, tangy flavor and distinctive spiky exterior.
None
================================ Human Message =================================
Please regenerate your response, and don't talk about pineapples. You can talk about apples instead.
None
================================== Ai Message ==================================
Apples are popular fruits that come in various varieties, known for their crisp texture and sweetness, and are often used in cooking and baking.
None
"""
```
</details>
Mostly adding a descriptive frontmatter to workflow files. Also address
some formatting and outdated artifacts
No functional changes outside of
[d5457c3](d5457c39ee),
[90708a0](90708a0d99),
and
[338c82d](338c82d21e)
The file-based and title-based labeler workflows were conflicting,
causing the bot to add and remove identical labels in the same
operation. Hopefully this fixes
- Removes Codespell from deps, docs, and `Makefile`s
- Python version requirements in all `pyproject.toml` files now use the
`~=` (compatible release) specifier
- All dependency groups and main dependencies now use explicit lower and
upper bounds, reducing potential for breaking changes
We want state schema as the input schema to middleware nodes because the
conditional edges after these nodes need access to the full state.
Also, we just generally want all state passed to middleware nodes, so we
should be specifying this explicitly. If we don't, the state annotations
used by users in their node signatures are used (so they might be
missing fields).
# Changes
## Adds support for `DynamicSystemPromptMiddleware`
```py
from langchain.agents.middleware import DynamicSystemPromptMiddleware
from langgraph.runtime import Runtime
from typing_extensions import TypedDict
class Context(TypedDict):
user_name: str
def system_prompt(state: AgentState, runtime: Runtime[Context]) -> str:
user_name = runtime.context.get("user_name", "n/a")
return f"You are a helpful assistant. Always address the user by their name: {user_name}"
middleware = DynamicSystemPromptMiddleware(system_prompt)
```
## Adds support for `runtime` in middleware hooks
```py
class AgentMiddleware(Generic[StateT, ContextT]):
def modify_model_request(
self,
request: ModelRequest,
state: StateT,
runtime: Runtime[ContextT], # Optional runtime parameter
) -> ModelRequest:
# upgrade model if runtime.context.subscription is `top-tier` or whatever
```
## Adds support for omitting state attributes from input / output
schemas
```py
from typing import Annotated, NotRequired
from langchain.agents.middleware.types import PrivateStateAttr, OmitFromInput, OmitFromOutput
class CustomState(AgentState):
# Private field - not in input or output schemas
internal_counter: NotRequired[Annotated[int, PrivateStateAttr]]
# Input-only field - not in output schema
user_input: NotRequired[Annotated[str, OmitFromOutput]]
# Output-only field - not in input schema
computed_result: NotRequired[Annotated[str, OmitFromInput]]
```
## Additionally
* Removes filtering of state before passing into middleware hooks
Typing is not foolproof here, still need to figure out some of the
generics stuff w/ state and context schema extensions for middleware.
TODO:
* More docs for middleware, should hold off on this until other prios
like MCP and deepagents are met
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
## Summary
This PR fixes several bugs and improves the example code in
`BaseChatMessageHistory` docstring that would prevent it from working
correctly.
### Bugs Fixed
- **Critical bug**: Fixed `json.dump(messages, f)` →
`json.dump(serialized, f)` - was using wrong variable
- **NameError**: Fixed bare variable references to use
`self.storage_path` and `self.session_id`
- **Missing imports**: Added required imports (`json`, `os`, message
converters) to make example runnable
### Improvements
- Added missing type hints following project standards (`messages() ->
list[BaseMessage]`, `clear() -> None`)
- Added robust error handling with `FileNotFoundError` exception
handling
- Added directory creation with `os.makedirs(exist_ok=True)` to prevent
path errors
- Improved performance: `json.load(f)` instead of `json.loads(f.read())`
- Added explicit UTF-8 encoding to all file operations
- Updated stores.py to use modern union syntax (`int | None` vs
`Optional[int]`)
### Test Plan
- [x] Code passes linting (`ruff check`)
- [x] Example code now has all required imports and proper syntax
- [x] Fixed variable references prevent runtime errors
- [x] Follows project's type annotation standards
The example code in the docstring is now fully functional and follows
LangChain's coding standards.
---------
Co-authored-by: sadiqkhzn <sadiqkhzn@users.noreply.github.com>
- **Description:** Updated the dead/unreachable links to Docling from
the additional resources section of the langchain-docling docs
- **Issue:** Fixes langchain-ai/docs/issues/574
- **Dependencies:** None
# Main changes / new features
## Better support for parallel tool calls
1. Support for multiple tool calls requiring human input
2. Support for combination of tool calls requiring human input + those
that are auto-approved
3. Support structured output w/ tool calls requiring human input
4. Support structured output w/ standard tool calls
## Shortcut for allowed actions
Adds a shortcut where tool config can be specified as a `bool`, meaning
"all actions allowed"
```py
HumanInTheLoopMiddleware(tool_configs={"expensive_tool": True})
```
## A few design decisions here
* We only raise one interrupt w/ all `HumanInterrupt`s, currently we
won't be able to execute all tools until all of these are resolved. This
isn't super blocking bc we can't re-invoke the model until all tools
have finished execution. That being said, if you have a long running
auto-approved tool, this could slow things down.
## TODOs
* Ideally, we would rename `accept` -> `approve`
* Ideally, we would rename `respond` -> `reject`
* Docs update (@sydney-runkle to own)
* In another PR I'd like to refactor testing to have one file for each
prebuilt middleware :)
Fast follow to https://github.com/langchain-ai/langchain/pull/32962
which was deemed as too breaking
Adds documentation for the integration langchain-scraperapi, which
contains 3 tools using the ScraperAPI service.
The tools give AI agents the ability to
Scrape the web and return HTML/text/markdown
Perform Google search and return json output
Perform Amazon search and return json output
For reference, here is the official repo for langchain_scraperapi:
https://github.com/scraperapi/langchain-scraperapi
Replaced `input_message` parameter with a directly called tuple, e.g.
`{"messages": [("user", "What is my name?")]}`
Before, the memory function wasn't working with the agent, using the
format of the input_message parameter.
Specifically, on page [Build an
Agent#adding-in-memory](https://python.langchain.com/docs/tutorials/agents/#adding-in-memory)
In the previous code, the query "What's my name?" wasn't working, as the
agent could not recall memory correctly.
<img width="860" height="679" alt="image"
src="https://github.com/user-attachments/assets/dfbca21e-ffe9-4645-a810-3be7a46d81d5"
/>
This PR improves navigation in the summarization how-to section by
adding
cross-links from the single-call guide to the related map-reduce and
refine
guides. This mirrors the docs style guide’s emphasis on clear
cross-references
and should help readers discover the appropriate pattern for longer
texts.
- Source edited: docs/docs/how_to/summarize_stuff.ipynb
- Links added:
- /docs/how_to/summarize_map_reduce/
- /docs/how_to/summarize_refine/
Type: docs-only (no code changes)
Description:
Add a docstring to _load_map_reduce_chain in chains/summarize/ to
explain the purpose of the prompt argument and document function
parameters. This addresses an existing TODO in the codebase.
Issue:
N/A (documentation improvement only)
Dependencies:
None
**Description:**
Add a docstring to `_load_stuff_chain` in `chains/summarize/` to explain
the purpose of the `prompt` argument and document function parameters.
This addresses an existing TODO in the codebase.
**Issue:**
N/A (documentation improvement only)
**Dependencies:**
None
Bumps [CodSpeedHQ/action](https://github.com/codspeedhq/action) from 3
to 4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/codspeedhq/action/releases">CodSpeedHQ/action's
releases</a>.</em></p>
<blockquote>
<h2>v4.0.0</h2>
<h2>💥 BREAKING</h2>
<p>It's now required to explicitly set the runner mode to
<code>instrumentation</code> or <code>walltime</code> using either:</p>
<ul>
<li>the <code>mode</code> argument</li>
<li>or the <code>CODSPEED_RUNNER_MODE</code> environment variable</li>
</ul>
<blockquote>
<p>[!TIP]
Before, this variable was automatically set to
<code>instrumentation</code> on every runner except for <a
href="https://codspeed.io/docs/instruments/walltime">CodSpeed macro
runners</a> where it was set to <code>walltime</code> by default.</p>
</blockquote>
<p>Find more details in <a
href="https://codspeed.io/docs/instruments">the instruments
documentation</a>.</p>
<h2>Details</h2>
<h3><!-- raw HTML omitted -->🚀 Features</h3>
<ul>
<li>Make perf profiling enabled by default by <a
href="https://github.com/GuillaumeLagrange"><code>@GuillaumeLagrange</code></a>
in <a
href="https://redirect.github.com/CodSpeedHQ/runner/pull/110">#110</a></li>
<li>Make the runner mode argument required by <a
href="https://github.com/GuillaumeLagrange"><code>@GuillaumeLagrange</code></a></li>
<li>Use introspected node in walltime mode by <a
href="https://github.com/GuillaumeLagrange"><code>@GuillaumeLagrange</code></a>
in <a
href="https://redirect.github.com/CodSpeedHQ/runner/pull/108">#108</a></li>
<li>Add instrumented go shell script by <a
href="https://github.com/not-matthias"><code>@not-matthias</code></a>
in <a
href="https://redirect.github.com/CodSpeedHQ/runner/pull/102">#102</a></li>
</ul>
<h3><!-- raw HTML omitted -->🐛 Bug Fixes</h3>
<ul>
<li>Compute proper load bias by <a
href="https://github.com/not-matthias"><code>@not-matthias</code></a>
in <a
href="https://redirect.github.com/CodSpeedHQ/runner/pull/107">#107</a></li>
<li>Increase timeout for first perf ping by <a
href="https://github.com/GuillaumeLagrange"><code>@GuillaumeLagrange</code></a></li>
<li>Prevent running with valgrind by <a
href="https://github.com/not-matthias"><code>@not-matthias</code></a>
in <a
href="https://redirect.github.com/CodSpeedHQ/runner/pull/106">#106</a></li>
</ul>
<h3><!-- raw HTML omitted -->🏗️ Refactor</h3>
<ul>
<li>Change go-runner binary name by <a
href="https://github.com/not-matthias"><code>@not-matthias</code></a>
in <a
href="https://redirect.github.com/CodSpeedHQ/runner/pull/111">#111</a></li>
</ul>
<p><strong>Full Runner Changelog</strong>: <a
href="https://github.com/CodSpeedHQ/runner/blob/main/CHANGELOG.md">https://github.com/CodSpeedHQ/runner/blob/main/CHANGELOG.md</a></p>
<h2>v3.8.1</h2>
<h2>What's Changed</h2>
<h3><!-- raw HTML omitted -->🐛 Bug Fixes</h3>
<ul>
<li>Don't show error when libpython is not found by <a
href="https://github.com/not-matthias"><code>@not-matthias</code></a></li>
</ul>
<h3><!-- raw HTML omitted -->🏗️ Refactor</h3>
<ul>
<li>Improve conditional compilation in
<code>get_pipe_open_options</code> by <a
href="https://github.com/art049"><code>@art049</code></a> in <a
href="https://redirect.github.com/CodSpeedHQ/runner/pull/100">#100</a></li>
</ul>
<h3><!-- raw HTML omitted -->⚙️ Internals</h3>
<ul>
<li>Change log level to warn for venv_compat error by <a
href="https://github.com/not-matthias"><code>@not-matthias</code></a>
in <a
href="https://redirect.github.com/CodSpeedHQ/runner/pull/104">#104</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/CodSpeedHQ/action/compare/v3.8.0...v3.8.1">https://github.com/CodSpeedHQ/action/compare/v3.8.0...v3.8.1</a>
<strong>Full Runner Changelog</strong>: <a
href="https://github.com/CodSpeedHQ/runner/blob/main/CHANGELOG.md">https://github.com/CodSpeedHQ/runner/blob/main/CHANGELOG.md</a></p>
<h2>v3.8.0</h2>
<h2>What's Changed</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="653fdc30e6"><code>653fdc3</code></a>
Release v4.0.1 🚀</li>
<li><a
href="4da7be1bda"><code>4da7be1</code></a>
chore: bump runner version to 4.0.1</li>
<li><a
href="172d6c5630"><code>172d6c5</code></a>
chore: make the comment about input validation more discrete</li>
<li><a
href="d15e1ce813"><code>d15e1ce</code></a>
chore: improve the release script</li>
<li><a
href="6eeb021fd0"><code>6eeb021</code></a>
Release v4.0.0 🚀</li>
<li><a
href="74312dabbe"><code>74312da</code></a>
chore: improve the release script</li>
<li><a
href="8a17a350a8"><code>8a17a35</code></a>
ci: add modes to the matrix</li>
<li><a
href="8e3f02a649"><code>8e3f02a</code></a>
feat: make the mode argument required</li>
<li><a
href="97c7a6f5fc"><code>97c7a6f</code></a>
chore: bump runner version to 4.0.0</li>
<li><a
href="8a4cadd026"><code>8a4cadd</code></a>
chore: point the changelog to the runner</li>
<li>See full diff in <a
href="https://github.com/codspeedhq/action/compare/v3...v4">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
## Description
This PR adds documentation for the new ZeusDB vector store integration
with LangChain.
## Motivation
ZeusDB is a high-performance vector database (Python/Rust backend)
designed for AI applications that need fast similarity search and
real-time vector ops. This integration brings ZeusDB's capabilities to
the LangChain ecosystem, giving developers another production-oriented
option for vector storage and retrieval.
**Key Features:**
- **User-Friendly Python API**: Intuitive interface that integrates
seamlessly with Python ML workflows
- **High Performance**: Powered by a robust Rust backend for
lightning-fast vector operations
- **Enterprise Logging**: Comprehensive logging capabilities for
monitoring and debugging production systems
- **Advanced Features**: Includes product quantization and persistence
capabilities
- **AI-Optimized**: Purpose-built for modern AI applications and RAG
pipelines
## Changes
- Added provider documentation:
`docs/docs/integrations/providers/zeusdb.mdx` (installation, setup).
- Added vector store documentation:
`docs/docs/integrations/vectorstores/zeusdb.ipynb` (quickstart for
creating/querying a ZeusDBVectorStore).
- Registered langchain-zeusdb in `libs/packages.yml` for discovery.
## Target users
- AI/ML engineers building RAG pipelines
- Data scientists working with large document collections
- Developers needing high-throughput vector search
- Teams requiring near real-time vector operations
## Testing
- Followed LangChain's "How to add standard tests to an integration"
guidance.
- Code passes format, lint, and test checks locally.
- Tested with LangChain Core 0.3.74
- Works with Python 3.10 to 3.13
## Package Information
**PyPI:** https://pypi.org/project/langchain-zeusdb
**Github:** https://github.com/ZeusDB/langchain-zeusdb
## Summary
- Add comprehensive type hints to the MyInMemoryStore example code in
BaseStore docstring
- Improve documentation quality and educational value for developers
- Align with LangChain's coding standards requiring type hints on all
Python code
## Changes Made
- Added return type annotations to all methods (__init__, mget, mset,
mdelete, yield_keys)
- Added parameter type annotations using proper generic types (Sequence,
Iterator)
- Added instance variable type annotation for the store attribute
- Used modern Python union syntax (str | None) for optional types
## Test Plan
- Verified Python syntax validity with ast.parse()
- No functional changes to actual code, only documentation improvements
- Example code now follows best practices and coding standards
This change improves the educational value of the example code and
ensures consistency with LangChain's requirement that "All Python code
MUST include type hints and return types" as specified in the
development guidelines.
---------
Co-authored-by: sadiqkhzn <sadiqkhzn@users.noreply.github.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
**Description:**
Introduces documentation notebooks for AI/ML API integration covering
the following use cases:
- Chat models (`ChatAimlapi`)
- Text completion models (`AimlapiLLM`)
- Provider usage examples
- Text embedding models (`AimlapiEmbeddings`)
Additionally, adds the `langchain-aimlapi` package entry to
`libs/packages.yml` for package management.
This PR aims to provide a comprehensive starting point for developers
integrating AI/ML API models with LangChain via the new
`langchain-aimlapi` package.
**Issue:** N/A
**Dependencies:** None
**Twitter handle:** @aimlapi
---
### **To-Do Before Submitting PR:**
* [x] Run `make format`
* [x] Run `make lint`
* [x] Confirm all documentation notebooks are in
`docs/docs/integrations/`
* [x] Double-check `libs/packages.yml` has the correct repo path
* [x] Confirm no `pyproject.toml` modifications were made unnecessarily
Co-authored-by: Mason Daugherty <mason@langchain.dev>
**Description:**
This PR updates the free searches per month from **100** to **250** and
renames SerpAPI to [SerpApi](https://serpapi.com/) to prevent confusion.
Add import API keys and enhance usage instructions in the Jupyter
notebook
**Issue:** N/A
**Dependencies:** N/A
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.
**Description:**
This PR updated links to the latest Anthropic documentation. Changes
include revised links for model overview, tool usage, web search tool,
text editor tool, and more.
**Issue:**
N/A
**Dependencies:**
None
**Twitter handle:**
N/A
- **Description:** The `langchain-yugabytedb` package implementations of
core LangChain abstractions using `YugabyteDB` Distributed SQL Database.
YugabyteDB is a cloud-native distributed PostgreSQL-compatible database
that combines strong consistency with ultra-resilience, seamless
scalability, geo-distribution, and highly flexible data locality to
deliver business-critical, transactional applications.
[YugabyteDB](https://www.yugabyte.com/ai/) combines the power of the
`pgvector` PostgreSQL extension with an inherently distributed
architecture. This future-proofed foundation helps you build GenAI
applications using RAG retrieval that demands high-performance vector
search.
- [ ] **tests and docs**:
1. `langchain-yugabytedb`
[github](https://github.com/yugabyte/langchain-yugabytedb) repo.
2. YugabyteDB VectorStore example notebook showing its use. It lives in
`langchain/docs/docs/integrations/vectorstores/yugabytedb.ipynb`
directory.
3. Running `langchain-yugabytedb` unit tests
- Setting up a Development Environment
This document details how to set up a local development environment that
will
allow you to contribute changes to the project.
Acquire sources and create virtualenv.
```shell
git clone https://github.com/yugabyte/langchain-yugabytedb
cd langchain-yugabytedb
uv venv --python=3.13
source .venv/bin/activate
```
Install package in editable mode.
```shell
uv pip install pipx
pipx install poetry
poetry install
uv pip install pytest pytest_asyncio pytest-timeout langchain-core langchain_tests sqlalchemy psycopg psycopg-binary numpy pgvector
```
Start YugabyteDB RF-1 Universe.
```shell
docker run -d --name yugabyte_node01 --hostname yugabyte01 \
-p 7000:7000 -p 9000:9000 -p 15433:15433 -p 5433:5433 -p 9042:9042 \
yugabytedb/yugabyte:2.25.2.0-b359 bin/yugabyted start --background=false \
--master_flags="allowed_preview_flags_csv=ysql_yb_enable_advisory_locks,ysql_yb_enable_advisory_locks=true" \
--tserver_flags="allowed_preview_flags_csv=ysql_yb_enable_advisory_locks,ysql_yb_enable_advisory_locks=true"
docker exec -it yugabyte_node01 bin/ysqlsh -h yugabyte01 -c "CREATE extension vector;"
```
Invoke test cases.
```shell
pytest -vvv tests/unit_tests/yugabytedb_tests
```
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**
- [x] **feat(docs)**: add Bigtable Key-value store doc
- [X] **feat(docs)**: add Bigtable Vector store doc
This PR adds a doc for Bigtable and LangChain Key-value store
integration. It contains guides on how to add, delete, get, and yield
key-value pairs from Bigtable Key-value Store for LangChain.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
# feat(integrations): Add Timbr tools integration
## DESCRIPTION
This PR adds comprehensive documentation and integration support for
Timbr's semantic layer tools in LangChain.
[Timbr](https://timbr.ai/) provides an ontology-driven semantic layer
that enables natural language querying of databases through
business-friendly concepts. It connects raw data to governed business
measures for consistent access across BI, APIs, and AI applications.
[`langchain-timbr`](https://pypi.org/project/langchain-timbr/) is a
Python SDK that extends
[LangChain](https://github.com/WPSemantix/Timbr-GenAI/tree/main/LangChain)
and
[LangGraph](https://github.com/WPSemantix/Timbr-GenAI/tree/main/LangGraph)
with custom agents, chains, and nodes for seamless integration with the
Timbr semantic layer. It enables converting natural language prompts
into optimized semantic-SQL queries and executing them directly against
your data.
**What's Added:**
- Complete integration documentation for `langchain-timbr` package
- Tool documentation page with usage examples and API reference
**Integration Components:**
- `IdentifyTimbrConceptChain` - Identify relevant concepts from user
prompts
- `GenerateTimbrSqlChain` - Generate SQL queries from natural language
- `ValidateTimbrSqlChain` - Validate queries against knowledge graph
schemas
- `ExecuteTimbrQueryChain` - Execute queries against semantic databases
- `GenerateAnswerChain` - Generate human-readable answers from results
## Documentation Added
- `/docs/integrations/providers/timbr.mdx` - Provider overview and
configuration
- `/docs/integrations/tools/timbr.ipynb` - Comprehensive tool usage
examples
## Links
- [PyPI Package](https://pypi.org/project/langchain-timbr/)
- [GitHub Repository](https://github.com/WPSemantix/langchain-timbr)
- [Official
Documentation](https://docs.timbr.ai/doc/docs/integration/langchain-sdk/)
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**
**Description:**
Add documentation for Qwen integration in LangChain, including setup
instructions, usage examples, and configuration details. Update related
qwq documentation to reflect current best practices and improve clarity
for users.
This PR enhances the documentation ecosystem by:
- Adding a new guide for integrating Qwen models
- Updating outdated or incomplete qwq documentation
- Improving structure and readability of relevant sections
**Issue:** N/A
**Dependencies:** None
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
**Description:** Adds documentation for ZenRows integration with
LangChain, including provider overview and detailed tool documentation.
ZenRows is an enterprise-grade web scraping solution that enables
LangChain agents to extract web content at scale with advanced features
like JavaScript rendering, anti-bot bypass, geo-targeting, and multiple
output formats.
This PR includes:
- Provider documentation
(`docs/docs/integrations/providers/zenrows.ipynb`)
- Tool documentation
(`docs/docs/integrations/tools/zenrows_universal_scraper.ipynb`)
- Complete usage examples and API reference links
**Issue:** N/A
**Dependencies:**
- [langchain-zenrows](https://github.com/ZenRows-Hub/langchain-zenrows)
package (external, available on
[PyPI](https://pypi.org/project/langchain-zenrows/))
- No changes to core LangChain dependencies
**LinkedIn handle:** https://www.linkedin.com/company/zenrows/
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Adding Oracle Generative AI as one of the providers for langchain.
Updated the old examples in the documentation with the new working
examples.
---------
Co-authored-by: Vishal Karwande <vishalkarwande@Vishals-MacBook-Pro.local>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
**Description:** Fixes a small typo in `_get_document_with_hash` inside
`libs/core/langchain_core/indexing/api.py`.
**Issue:** N/A (no related issue)
**Dependencies:** None
Especially helpful for the text splitters tests where we're installing
pytorch (expensive and slow slow slow). Should speed up CI by 5-10 mins.
w/o caches, CI taking 20 minutes 😨
w/ caches, CI taking 3 minutes
Taking advantage of [partial
runs](https://codspeed.io/docs/features/partial-runs)!
This should save us minutes on every CI job, we only run codspeed for
libs w/ changes and this doesn't affect benchmarking drops
Oversight when moving back to basic function call for
`modify_model_request` rather than implementation as its own node.
Basic test right now failing on main, passing on this branch
Revealed a gap in testing. Will write up a more robust test suite for
basic middleware features.
### Description
* Replace the Mermaid graph node label escaping logic
(`_escape_node_label`) with `_to_safe_id`, which converts a string into
a unique, Mermaid-compatible node id. Ensures nodes with special
characters always render correctly.
**Before**
* Invalid characters (e.g. `开`) replaced with `_`. Causes collisions
between nodes with names that are the same length and contain all
non-safe characters:
```python
_escape_node_label("开") # '_'
_escape_node_label("始") # '_' same as above, but different character passed in. not a unique mapping.
```
**After**
```python
_to_safe_id("开") # \5f00
_to_safe_id("始") # \59cb unique!
```
### Tests
* Rename `test_graph_mermaid_escape_node_label()` to
`test_graph_mermaid_to_safe_id()` and update function logic to use
`_to_safe_id`
* Add `test_graph_mermaid_special_chars()`
### Issue
Fixeslangchain-ai/langgraph#6036
Reusable workflows are not currently supported by PyPI's Trusted
Publishing
functionality, and are subject to breakage. Users are strongly
encouraged
to avoid using reusable workflows for Trusted Publishing until support
becomes official. Please, do not report bugs if this breaks.
Description: Fixes a bug in RunnableRetry where .batch / .abatch could
return misordered outputs (e.g. inputs [0,1,2] yielding [1,1,2]) when
some items succeeded on an earlier attempt and others were retried. Root
cause: successful results were stored keyed by the index within the
shrinking “pending” subset rather than the original input index, causing
collisions and reordered/duplicated outputs after retries. Fix updates
_batch and _abatch to:
- Track remaining original indices explicitly.
- Call underlying batch/abatch only on remaining inputs.
- Map results back to original indices.
- Preserve final ordering by reconstructing outputs in original
positional order.
Issue: Fixes#21326
Tests:
- Added regression tests: test_retry_batch_preserves_order and
test_async_retry_batch_preserves_order asserting correct ordering after
a single controlled failure + retry.
- Existing retry tests still pass.
Dependencies:
- None added or changed.
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
This removes langchain-experimental from api reference.
We do not recommend it to users for production use cases, so let's also
deprecate it from documentation
**Description:** Fixes infinite recursion issue in JSON schema
dereferencing when objects contain both $ref and other properties (e.g.,
nullable, description, additionalProperties). This was causing Apollo
MCP server schemas to hang indefinitely during tool binding.
**Problem:**
- Commit fb5da8384 changed the condition from `set(obj.keys()) ==
{"$ref"}` to `"$ref" in set(obj.keys())`
- This caused objects with $ref + other properties to be treated as pure
$ref nodes
- Result: other properties were lost and infinite recursion occurred
with complex schemas
**Solution:**
- Restore pure $ref detection for objects with only $ref key
- Add proper handling for mixed $ref objects that preserves all
properties
- Merge resolved reference content with other properties
- Maintain cycle detection to prevent infinite recursion
**Impact:**
- Fixes Apollo MCP server schema integration
- Resolves tool binding infinite recursion with complex GraphQL schemas
- Preserves backward compatibility with existing functionality
- No performance impact - actually improves handling of complex schemas
**Issue:** Fixes#32511
**Dependencies:** None
**Testing:**
- Added comprehensive unit tests covering mixed $ref scenarios
- All existing tests pass (1326 passed, 0 failed)
- Tested with realistic Apollo GraphQL schemas
- Stress tested with 100 iterations of complex schemas
**Verification:**
- ✅ `make format` - All files properly formatted
- ✅ `make lint` - All linting checks pass
- ✅ `make test` - All 1326 unit tests pass
- ✅ No breaking changes - full backwards compatibility maintained
---------
Co-authored-by: Marcus <marcus@Marcus-M4-MAX.local>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
On Friday, October 10th, the moonshotai/kimi-k2-instruct model will be
decommissioned in favor of the latest version,
moonshotai/kimi-k2-instruct-0905.
Until then, requests to moonshotai/kimi-k2-instruct will automatically
be routed to moonshotai/kimi-k2-instruct-0905.
# Description
This PR fixes a bug in _recursive_set_additional_properties_false used
in function_calling.convert_to_openai_function.
Previously, schemas with "additionalProperties=True" were not correctly
overridden when strict validation was expected, which could lead to
invalid OpenAI function schemas.
The updated implementation ensures that:
- Any schema with "additionalProperties" already set will now be forced
to False under strict mode.
- Recursive traversal of properties, items, and anyOf is preserved.
- Function signature remains unchanged for backward compatibility.
# Issue
When using tool calling in OpenAI structured output strict mode
(strict=True), 400: "Invalid schema for response_format XXXXX
'additionalProperties' is required to be supplied and to be false" error
raises for the parameter that contains dict type. OpenAI requires
additionalProperties to be set to False.
Some PRs try to resolved the issue.
- PR #25169 introduced _recursive_set_additional_properties_false to
recursively set additionalProperties=False.
- PR #26287 fixed handling of empty parameter tools for OpenAI function
generation.
- PR #30971 added support for Union type arguments in strict mode of
OpenAI function calling / structured output.
Despite these improvements, since Pydantic 2.11, it will always add
`additionalProperties: True` for arbitrary dictionary schemas dict or
Any (https://pydantic.dev/articles/pydantic-v2-11-release#changes).
Schemas that already had additionalProperties=True in such cases were
not being overridden, which this PR addresses to ensure strict mode
behaves correctly in all cases.
# Dependencies
No Changes
---------
Co-authored-by: Zhong, Yu <yzhong@freewheel.com>
This PR adds a new cookbook demonstrating how to build a RAG pipeline
with LangChain and track + evaluate it using MLflow.
Currently not much documentation on LangChain MLflow integration, hope
this can help folks trying to monitor and evaluate their LangChain
applications.
- ArXiv document loader
- In Memory vector store
- LCEL rag pipeline
- MLflow tracing
- MLflow evaluation
Issue:
N/A
Dependencies:
N/A
**Description:**
Updates the Confident AI integration documentation to use modern
patterns and improve code quality. This change:
- Replaces deprecated `DeepEvalCallbackHandler` with the new
`CallbackHandler` from `deepeval.integrations.langchain`
- Updates installation and authentication instructions to match current
best practices
- Adds modern integration examples using LangChain's latest patterns
- Removes deprecated metrics and outdated code examples
- Updates code samples to follow current best practices
The changes make the documentation more maintainable and ensure users
follow the recommended integration patterns.
**Issue:** Fixes#32444
**Dependencies:**
- deepeval
- langchain
- langchain-openai
**Twitter handle:** @Muwinuddin
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Description:
Added "Method Two: Quick Setup (Linux)" section to prerequisites,
providing a curl-based installation method for deploying JaguarDB
without Docker. Retained original Docker setup instructions for
flexibility.
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**
- **Description:** Aerospike Vector Store has been retired. It is no
longer supported so It should no longer be documented on the Langchain
site.
- **Add tests and docs**: Removes docs for retired Aerospike vector
store.
- **Lint and test**: NA
Added a short section to the Weaviate integration docs showing how to
connect to an existing collection (reuse an index) with
`WeaviateVectorStore`. This helps clarify required parameters
(`index_name`, `text_key`) when loading a pre-existing store, which was
previously missing.
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**
### Description
Added a short section to the Weaviate integration docs showing how to
connect to an existing collection (reuse an index) with
`WeaviateVectorStore`. This helps clarify required parameters
(`index_name`, `text_key`) when loading a pre-existing store, which was
previously missing.
### Issue
Fixeslangchain-ai/langchain-weaviate#197
### Dependencies
None
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**
- [x] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
- Examples:
- feat(core): add multi-tenant support
- fix(cli): resolve flag parsing error
- docs(openai): update API usage examples
- Allowed `{TYPE}` values:
- feat, fix, docs, style, refactor, perf, test, build, ci, chore,
revert, release
- Allowed `{SCOPE}` values (optional):
- core, cli, langchain, standard-tests, docs, anthropic, chroma,
deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama,
openai, perplexity, prompty, qdrant, xai
- Note: the `{DESCRIPTION}` must not start with an uppercase letter.
- Once you've written the title, please delete this checklist item; do
not include it in the PR.
- [x] **PR message**:
- **Description:** Fixing the import path for `WatsonxToolkit` in
examples after releasing `lnagchain-ibm==0.3.17`
- [ ] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
### Description
This PR is primarily aimed at updating some usage methods in the
`modelscope.mdx` file.
Specifically, it changes from `ModelScopeLLM` to `ModelScopeEndpoint`.
### Relevant PR
The relevant PR link is:
https://github.com/langchain-ai/langchain/pull/28941
**Description:**
Raise a more descriptive OutputParserException when JSON parsing results
in a non-dict type. This improves debugging and aligns behavior with
expectations when using expected_keys.
**Issue:**
Fixes#32233
**Twitter handle:**
@yashvtobre
**Testing:**
- Ran make format and make lint from the root directory; both passed
cleanly.
- Attempted make test but no such target exists in the root Makefile.
- Executed tests directly via pytest targeting the relevant test file,
confirming all tests pass except for unrelated async test failures
outside the scope of this change.
**Notes:**
- No additional dependencies introduced.
- Changes are backward compatible and isolated within the output parser
module.
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
- **Description:** Currently,
`langchain_core.runnables.graph_mermaid.py` is hardcoded to use
mermaid.ink to render graph diagrams. It would be nice to allow users to
specify a custom URL, e.g. for self-hosted instances of the Mermaid
server.
- **Issue:** [Langchain Forum: allow custom mermaid API
URL](https://forum.langchain.com/t/feature-request-allow-custom-mermaid-api-url/1472)
- **Dependencies:** None
- [X] **Add tests and docs**: Added unit tests using mock requests.
- [X] **Lint and test**: Run `make format`, `make lint` and `make test`.
Minimal example using the feature:
```python
import os
import operator
from pathlib import Path
from typing import Any, Annotated, TypedDict
from langgraph.graph import StateGraph
class State(TypedDict):
messages: Annotated[list[dict[str, Any]], operator.add]
def hello_node(state: State) -> State:
return {"messages": [{"role": "assistant", "content": "pong!"}]}
builder = StateGraph(State)
builder.add_node("hello_node", hello_node)
builder.add_edge("__start__", "hello_node")
builder.add_edge("hello_node", "__end__")
graph = builder.compile()
# Run graph
output = graph.invoke({"messages": [{"role": "user", "content": "ping?"}]})
# Draw graph
Path("graph.png").write_bytes(graph.get_graph().draw_mermaid_png(base_url="https://custom-mermaid.ink"))
```
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
- Beta isn't needed for search result tests anymore
- Add TODO for other tests to come back when generally available
- Regenerate remote MCP snapshot after some testing (now the same, but
fresher)
- Bump deps
This pull request introduces a failing unit test to reproduce the bug
reported in issue #32028.
The test asserts the expected behavior: `BaseCallbackManager.merge()`
should combine `handlers` and `inheritable_handlers` independently,
without mixing them. This test will fail on the current codebase and is
intended to guide the fix and prevent future regressions.
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
The Ollama chat model adapter does not support all of the possible
message content formats. That leads to Ollama model adapter crashing on
some messages from different models (e.g. Gemini 2.5 Flash).
These changes should fix one known scenario - when `content` is a list
containing a string.
This allows to use PEP604 syntax for `ToolNode` error handlers
```python
def error_handler(e: ValueError | ToolException) -> str:
return "error"
ToolNode(my_tool, handle_tool_errors=error_handler).invoke(...)
```
Without this change, this fails with `AttributeError: 'types.UnionType'
object has no attribute '__mro__'`
This is better than using a subclass as returning a `property` works
with `ClassWithBetaMethods.beta_property.__doc__`
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Added an id field to the Document passed to filter for
InMemoryVectorStore similarity search. This allows filtering by Document
id and brings the input to the filter in line with the result returned
by the vector similarity search.
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
- stars badge redundant (look at the top of the page)
- remove version badge since we have many pkgs (and it was only showing
core) -- also, just look at the releases tab to the right of the readme
- **Description:** The vectorstore standard-test mistakenly assumes that
the store's `get_by_ids` respects the order of the provided `ids`. This
is not the case (as the base class docstring states). This PR fixes
those tests that would fail otherwise (see issue #32820 for details,
repro and all). Fixes#32820
- **Issue:** Fixes#32820
- **Dependencies:** none
Co-authored-by: Stefano Lottini <stefano.lottini@ibm.com>
## Overview
Adding new `AgentMiddleware` primitive that supports `before_model`,
`after_model`, and `prepare_model_request` hooks.
This is very exciting! It makes our `create_agent` prebuilt much more
extensible + capable. Still in alpha and subject to change.
This is different than the initial
[implementation](https://github.com/langchain-ai/langgraph/tree/nc/25aug/agent)
in that it:
* Fills in gaps w/ missing features, for ex -- new structured output,
optionality of tools + system prompt, sync and async model requests,
provider builtin tools
* Exposes private state extensions for middleware, enabling things like
model call tracking, etc
* Middleware can register tools
* Uses a `TypedDict` for `AgentState` -- dataclass subclassing is tricky
w/ required values + required decorators
* Addition of `model_settings` to `ModelRequest` so that we can pass
through things to bind (like cache kwargs for anthropic middleware)
## TODOs
### top prio
- [x] add middleware support to existing agent
- [x] top prio middlewares
- [x] summarization node
- [x] HITL
- [x] prompt caching
other ones
- [x] model call limits
- [x] tool calling limits
- [ ] usage (requires output state)
### secondary prio
- [x] improve typing for state updates from middleware (not working
right now w/ simple `AgentUpdate` and `AgentJump`, at least in Python)
- [ ] add support for public state (input / output modifications via
pregel channel mods) -- to be tackled in another PR
- [x] testing!
### docs
See https://github.com/langchain-ai/docs/pull/390
- [x] high level docs about middleware
- [x] summarization node
- [x] HITL
- [x] prompt caching
## open questions
Lots of open questions right now, many of them inlined as comments for
the short term, will catalog some more significant ones here.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
**Description:**
Remove a character in tool_calling.ipynb that causes a grammatical error
Verification: Local docs build passed after fix
**Issue:**
None (direct hotfix for rendering issue identified during documentation
review)
**Dependencies:**
None
**Description:** This PR fixes the broken Anthropic model example in the
documentation introduction page and adds a comment field to display
model version warnings in code blocks. The changes ensure that users can
successfully run the example code and are reminded to check for the
latest model versions.
**Issue:** https://github.com/langchain-ai/langchain/issues/32806
**Changes made:**
- Update Anthropic model from broken "claude-3-5-sonnet-latest" to
working "claude-3-7-sonnet-20250219"
- Add comment field to display model version warnings in code blocks
- Improve user experience by providing working examples and version
guidance
**Dependencies:** None required
Fixes#32747
SpaCy integration test fixture was trying to use pip to download the
SpaCy language model (`en_core_web_sm`), but uv-managed environments
don't include pip by default. Fail test if not installed as opposed to
downloading.
Removed a period in bulleted list for consistency
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**
- [ ] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
- Examples:
- feat(core): add multi-tenant support
- fix(cli): resolve flag parsing error
- docs(openai): update API usage examples
- Allowed `{TYPE}` values:
- feat, fix, docs, style, refactor, perf, test, build, ci, chore,
revert, release
- Allowed `{SCOPE}` values (optional):
- core, cli, langchain, standard-tests, docs, anthropic, chroma,
deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama,
openai, perplexity, prompty, qdrant, xai
- Note: the `{DESCRIPTION}` must not start with an uppercase letter.
- Once you've written the title, please delete this checklist item; do
not include it in the PR.
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change. Include a [closing
keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword)
if applicable to a relevant issue.
- **Issue:** the issue # it fixes, if applicable (e.g. Fixes#123)
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
Completed the sentence by adding a period ".", in sync with other points
>> Click "Propose changes"
to
>> Click "Propose changes".
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**
- [ ] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
- Examples:
- feat(core): add multi-tenant support
- fix(cli): resolve flag parsing error
- docs(openai): update API usage examples
- Allowed `{TYPE}` values:
- feat, fix, docs, style, refactor, perf, test, build, ci, chore,
revert, release
- Allowed `{SCOPE}` values (optional):
- core, cli, langchain, standard-tests, docs, anthropic, chroma,
deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama,
openai, perplexity, prompty, qdrant, xai
- Note: the `{DESCRIPTION}` must not start with an uppercase letter.
- Once you've written the title, please delete this checklist item; do
not include it in the PR.
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change. Include a [closing
keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword)
if applicable to a relevant issue.
- **Issue:** the issue # it fixes, if applicable (e.g. Fixes#123)
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
Update `langchain-core` dependency min from `>=0.3.63` to `>=0.3.75`.
### Motivation
- We located the `langchain-core` package locally in the monorepo and
need to align `langchain-tests` with the new minimum version.
### Overview
Preparing the `1.0.0a1` release of `langchain-tests` to align with
`langchain-core` version `1.0.0a1`.
### Changes
- Bump package version to `1.0.0a1`
- Relax `langchain-core` requirement from `<1.0.0,>=0.3.63` to
`<2.0.0,>=0.3.63`
### Motivation
All main LangChain packages are now publishing `1.0.0a` prereleases.
`langchain-tests` needs a matching prerelease so downstreams can install
tests alongside the 1.0 series without conflicts.
### Tests
- Verified installation and tests against both `0.3.75` and `1.0.0a1`.
Description:
Added the content= keyword when creating SystemMessage and HumanMessage
in the messages list, making it consistent with the API reference.
### Summary
This PR updates the sentence on the "How-to guides" landing page to
replace smart (curly) quotes with straight quotes in the phrase:
> "How do I...?"
### Why This Change?
- Ensures formatting consistency across documentation
- Avoids encoding or rendering issues with smart quotes
- Matches standard Markdown and inline code formatting
This is a small change, but improves clarity and polish on a key landing
page.
Change "Linkedin" to "LinkedIn" to be consistent with LinkedIn's
spelling.
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**
- [x] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
Adding `create_react_agent` and introducing `langchain.agents`!
## Enhanced Structured Output
`create_react_agent` supports coercion of outputs to structured data
types like `pydantic` models, dataclasses, typed dicts, or JSON schemas
specifications.
### Structural Changes
In langgraph < 1.0, `create_react_agent` implemented support for
structured output via an additional LLM call to the model after the
standard model / tool calling loop finished. This introduced extra
expense and was unnecessary.
This new version implements structured output support in the main loop,
allowing a model to choose between calling tools or generating
structured output (or both).
The same basic pattern for structured output generation works:
```py
from langchain.agents import create_react_agent
from langchain_core.messages import HumanMessage
from pydantic import BaseModel
class Weather(BaseModel):
temperature: float
condition: str
def weather_tool(city: str) -> str:
"""Get the weather for a city."""
return f"it's sunny and 70 degrees in {city}"
agent = create_react_agent("openai:gpt-4o-mini", tools=[weather_tool], response_format=Weather)
print(repr(result["structured_response"]))
#> Weather(temperature=70.0, condition='sunny')
```
### Advanced Configuration
The new API exposes two ways to configure how structured output is
generated. Under the hood, LangChain will attempt to pick the best
approach if not explicitly specified. That is, if provider native
support is available for a given model, that takes priority over
artificial tool calling.
1. Artificial tool calling (the default for most models)
LangChain generates a tool (or tools) under the hood that match the
schema of your response format. When the model calls those tools,
LangChain coerces the args to the desired format. Note, LangChain does
not validate outputs adhering to JSON schema specifications.
<details>
<summary>Extended example</summary>
```py
from langchain.agents import create_react_agent
from langchain_core.messages import HumanMessage
from langchain.agents.structured_output import ToolStrategy
from pydantic import BaseModel
class Weather(BaseModel):
temperature: float
condition: str
def weather_tool(city: str) -> str:
"""Get the weather for a city."""
return f"it's sunny and 70 degrees in {city}"
agent = create_react_agent(
"openai:gpt-4o-mini",
tools=[weather_tool],
response_format=ToolStrategy(
schema=Weather, tool_message_content="Final Weather result generated"
),
)
result = agent.invoke({"messages": [HumanMessage("What's the weather in Tokyo?")]})
for message in result["messages"]:
message.pretty_print()
"""
================================ Human Message =================================
What's the weather in Tokyo?
================================== Ai Message ==================================
Tool Calls:
weather_tool (call_Gg933BMHMwck50Q39dtBjXm7)
Call ID: call_Gg933BMHMwck50Q39dtBjXm7
Args:
city: Tokyo
================================= Tool Message =================================
Name: weather_tool
it's sunny and 70 degrees in Tokyo
================================== Ai Message ==================================
Tool Calls:
Weather (call_9xOkYUM7PuEXl9DQq9sWGv5l)
Call ID: call_9xOkYUM7PuEXl9DQq9sWGv5l
Args:
temperature: 70
condition: sunny
================================= Tool Message =================================
Name: Weather
Final Weather result generated
"""
print(repr(result["structured_response"]))
#> Weather(temperature=70.0, condition='sunny')
```
</details>
2. Provider implementations (limited to OpenAI, Groq)
Some providers support structured output generating directly. For those
cases, we offer the `ProviderStrategy` hint:
<details>
<summary>Extended example</summary>
```py
from langchain.agents import create_react_agent
from langchain_core.messages import HumanMessage
from langchain.agents.structured_output import ProviderStrategy
from pydantic import BaseModel
class Weather(BaseModel):
temperature: float
condition: str
def weather_tool(city: str) -> str:
"""Get the weather for a city."""
return f"it's sunny and 70 degrees in {city}"
agent = create_react_agent(
"openai:gpt-4o-mini",
tools=[weather_tool],
response_format=ProviderStrategy(Weather),
)
result = agent.invoke({"messages": [HumanMessage("What's the weather in Tokyo?")]})
for message in result["messages"]:
message.pretty_print()
"""
================================ Human Message =================================
What's the weather in Tokyo?
================================== Ai Message ==================================
Tool Calls:
weather_tool (call_OFJq1FngIXS6cvjWv5nfSFZp)
Call ID: call_OFJq1FngIXS6cvjWv5nfSFZp
Args:
city: Tokyo
================================= Tool Message =================================
Name: weather_tool
it's sunny and 70 degrees in Tokyo
================================== Ai Message ==================================
{"temperature":70,"condition":"sunny"}
Weather(temperature=70.0, condition='sunny')
"""
print(repr(result["structured_response"]))
#> Weather(temperature=70.0, condition='sunny')
```
Note! The final tool message has the custom content provided by the dev.
</details>
Prompted output was previously supported and is no longer supported via
the `response_format` argument to `create_react_agent`. If there's
significant demand for this, we'd be happy to engineer a solution.
## Error Handling
`create_react_agent` now exposes an API for managing errors associated
with structured output generation. There are two common problems with
structured output generation (w/ artificial tool calling):
1. **Parsing error** -- the model generates data that doesn't match the
desired structure for the output
2. **Multiple tool calls error** -- the model generates 2 or more tool
calls associated with structured output schemas
A developer can control the desired behavior for this via the
`handle_errors` arg to `ToolStrategy`.
<details>
<summary>Extended example</summary>
```py
from langchain_core.messages import HumanMessage
from pydantic import BaseModel
from langchain.agents import create_react_agent
from langchain.agents.structured_output import StructuredOutputValidationError, ToolStrategy
class Weather(BaseModel):
temperature: float
condition: str
def weather_tool(city: str) -> str:
"""Get the weather for a city."""
return f"it's sunny and 70 degrees in {city}"
def handle_validation_error(error: Exception) -> str:
if isinstance(error, StructuredOutputValidationError):
return (
f"Please call the {error.tool_name} call again with the correct arguments. "
f"Your mistake was: {error.source}"
)
raise error
agent = create_react_agent(
"openai:gpt-5",
tools=[weather_tool],
response_format=ToolStrategy(
schema=Weather,
handle_errors=handle_validation_error,
),
)
```
</details>
## Error Handling for Tool Calling
Tools fail for two main reasons:
1. **Invocation failure** -- the args generated by the model for the
tool are incorrect (missing, incompatible data types, etc)
2. **Execution failure** -- the tool execution itself fails due to a
developer error, network error, or some other exception.
By default, when tool **invocation** fails, the react agent will return
an artificial `ToolMessage` to the model asking it to correct its
mistakes and retry.
Now, when tool **execution** fails, the react agent raises the
`ToolException` by default instead of asking the model to retry. This
helps to avoid looping that should be avoided due to the aforementioned
issues.
Developers can configure their desired behavior for retries / error
handling via the `handle_tool_errors` arg to `ToolNode`.
## Pre-Bound Models
`create_react_agent` no longer supports inputs to `model` that have been
pre-bound w/ tools or other configuration. To properly support
structured output generation, the agent itself needs the power to bind
tools + structured output kwargs.
This also makes the devx cleaner - it's always expected that `model` is
an instance of `BaseChatModel` (or `str` that we coerce into a chat
model instance).
Dynamic model functions can return a pre-bound model **IF** structured
output is not also used. Dynamic model functions can then bind tools /
structured output logic.
## Import Changes
Users should now use `create_react_agent` from `langchain.agents`
instead of `langgraph.prebuilts`.
Other imports have a similar migration path, `ToolNode` and `AgentState`
for example.
* `chat_agent_executor.py` -> `react_agent.py`
Some notes:
1. Disabled blockbuster + some linting in `langchain/agents` -- beyond
ideal, but necessary to get this across the line for the alpha. We
should re-enable before official release.
- **Description:** Updated Docker command to use ClickHouse 25.7 (has
`vector_similarity` index support). Added `CLICKHOUSE_SKIP_USER_SETUP=1`
env param to [bypass default user
setup](https://clickhouse.com/docs/install/docker#managing-default-user)
and allow external network access. There was also a bug where if you try
to access results using `similarity_search_with_relevance_scores`, they
need to unpacked first.
- **Issue:** Fixes#32094 if someone following tutorial with default
Clickhouse configurations.
# Description
Updated documentation to reflect Microsoft’s rebranding of Azure AI
Studio to Azure AI Foundry. This ensures consistency with current Azure
terminology across the docs.
# Issue
N/A
# Dependencies
None
The async version of the test should use the `ayield_keys` method
instead of `yield_keys`.
Otherwise tools such as `blockbuster` may trigger on a blocking call.
**Description:**
Fixed corrupted text in the code cell output of the documentation
notebook. The code cell itself was correct, but the saved output
contained garbage text.
**Issue:**
The saved output in the documentation notebook contained garbage/typo
text in the table name.
**Dependencies:**
None
Having vercel attempt to deploy on each commit (even if unrelated to
docs) was getting annoying. Options:
- `[skip-preview]`
- `[no-preview]`
- `[skip-deploy]`
Full example: `fix(core): resolve memory leak [no-preview]`
* Create usage metadata on
[`message_delta`](https://docs.anthropic.com/en/docs/build-with-claude/streaming#event-types)
instead of at the beginning. Consequently, token counts are not included
during streaming but instead at the end. This allows for accurate
reporting of server-side tool usage (important for billing)
* Add some clarifying comments
* Fix some outstanding Pylance warnings
* Remove unnecessary `text` popping in thinking blocks
* Also now correctly reports `input_cache_read`/`input_cache_creation`
as a result
When citations are returned from streaming, they include a `file_id:
null` field in their `content_block_location` structure.
When these citations are passed back to the API in subsequent messages,
the API rejects them with "Extra inputs are not permitted" for the
`file_id` field.
**Description:**
Corrected LangGraph documentation link (changed to “guides”), and added
a link to LangGraph JS how-to guides for clarity.
**Issue:**
N/A
**Dependencies:**
None
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
The appropriate `ToolNode` attribute for error handling is called
`handle_tool_errors` instead of `handle_tool_error`.
For further info see [ToolNode source code in
LangGraph](https://github.com/langchain-ai/langgraph/blob/main/libs/prebuilt/langgraph/prebuilt/tool_node.py#L255)
**Twitter handle:** gitaroktato
- [x] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
## Description
This PR adds support for custom header patterns in
`MarkdownHeaderTextSplitter`, allowing users to define non-standard
Markdown header formats (like `**Header**`) and specify their hierarchy
levels.
**Issue:** Fixes#22738
**Dependencies:** None - this change has no new dependencies
**Key Changes:**
- Added optional `custom_header_patterns` parameter to support
non-standard header formats
- Enable splitting on patterns like `**Header**` and `***Header***`
- Maintain full backward compatibility with existing usage
- Added comprehensive tests for custom and mixed header scenarios
## Example Usage
```python
from langchain_text_splitters import MarkdownHeaderTextSplitter
headers_to_split_on = [
("**", "Chapter"),
("***", "Section"),
]
custom_header_patterns = {
"**": 1, # Level 1 headers
"***": 2, # Level 2 headers
}
splitter = MarkdownHeaderTextSplitter(
headers_to_split_on=headers_to_split_on,
custom_header_patterns=custom_header_patterns,
)
# Now **Chapter 1** is treated as a level 1 header
# And ***Section 1.1*** is treated as a level 2 header
```
## Testing
- ✅ Added unit tests for custom header patterns
- ✅ Added tests for mixed standard and custom headers
- ✅ All existing tests pass (backward compatibility maintained)
- ✅ Linting and formatting checks pass
---
The implementation provides a flexible solution while maintaining the
simplicity of the existing API. Users can continue using the splitter
exactly as before, with the new functionality being entirely opt-in
through the `custom_header_patterns` parameter.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Claude <noreply@anthropic.com>
Supersedes #32461
Fixed incorrect input token reporting during streaming when tools are
used. Previously, input tokens were counted at `message_start` before
tool execution, leading to inaccurate counts. Now input tokens are
properly deferred until `message_delta` (completion), aligning with
Anthropic's billing model and SDK expectations.
**Before Fix:**
- Streaming with tools: Input tokens = 0 ❌
- Non-streaming with tools: Input tokens = 472 ✅
**After Fix:**
- Streaming with tools: Input tokens = 472 ✅
- Non-streaming with tools: Input tokens = 472 ✅
Aligns with Anthropic's SDK expectations. The SDK handles input token
updates in `message_delta` events:
```python
# https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/lib/streaming/_messages.py
if event.usage.input_tokens is not None:
current_snapshot.usage.input_tokens = event.usage.input_tokens
```
Supersedes #32544
Changes to the `trimmer` behavior resulted in the call `"What math
problem was asked?"` to no longer see the relevant query due to the
number of the queries' tokens. Adjusted to not trigger trimming the
relevant part of the message history. Also, add print to the trimmer to
increase observability on what is leaving the context window.
Add note to trimming tut & format links as inline
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**
- [x] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
- Examples:
- feat(core): add multi-tenant support
- fix(cli): resolve flag parsing error
- docs(openai): update API usage examples
- Allowed `{TYPE}` values:
- feat, fix, docs, style, refactor, perf, test, build, ci, chore,
revert, release
- Allowed `{SCOPE}` values (optional):
- core, cli, langchain, standard-tests, docs, anthropic, chroma,
deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama,
openai, perplexity, prompty, qdrant, xai
- Note: the `{DESCRIPTION}` must not start with an uppercase letter.
- Once you've written the title, please delete this checklist item; do
not include it in the PR.
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change. Include a [closing
keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword)
if applicable to a relevant issue.
- **Issue:** the issue # it fixes, if applicable (e.g. Fixes#123)
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Enhance the integrations table by adding the `js:
'@langchain/community'` reference for several packages and updating the
titles of specific integrations to avoid improper capitalization
Supersedes #32408
Description:
This PR ensures that tool calls without explicitly provided `args` will
default to an empty dictionary (`{}`), allowing tools with no parameters
(e.g. `def foo() -> str`) to be registered and invoked without
validation errors. This change improves compatibility with agent
frameworks that may omit the `args` field when generating tool calls.
Issue:
See
[langgraph#5722](https://github.com/langchain-ai/langgraph/issues/5722)
–
LangGraph currently emits tool calls without `args`, which leads to
validation errors
when tools with no parameters are invoked. This PR ensures compatibility
by defaulting
`args` to `{}` when missing.
Dependencies:
None
---------
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**
- [ ] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
- Examples:
- feat(core): add multi-tenant support
- fix(cli): resolve flag parsing error
- docs(openai): update API usage examples
- Allowed `{TYPE}` values:
- feat, fix, docs, style, refactor, perf, test, build, ci, chore,
revert, release
- Allowed `{SCOPE}` values (optional):
- core, cli, langchain, standard-tests, docs, anthropic, chroma,
deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama,
openai, perplexity, prompty, qdrant, xai
- Note: the `{DESCRIPTION}` must not start with an uppercase letter.
- Once you've written the title, please delete this checklist item; do
not include it in the PR.
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change. Include a [closing
keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword)
if applicable to a relevant issue.
- **Issue:** the issue # it fixes, if applicable (e.g. Fixes#123)
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
---------
Signed-off-by: jitokim <pigberger70@gmail.com>
Co-authored-by: jito <pigberger70@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
**Description**
Corrected a typo in the Ollama chatbot example output in
`docs/docs/integrations/chat/ollama.ipynb` where `"got-oss"` was
mistakenly used instead of `"gpt-oss"`.
No functional changes to code; documentation-only update.
All notebook outputs were cleared to keep the diff minimal.
**Issue**
N/A
**Dependencies**
None
**Twitter handle**
N/A
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**
- [x] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
fix#30146
- [x] **Add tests and docs**: If you're adding a new integration, you
must include:
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
```python
from langchain_anthropic import ChatAnthropic
llm = ChatAnthropic(model="claude-3-5-haiku-latest")
caching_llm = llm.bind(cache_control={"type": "ephemeral"})
caching_llm.invoke(
[
HumanMessage("..."),
AIMessage("..."),
HumanMessage("..."), # <-- final message / content block gets cache annotation
]
)
```
Potentially useful given's Anthropic's [incremental
caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#continuing-a-multi-turn-conversation)
capabilities:
> During each turn, we mark the final block of the final message with
cache_control so the conversation can be incrementally cached. The
system will automatically lookup and use the longest previously cached
prefix for follow-up messages.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
This commit removes redundant integration info from details page,
additionally, changing reference from "DigitalOcean GradientAI" to
"DigitalOcean Gradient™ AI" and updating the setup instructions
accordingly.
**Description:**
Two broken links were reported by another LangChain employee. This PR
fixes those links.
Fixed and tested locally.
**Dependencies:**
None
This PR adds documentation for integrating [TrueFoundry’s AI
Gateway](https://www.truefoundry.com/ai-gateway) with Langfuse using the
Langraph OpenAI SDK.
The integration sends requests through TrueFoundry’s AI Gateway for
unified governance, observability, and routing, while Langraph runs on
the client side to capture execution traces and telemetry.
- Issue: N/A
- Dependencies: None
- Twitter - https://x.com/truefoundry
tests - Not applicable
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- **Description:** Integrated the Scrapeless package to enable Langchain
users to seamlessly incorporate Scrapeless into their agents.
- **Dependencies:** None
- **Twitter handle:** [Scrapelessteam](https://x.com/Scrapelessteam)
- [x] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
# Description
This PR updates the docs for the
[langchain-anchorbrowser](https://pypi.org/project/langchain-anchorbrowser/)
package. It adds a few tools
[Anchor Browser](https://anchorbrowser.io/?utm=langchain) is the
platform for AI Agentic browser automation, which solves the challenge
of automating workflows for web applications that lack APIs or have
limited API coverage. It simplifies the creation, deployment, and
management of browser-based automations, transforming complex web
interactions into simple API endpoints.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
This PR introduces a new Google partner guide for MCP Toolbox. The
primary goal of this new documentation is to enhance the discoverability
of MCP Toolbox for developers working within the Google ecosystem,
providing them with a clear and direct path to using our tools.
> [!IMPORTANT]
> This PR contains link to a page which is added in #32344. This will
cause deployment failure until that PR is merged.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
This PR introduces a new integration guide for MCP Toolbox. The primary
goal of this new documentation is to enhance the discoverability of MCP
Toolbox for developers working within the LangChain ecosystem, providing
them with a clear and direct path to using our tools.
This approach was chosen to provide users with a practical, hands-on
example that they can easily follow.
> [!NOTE]
> The page added in this PR is linked to from a section in Google
partners page added in #32356.
---------
Co-authored-by: Lauren Hirata Singh <lauren@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
In [Rag Part 1
Tutorial](https://python.langchain.com/docs/tutorials/rag/), when QDrant
vector store is selected, the sample code does not work
It fails with error `ValueError: Collection test not found`
So, this fix is creating that collection and ensuring its dimension size
is matching the selection the embedding size of the selected LLM Model
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
```messages_to_pass = [
HumanMessage(content="What's the capital of France?"),
AIMessage(content="The capital of France is Paris."),
HumanMessage(content="And what about Germany?")
]
formatted_prompt = prompt_template.invoke({"msgs": messages_to_pass})
print(formatted_prompt)```
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
**Description:**
I've added a small clarification to the chatbot tutorial. The tutorial
mentions setting the `LANGSMITH_API_KEY`, but doesn't explain how a new
user can get the key from the website. This change adds a brief note to
guide them to the Settings page.
P.S. This is my first pull request, so I'm excited to learn and
contribute!
**Issue:**
N/A
**Dependencies:**
N/A
**Twitter handle:**
@sohamactive
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Closes#32320
This PR updates the `langgraph_agentic_rag.ipynb` notebook to clarify
that LangGraph does not automatically prepend a `SystemMessage`. A
markdown note and an inline Python comment have been added to guide
users to explicitly include a `SystemMessage` when needed.
This improves documentation for developers working with LangGraph-based
agents and avoids confusion about system-level behavior not being
applied.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Bumps
[actions/download-artifact](https://github.com/actions/download-artifact)
from 4 to 5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/download-artifact/releases">actions/download-artifact's
releases</a>.</em></p>
<blockquote>
<h2>v5.0.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Update README.md by <a
href="https://github.com/nebuk89"><code>@nebuk89</code></a> in <a
href="https://redirect.github.com/actions/download-artifact/pull/407">actions/download-artifact#407</a></li>
<li>BREAKING fix: inconsistent path behavior for single artifact
downloads by ID by <a
href="https://github.com/GrantBirki"><code>@GrantBirki</code></a> in <a
href="https://redirect.github.com/actions/download-artifact/pull/416">actions/download-artifact#416</a></li>
</ul>
<h2>v5.0.0</h2>
<h3>🚨 Breaking Change</h3>
<p>This release fixes an inconsistency in path behavior for single
artifact downloads by ID. <strong>If you're downloading single artifacts
by ID, the output path may change.</strong></p>
<h4>What Changed</h4>
<p>Previously, <strong>single artifact downloads</strong> behaved
differently depending on how you specified the artifact:</p>
<ul>
<li><strong>By name</strong>: <code>name: my-artifact</code> → extracted
to <code>path/</code> (direct)</li>
<li><strong>By ID</strong>: <code>artifact-ids: 12345</code> → extracted
to <code>path/my-artifact/</code> (nested)</li>
</ul>
<p>Now both methods are consistent:</p>
<ul>
<li><strong>By name</strong>: <code>name: my-artifact</code> → extracted
to <code>path/</code> (unchanged)</li>
<li><strong>By ID</strong>: <code>artifact-ids: 12345</code> → extracted
to <code>path/</code> (fixed - now direct)</li>
</ul>
<h4>Migration Guide</h4>
<h5>✅ No Action Needed If:</h5>
<ul>
<li>You download artifacts by <strong>name</strong></li>
<li>You download <strong>multiple</strong> artifacts by ID</li>
<li>You already use <code>merge-multiple: true</code> as a
workaround</li>
</ul>
<h5>⚠️ Action Required If:</h5>
<p>You download <strong>single artifacts by ID</strong> and your
workflows expect the nested directory structure.</p>
<p><strong>Before v5 (nested structure):</strong></p>
<pre lang="yaml"><code>- uses: actions/download-artifact@v4
with:
artifact-ids: 12345
path: dist
# Files were in: dist/my-artifact/
</code></pre>
<blockquote>
<p>Where <code>my-artifact</code> is the name of the artifact you
previously uploaded</p>
</blockquote>
<p><strong>To maintain old behavior (if needed):</strong></p>
<pre lang="yaml"><code></tr></table>
</code></pre>
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="634f93cb29"><code>634f93c</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/download-artifact/issues/416">#416</a>
from actions/single-artifact-id-download-path</li>
<li><a
href="b19ff43027"><code>b19ff43</code></a>
refactor: resolve download path correctly in artifact download tests
(mainly ...</li>
<li><a
href="e262cbee4a"><code>e262cbe</code></a>
bundle dist</li>
<li><a
href="bff23f9308"><code>bff23f9</code></a>
update docs</li>
<li><a
href="fff8c148a8"><code>fff8c14</code></a>
fix download path logic when downloading a single artifact by id</li>
<li><a
href="448e3f862a"><code>448e3f8</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/download-artifact/issues/407">#407</a>
from actions/nebuk89-patch-1</li>
<li><a
href="47225c44b3"><code>47225c4</code></a>
Update README.md</li>
<li>See full diff in <a
href="https://github.com/actions/download-artifact/compare/v4...v5">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
**Description:**
In the `docs/docs/how_to/structured_output.ipynb` notebook, an
`AIMessage` within the tool-calling few-shot example was missing the
`name="example_assistant"` parameter. This was inconsistent with the
other `AIMessage` instances in the same list.
This change adds the missing `name` parameter to ensure all examples in
the section are consistent, improving the clarity and correctness of the
documentation.
**Issue:** N/A
**Dependencies:** N/A
While trying the line People.schema got a warning.
```The `schema` method is deprecated; use `model_json_schema` instead```
So made the changes and now working file.
Thank you for contributing to LangChain! Follow these steps to mark your pull request as ready for review. **If any of these steps are not completed, your PR will not be considered for review.**
- [ ] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
- Examples:
- feat(core): add multi-tenant support
- fix(cli): resolve flag parsing error
- docs(openai): update API usage examples
- Allowed `{TYPE}` values:
- feat, fix, docs, style, refactor, perf, test, build, ci, chore, revert, release
- Allowed `{SCOPE}` values (optional):
- core, cli, langchain, standard-tests, docs, anthropic, chroma, deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama, openai, perplexity, prompty, qdrant, xai
- Note: the `{DESCRIPTION}` must not start with an uppercase letter.
- Once you've written the title, please delete this checklist item; do not include it in the PR.
- [ ] **PR message**: ***Delete this entire checklist*** and replace with
- **Description:** a description of the change. Include a [closing keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword) if applicable to a relevant issue.
- **Issue:** the issue # it fixes, if applicable (e.g. Fixes#123)
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, you must include:
1. A test for the integration, preferably unit tests that do not rely on network access,
2. An example notebook showing its use. It lives in `docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. **We will not consider a PR unless these three are passing in CI.** See [contribution guidelines](https://python.langchain.com/docs/contributing/) for more.
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
Description:
Corrected the guide title from "How deal with high cardinality
categoricals" to "How to deal with high-cardinality categoricals".
- Added missing "to" for grammatical correctness.
- Hyphenated "high-cardinality" for standard compound adjective usage.
Issue:
N/A
Dependencies:
None
Twitter handle:
https://x.com/mishraravibhush
**Description**
Updated the quick setup instructions for JaguarDB in the documentation.
Replaced the outdated Docker image `jaguardb/jaguardb_with_http` with
the current recommended image `jaguardb/jaguardb` for pulling and
running the server.
Not all retrievers use `k` as param name to set the number of results to
return. Even in LangChain itself. Eg:
bc4251b9e0/libs/core/langchain_core/indexing/in_memory.py (L31)
So it's helpful to be able to change it for a given retriever.
The change also adds hints to disable the tests if the retriever doesn't
support setting the param in the constructor or in the invoke method
(for instance, the `InMemoryDocumentIndex` in the link supports in the
constructor but not in the invoke method).
This change is backward compatible.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
**Description:** fix an issue I discovered when attempting to merge
messages in which one message has an `index` key in its content
dictionary and another does not.
**Description:** This PR improves the contribution setup guide by adding
comprehensive Windows-specific instructions. The changes address a
common pain point for Windows contributors who don't have `make`
installed by default, making the LangChain contribution process more
accessible across different operating systems.
The main improvements include:
- Added a dedicated "Windows Users" section with multiple installation
options for `make` (Chocolatey, Scoop, WSL)
- Provided direct `uv` commands as alternatives to all `make` commands
throughout the setup guide
- Included Windows-specific instructions for testing, formatting,
linting, and spellchecking
- Enhanced the documentation to be more inclusive for Windows developers
This change makes it easier for Windows users to contribute to LangChain
without requiring additional tool installation, while maintaining the
existing workflow for users who already have `make` available.
**Issue:** This addresses the common barrier Windows users face when
trying to contribute to LangChain due to missing `make` commands.
**Dependencies:** None required - this is purely a documentation
improvement.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
## **Description:**
Updated incorrect package names across multiple integration docs by
replacing underscores with hyphens to reflect their actual names on
PyPI. This aligns with the actual PyPI package names and prevents
potential confusion or installation issues.
## **Issue:** N/A
## **Dependencies:** None
## **Twitter handle:** N/A
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
langchain-gradientai is Digitalocean's integration with Langchain. It
will help users to build langchain applications using Digitalocean's
GradientAI platform.
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Description:
Fixed minor typos in the `google_imagen.ipynb` integration notebook
related to image generation prompt formatting. No functional changes
were made — just a documentation correction to improve clarity.
## **Description:**
Updated incorrect package names in `FeatureTables.js` by replacing
underscores with hyphens to reflect their actual names on PyPI. This
aligns with the actual PyPI package names and prevents potential
confusion or installation issues.
The following package names were corrected:
- `langchain_aws` ➝ `langchain-aws`
- `langchain_community` ➝ `langchain-community`
- `langchain_elasticsearch` ➝ `langchain-elasticsearch`
- `langchain_google_community` ➝ `langchain-google-community`
## **Issue:** N/A
## **Dependencies:** None
## **Twitter handle:** N/A
Description: Documentation is inconsistent with API docs.
Current documentation implies that to use the integration you must have
credentials configured AND store the path to a service account JSON
file.
API docs explain that you must only complete EITHER of the steps
regarding credentials.
I have updated the docs to make them consistent with the API wording.
## **Description:**
Refactored multiple entries in `kv_store_feat_table.py` to ensure that
all vector store metadata is accurate, consistent, and aligned with
LangChain's latest documentation structure and PyPI naming standards.
**Key improvements across all updated entries:**
- Updated `class` links to point to their respective **docs-based
integration pages** (e.g., `/docs/integrations/stores/...`) instead of
raw API reference URLs.
- Corrected `package` display names to use **hyphenated PyPI-compliant
names** (e.g., `langchain-astradb` instead of `langchain_astradb`).
- Updated `package` links to point to the **specific class-level API
references** (e.g., `/api_reference/.../storage/...ClassName.html`) for
precision.
These improvements enhance:
- Navigation experience for users
- Alignment with PyPI and docs naming conventions
- Clarity across LangChain’s integrations documentation
## **Issue:** N/A
## **Dependencies:** None
## **Twitter handle:** N/A
docs(alpha_vantage): add link for ALPHAVANTAGE_API_KEY generation in
integration notebook
**Description:**
This PR updates the `docs/docs/integrations/tools/alpha_vantage.ipynb`
integration notebook to help users locate the API key registration page
for Alpha Vantage. The following markdown line was added:
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
## **Description:**
This PR updates the internal documentation link for the RAG tutorials to
reflect the updated path. Previously, the link pointed to the root
`/docs/tutorials/`, which was generic. It now correctly routes to the
RAG-specific tutorial page for the following vector store docs.
1. AstraDBVectorStore
2. Clickhouse
3. CouchbaseSearchVectorStore
4. DatabricksVectorSearch
5. ElasticsearchStore
6. FAISS
7. Milvus
8. MongoDBAtlasVectorSearch
9. openGauss
10. PGVector
11. PGVectorStore
12. PineconeVectorStore
13. QdrantVectorStore
14. Redis
15. SQLServer
## **Issue:** N/A
## **Dependencies:** None
## **Twitter handle:** N/A
Fixes a streaming bug where models like Qwen3 (using OpenAI interface)
send tool call chunks with inconsistent indices, resulting in
duplicate/erroneous tool calls instead of a single merged tool call.
## Problem
When Qwen3 streams tool calls, it sends chunks with inconsistent `index`
values:
- First chunk: `index=1` with tool name and partial arguments
- Subsequent chunks: `index=0` with `name=None`, `id=None` and argument
continuation
The existing `merge_lists` function only merges chunks when their
`index` values match exactly, causing these logically related chunks to
remain separate, resulting in multiple incomplete tool calls instead of
one complete tool call.
```python
# Before fix: Results in 1 valid + 1 invalid tool call
chunk1 = AIMessageChunk(tool_call_chunks=[
{"name": "search", "args": '{"query":', "id": "call_123", "index": 1}
])
chunk2 = AIMessageChunk(tool_call_chunks=[
{"name": None, "args": ' "test"}', "id": None, "index": 0}
])
merged = chunk1 + chunk2 # Creates 2 separate tool calls
# After fix: Results in 1 complete tool call
merged = chunk1 + chunk2 # Creates 1 merged tool call: search({"query": "test"})
```
## Solution
Enhanced the `merge_lists` function in `langchain_core/utils/_merge.py`
with intelligent tool call chunk merging:
1. **Preserves existing behavior**: Same-index chunks still merge as
before
2. **Adds special handling**: Tool call chunks with
`name=None`/`id=None` that don't match any existing index are now merged
with the most recent complete tool call chunk
3. **Maintains backward compatibility**: All existing functionality
works unchanged
4. **Targeted fix**: Only affects tool call chunks, doesn't change
behavior for other list items
The fix specifically handles the pattern where:
- A continuation chunk has `name=None` and `id=None` (indicating it's
part of an ongoing tool call)
- No matching index is found in existing chunks
- There exists a recent tool call chunk with a valid name or ID to merge
with
## Testing
Added comprehensive test coverage including:
- ✅ Qwen3-style chunks with different indices now merge correctly
- ✅ Existing same-index behavior preserved
- ✅ Multiple distinct tool calls remain separate
- ✅ Edge cases handled (empty chunks, orphaned continuations)
- ✅ Backward compatibility maintained
Fixes#31511.
<!-- START COPILOT CODING AGENT TIPS -->
---
💬 Share your feedback on Copilot coding agent for the chance to win a
$200 gift card! Click
[here](https://survey.alchemer.com/s3/8343779/Copilot-Coding-agent) to
start the survey.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
## Problem
ChatLiteLLM encounters a `ValidationError` when using cache on
subsequent calls, causing the following error:
```
ValidationError(model='ChatResult', errors=[{'loc': ('generations', 0, 'type'), 'msg': "unexpected value; permitted: 'ChatGeneration'", 'type': 'value_error.const', 'ctx': {'given': 'Generation', 'permitted': ('ChatGeneration',)}}])
```
This occurs because:
1. The cache stores `Generation` objects (with `type="Generation"`)
2. But `ChatResult` expects `ChatGeneration` objects (with
`type="ChatGeneration"` and a required `message` field)
3. When cached values are retrieved, validation fails due to the type
mismatch
## Solution
Added graceful handling in both sync (`_generate_with_cache`) and async
(`_agenerate_with_cache`) cache methods to:
1. **Detect** when cached values contain `Generation` objects instead of
expected `ChatGeneration` objects
2. **Convert** them to `ChatGeneration` objects by wrapping the text
content in an `AIMessage`
3. **Preserve** all original metadata (`generation_info`)
4. **Allow** `ChatResult` creation to succeed without validation errors
## Example
```python
# Before: This would fail with ValidationError
from langchain_community.chat_models import ChatLiteLLM
from langchain_community.cache import SQLiteCache
from langchain.globals import set_llm_cache
set_llm_cache(SQLiteCache(database_path="cache.db"))
llm = ChatLiteLLM(model_name="openai/gpt-4o", cache=True, temperature=0)
print(llm.predict("test")) # Works fine (cache empty)
print(llm.predict("test")) # Now works instead of ValidationError
# After: Seamlessly handles both Generation and ChatGeneration objects
```
## Changes
- **`libs/core/langchain_core/language_models/chat_models.py`**:
- Added `Generation` import from `langchain_core.outputs`
- Enhanced cache retrieval logic in `_generate_with_cache` and
`_agenerate_with_cache` methods
- Added conversion from `Generation` to `ChatGeneration` objects when
needed
-
**`libs/core/tests/unit_tests/language_models/chat_models/test_cache.py`**:
- Added test case to validate the conversion logic handles mixed object
types
## Impact
- **Backward Compatible**: Existing code continues to work unchanged
- **Minimal Change**: Only affects cache retrieval path, no API changes
- **Robust**: Handles both legacy cached `Generation` objects and new
`ChatGeneration` objects
- **Preserves Data**: All original content and metadata is maintained
during conversion
Fixes#22389.
<!-- START COPILOT CODING AGENT TIPS -->
---
💡 You can make Copilot smarter by setting up custom instructions,
customizing its development environment and configuring Model Context
Protocol (MCP) servers. Learn more [Copilot coding agent
tips](https://gh.io/copilot-coding-agent-tips) in the docs.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
**Description:** Fixes incorrect `num_skipped` count in the LangChain
indexing API. The current implementation only counts documents that
already exist in RecordManager (cross-batch duplicates) but fails to
count documents removed during within-batch deduplication via
`_deduplicate_in_order()`.
This PR adds tracking of the original batch size before deduplication
and includes the difference in `num_skipped`, ensuring that `num_added +
num_skipped` equals the total number of input documents.
**Issue:** Fixes incorrect document count reporting in indexing
statistics
**Dependencies:** None
Fixes#32272
---------
Co-authored-by: Alex Feel <afilippov@spotware.com>
Ensures proper reStructuredText formatting by adding the required blank
line before closing docstring quotes, which resolves the "Block quote
ends without a blank line; unexpected unindent" warning.
- **Description:** This PR updates the internal documentation link for
the RAG tutorials to reflect the updated path. Previously, the link
pointed to the root `/docs/tutorials/`, which was generic. It now
correctly routes to the RAG-specific tutorial page.
- **Issue:** N/A
- **Dependencies:** None
- **Twitter handle:** N/A
> × No solution found when resolving dependencies:
╰─▶ Because only langchain-neo4j==0.5.0 is available and
langchain-neo4j==0.5.0 depends on neo4j-graphrag>=1.9.0, we can conclude
that all versions of langchain-neo4j depend on neo4j-graphrag>=1.9.0.
And because only neo4j-graphrag<=1.9.0 is available and
neo4j-graphrag==1.9.0 depends on pypdf>=5.1.0,<6.0.0, we can conclude
that all versions of langchain-neo4j depend on pypdf>=5.1.0,<6.0.0.
And because langchain-upstage==0.6.0 depends on pypdf>=4.2.0,<5.0.0
and only langchain-upstage==0.6.0 is available, we can conclude that
all versions of langchain-neo4j and all versions of langchain-upstage
are incompatible.
And because you require langchain-neo4j and langchain-upstage, we can
conclude that your requirements are unsatisfiable.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
**TL;DR much of the provided `Makefile` targets were broken, and any
time I wanted to preview changes locally I either had to refer to a
command Chester gave me or try waiting on a Vercel preview deployment.
With this PR, everything should behave like normal.**
Significant updates to the `Makefile` and documentation files, focusing
on improving usability, adding clear messaging, and fixing/enhancing
documentation workflows.
### Updates to `Makefile`:
#### Enhanced build and cleaning processes:
- Added informative messages (e.g., "📚 Building LangChain
documentation...") to makefile targets like `docs_build`, `docs_clean`,
and `api_docs_build` for better user feedback during execution.
- Introduced a `clean-cache` target to the `docs` `Makefile` to clear
cached dependencies and ensure clean builds.
#### Improved dependency handling:
- Modified `install-py-deps` to create a `.venv/deps_installed` marker,
preventing redundant/duplicate dependency installations and improving
efficiency.
#### Streamlined file generation and infrastructure setup:
- Added caching for the LangServe README download and parallelized
feature table generation
- Added user-friendly completion messages for targets like `copy-infra`
and `render`.
#### Documentation server updates:
- Enhanced the `start` target with messages indicating server start and
URL for local documentation viewing.
---
### Documentation Improvements:
#### Content clarity and consistency:
- Standardized section titles for consistency across documentation
files.
[[1]](diffhunk://#diff-9b1a85ea8a9dcf79f58246c88692cd7a36316665d7e05a69141cfdc50794c82aL1-R1)
[[2]](diffhunk://#diff-944008ad3a79d8a312183618401fcfa71da0e69c75803eff09b779fc8e03183dL1-R1)
- Refined phrasing and formatting in sections like "Dependency
management" and "Formatting and linting" for better readability.
[[1]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L6-R6)
[[2]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L84-R82)
#### Enhanced workflows:
- Updated instructions for building and viewing documentation locally,
including tips for specifying server ports and handling API reference
previews.
[[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L60-R94)
[[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L82-R126)
- Expanded guidance on cleaning documentation artifacts and using
linting tools effectively.
[[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L82-R126)
[[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L107-R142)
#### API reference documentation:
- Improved instructions for generating and formatting in-code
documentation, highlighting best practices for docstring writing.
[[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L107-R142)
[[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L144-R186)
---
### Minor Changes:
- Added support for a new package name (`langchain_v1`) in the API
documentation generation script.
- Fixed minor capitalization and formatting issues in documentation
files.
[[1]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L40-R40)
[[2]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L166-R160)
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Thank you for contributing to LangChain!
- **Adding documentation for PGVectorStore**:
docs: Adding documentation for the new PGVectorStore as a part of
langchain-postgres
- **Add docs**: The notebook for PGVectorStore is now added to the
directory `docs/docs/integrations`.
As a part of this change, we've also updated the VectorStore features
table and VectorStoreTabs
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
**Description:**
Fixes a bug in the file callback test where ANSI escape codes were
causing test failures. The improved test now properly handles ANSI
escape sequences by:
- Using exact string comparison instead of substring checking
- Applying the `strip_ansi` function consistently to all file contents
- Adding descriptive assertion messages
- Maintaining test coverage and backward compatibility
The changes ensure tests pass reliably even when terminal control
sequences are present in the output
**Issue:** Fixes#32150
**Dependencies:** None required - uses existing dependencies only.
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
This PR addresses the common issue where users struggle to pass custom
parameters to OpenAI-compatible APIs like LM Studio, vLLM, and others.
The problem occurs when users try to use `model_kwargs` for custom
parameters, which causes API errors.
## Problem
Users attempting to pass custom parameters (like LM Studio's `ttl`
parameter) were getting errors:
```python
# ❌ This approach fails
llm = ChatOpenAI(
base_url="http://localhost:1234/v1",
model="mlx-community/QwQ-32B-4bit",
model_kwargs={"ttl": 5} # Causes TypeError: unexpected keyword argument 'ttl'
)
```
## Solution
The `extra_body` parameter is the correct way to pass custom parameters
to OpenAI-compatible APIs:
```python
# ✅ This approach works correctly
llm = ChatOpenAI(
base_url="http://localhost:1234/v1",
model="mlx-community/QwQ-32B-4bit",
extra_body={"ttl": 5} # Custom parameters go in extra_body
)
```
## Changes Made
1. **Enhanced Documentation**: Updated the `extra_body` parameter
docstring with comprehensive examples for LM Studio, vLLM, and other
providers
2. **Added Documentation Section**: Created a new "OpenAI-compatible
APIs" section in the main class docstring with practical examples
3. **Unit Tests**: Added tests to verify `extra_body` functionality
works correctly:
- `test_extra_body_parameter()`: Verifies custom parameters are included
in request payload
- `test_extra_body_with_model_kwargs()`: Ensures `extra_body` and
`model_kwargs` work together
4. **Clear Guidance**: Documented when to use `extra_body` vs
`model_kwargs`
## Examples Added
**LM Studio with TTL (auto-eviction):**
```python
ChatOpenAI(
base_url="http://localhost:1234/v1",
api_key="lm-studio",
model="mlx-community/QwQ-32B-4bit",
extra_body={"ttl": 300} # Auto-evict after 5 minutes
)
```
**vLLM with custom sampling:**
```python
ChatOpenAI(
base_url="http://localhost:8000/v1",
api_key="EMPTY",
model="meta-llama/Llama-2-7b-chat-hf",
extra_body={
"use_beam_search": True,
"best_of": 4
}
)
```
## Why This Works
- `model_kwargs` parameters are passed directly to the OpenAI client's
`create()` method, causing errors for non-standard parameters
- `extra_body` parameters are included in the HTTP request body, which
is exactly what OpenAI-compatible APIs expect for custom parameters
Fixes#32115.
<!-- START COPILOT CODING AGENT TIPS -->
---
💬 Share your feedback on Copilot coding agent for the chance to win a
$200 gift card! Click
[here](https://survey.alchemer.com/s3/8343779/Copilot-Coding-agent) to
start the survey.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Further clean up of namespace:
- Removed prompts (we'll re-add in a separate commit)
- Remove LocalFileStore until we can review whether all the
implementation details are necessary
- Remove message processing logic from memory (we'll figure out where to
expose it)
- Remove `Tool` primitive (should be sufficient to use `BaseTool` for
typing purposes)
- Remove utilities to create kv stores. Unclear if they've had much
usage outside MultiparentRetriever
This PR adds scaffolding for langchain 1.0 entry package.
Most contents have been removed.
Currently remaining entrypoints for:
* chat models
* embedding models
* memory -> trimming messages, filtering messages and counting tokens
[we may remove this]
* prompts -> we may remove some prompts
* storage: primarily to support cache backed embeddings, may remove the
kv store
* tools -> report tool primitives
Things to be added:
* Selected agent implementations
* Selected workflows
* Common primitives: messages, Document
* Primitives for type hinting: BaseChatModel, BaseEmbeddings
* Selected retrievers
* Selected text splitters
Things to be removed:
* Globals needs to be removed (needs an update in langchain core)
Todos:
* TBD indexing api (requires sqlalchemy which we don't want as a
dependency)
* Be explicit about public/private interfaces (e.g., likely rename
chat_models.base.py to something more internal)
* Remove dockerfiles
* Update module doc-strings and README.md
The `_dereference_refs_helper` in `langchain_core.utils.json_schema`
incorrectly handled objects with a reference and other fields.
**Issue**: #32170
# Description
We change the check so that it accepts other keys in the object.
## Summary
- Fixed redundant word "done" in SECURITY.md line 69
- Fixed grammar errors in Fireworks README.md line 77: "how it fares
compares" → "how it compares" and "in terms just" → "in terms of"
## Test plan
- [x] Verified changes improve readability and correct grammar
- [x] No functional changes, documentation only
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude <noreply@anthropic.com>
Multiple models were
[retired](https://docs.anthropic.com/en/docs/about-claude/model-deprecations#model-status)
yesterday.
Tests remain broken until we figure out what to do with the legacy
Anthropic LLM integration— currently uses their (legacy) text
completions API, for which there appear to be no remaining supported
models.
* Adding support for more Chroma client options (`HttpClient` and
`CloundClient`). This includes adding arguments necessary for
instantiating these clients.
* Adding support for Chroma's new persisted collection configuration (we
moved index configuration into this new construct).
* Delegate `Settings` configuration to Chroma's client constructors.
## Problem
When using `ChatOllama` with `create_react_agent`, agents would
sometimes terminate prematurely with empty responses when Ollama
returned `done_reason: 'load'` responses with no content. This caused
agents to return empty `AIMessage` objects instead of actual generated
text.
```python
from langchain_ollama import ChatOllama
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import HumanMessage
llm = ChatOllama(model='qwen2.5:7b', temperature=0)
agent = create_react_agent(model=llm, tools=[])
result = agent.invoke(HumanMessage('Hello'), {"configurable": {"thread_id": "1"}})
# Before fix: AIMessage(content='', response_metadata={'done_reason': 'load'})
# Expected: AIMessage with actual generated content
```
## Root Cause
The `_iterate_over_stream` and `_aiterate_over_stream` methods treated
any response with `done: True` as final, regardless of `done_reason`.
When Ollama returns `done_reason: 'load'` with empty content, it
indicates the model was loaded but no actual generation occurred - this
should not be considered a complete response.
## Solution
Modified the streaming logic to skip responses when:
- `done: True`
- `done_reason: 'load'`
- Content is empty or contains only whitespace
This ensures agents only receive actual generated content while
preserving backward compatibility for load responses that do contain
content.
## Changes
- **`_iterate_over_stream`**: Skip empty load responses instead of
yielding them
- **`_aiterate_over_stream`**: Apply same fix to async streaming
- **Tests**: Added comprehensive test cases covering all edge cases
## Testing
All scenarios now work correctly:
- ✅ Empty load responses are skipped (fixes original issue)
- ✅ Load responses with actual content are preserved (backward
compatibility)
- ✅ Normal stop responses work unchanged
- ✅ Streaming behavior preserved
- ✅ `create_react_agent` integration fixed
Fixes#31482.
<!-- START COPILOT CODING AGENT TIPS -->
---
💡 You can make Copilot smarter by setting up custom instructions,
customizing its development environment and configuring Model Context
Protocol (MCP) servers. Learn more [Copilot coding agent
tips](https://gh.io/copilot-coding-agent-tips) in the docs.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
## **Description:**
This PR updates the internal documentation link for the RAG tutorials to
reflect the updated path. Previously, the link pointed to the root
`/docs/tutorials/`, which was generic. It now correctly routes to the
RAG-specific tutorial page for the following text-embedding models.
1. DatabricksEmbeddings
2. IBM watsonx.ai
3. OpenAIEmbeddings
4. NomicEmbeddings
5. CohereEmbeddings
6. MistralAIEmbeddings
7. FireworksEmbeddings
8. TogetherEmbeddings
9. LindormAIEmbeddings
10. ModelScopeEmbeddings
11. ClovaXEmbeddings
12. NetmindEmbeddings
13. SambaNovaCloudEmbeddings
14. SambaStudioEmbeddings
15. ZhipuAIEmbeddings
## **Issue:** N/A
## **Dependencies:** None
## **Twitter handle:** N/A
This PR addresses deprecation warnings users encounter when using
LangChain tools with Pydantic v2:
```
PydanticDeprecatedSince20: The `schema` method is deprecated; use `model_json_schema` instead.
Deprecated in Pydantic V2.0 to be removed in V3.0.
```
## Root Cause
Several LangChain components were still using the deprecated `.schema()`
method directly instead of the Pydantic v1/v2 compatible approach. While
users calling `.schema()` on returned models will still see warnings
(which is correct), LangChain's internal code should not generate these
warnings.
## Changes Made
Updated 3 files to use the standard compatibility pattern:
```python
# Before (deprecated)
schema = model.schema()
# After (compatible with both v1 and v2)
if hasattr(model, "model_json_schema"):
schema = model.model_json_schema() # Pydantic v2
else:
schema = model.schema() # Pydantic v1
```
### Files Updated:
- **`evaluation/parsing/json_schema.py`**: Fixed `_parse_json()` method
to handle Pydantic models correctly
- **`output_parsers/yaml.py`**: Fixed `get_format_instructions()` to use
compatible schema access
- **`chains/openai_functions/citation_fuzzy_match.py`**: Fixed direct
`.schema()` call on QuestionAnswer model
## Verification
✅ **Zero breaking changes** - all existing functionality preserved
✅ **No deprecation warnings** from LangChain internal code
✅ **Backward compatible** with Pydantic v1
✅ **Forward compatible** with Pydantic v2
✅ **Edge cases handled** (strings, plain objects, etc.)
## User Impact
LangChain users will no longer see deprecation warnings from internal
LangChain code. Users who directly call `.schema()` on schemas returned
by LangChain should adopt the same compatibility pattern:
```python
# User code should use this pattern
input_schema = tool.get_input_schema()
if hasattr(input_schema, "model_json_schema"):
schema_result = input_schema.model_json_schema()
else:
schema_result = input_schema.schema()
```
Fixes#31458.
<!-- START COPILOT CODING AGENT TIPS -->
---
💬 Share your feedback on Copilot coding agent for the chance to win a
$200 gift card! Click
[here](https://survey.alchemer.com/s3/8343779/Copilot-Coding-agent) to
start the survey.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
This PR fixes the PostgreSQL NUL byte issue that causes
`psycopg.DataError` when inserting documents containing `\x00` bytes
into PostgreSQL-based vector stores.
## Problem
PostgreSQL text fields cannot contain NUL (0x00) bytes. When documents
with such characters are processed by PGVector or langchain-postgres
implementations, they fail with:
```
(psycopg.DataError) PostgreSQL text fields cannot contain NUL (0x00) bytes
```
This commonly occurs when processing PDFs, documents from various
loaders, or text extracted by libraries like unstructured that may
contain embedded NUL bytes.
## Solution
Added `sanitize_for_postgres()` utility function to
`langchain_core.utils.strings` that removes or replaces NUL bytes from
text content.
### Key Features
- **Simple API**: `sanitize_for_postgres(text, replacement="")`
- **Configurable**: Replace NUL bytes with empty string (default) or
space for readability
- **Comprehensive**: Handles all problematic examples from the original
issue
- **Well-tested**: Complete unit tests with real-world examples
- **Backward compatible**: No breaking changes, purely additive
### Usage Example
```python
from langchain_core.utils import sanitize_for_postgres
from langchain_core.documents import Document
# Before: This would fail with DataError
problematic_content = "Getting\x00Started with embeddings"
# After: Clean the content before database insertion
clean_content = sanitize_for_postgres(problematic_content)
# Result: "GettingStarted with embeddings"
# Or preserve readability with spaces
readable_content = sanitize_for_postgres(problematic_content, " ")
# Result: "Getting Started with embeddings"
# Use in Document processing
doc = Document(page_content=clean_content, metadata={...})
```
### Integration Pattern
PostgreSQL vector store implementations should sanitize content before
insertion:
```python
def add_documents(self, documents: List[Document]) -> List[str]:
# Sanitize documents before insertion
sanitized_docs = []
for doc in documents:
sanitized_content = sanitize_for_postgres(doc.page_content, " ")
sanitized_doc = Document(
page_content=sanitized_content,
metadata=doc.metadata,
id=doc.id
)
sanitized_docs.append(sanitized_doc)
return self._insert_documents_to_db(sanitized_docs)
```
## Changes Made
- Added `sanitize_for_postgres()` function in
`langchain_core/utils/strings.py`
- Updated `langchain_core/utils/__init__.py` to export the new function
- Added comprehensive unit tests in
`tests/unit_tests/utils/test_strings.py`
- Validated against all examples from the original issue report
## Testing
All tests pass, including:
- Basic NUL byte removal and replacement
- Multiple consecutive NUL bytes
- Empty string handling
- Real examples from the GitHub issue
- Backward compatibility with existing string utilities
This utility enables PostgreSQL integrations in both langchain-community
and langchain-postgres packages to handle documents with NUL bytes
reliably.
Fixes#26033.
<!-- START COPILOT CODING AGENT TIPS -->
---
💬 Share your feedback on Copilot coding agent for the chance to win a
$200 gift card! Click
[here](https://survey.alchemer.com/s3/8343779/Copilot-Coding-agent) to
start the survey.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
The vectorstore feature table in the documentation was showing incorrect
information for the "IDs in add Documents" capability. Most vectorstores
were marked as ❌ (not supported) when they actually support extracting
IDs from documents.
## Problem
The issue was an inconsistency between two sources of truth:
- **JavaScript feature table** (`docs/src/theme/FeatureTables.js`):
Hardcoded `idsInAddDocuments: false` for most vectorstores
- **Python script** (`docs/scripts/vectorstore_feat_table.py`):
Correctly showed `"IDs in add Documents": True` for most vectorstores
## Root Cause
All vectorstores inherit the base `VectorStore.add_documents()` method
which automatically extracts document IDs:
```python
# From libs/core/langchain_core/vectorstores/base.py lines 277-284
if "ids" not in kwargs:
ids = [doc.id for doc in documents]
# If there's at least one valid ID, we'll assume that IDs should be used.
if any(ids):
kwargs["ids"] = ids
```
Since no vectorstores override `add_documents()`, they all inherit this
behavior and support IDs in documents.
## Solution
Updated `idsInAddDocuments` from `false` to `true` for 13 vectorstores:
- AstraDBVectorStore, Chroma, Clickhouse, DatabricksVectorSearch
- ElasticsearchStore, FAISS, InMemoryVectorStore,
MongoDBAtlasVectorSearch
- PGVector, PineconeVectorStore, Redis, Weaviate, SQLServer
The other 4 vectorstores (CouchbaseSearchVectorStore, Milvus, openGauss,
QdrantVectorStore) were already correctly marked as `true`.
## Impact
Users visiting
https://python.langchain.com/docs/integrations/vectorstores/ will now
see accurate information. The "IDs in add Documents" column will
correctly show ✅ for all vectorstores instead of incorrectly showing ❌
for most of them.
This aligns with the API documentation which states: "if kwargs contains
ids and documents contain ids, the ids in the kwargs will receive
precedence" - clearly indicating that document IDs are supported.
Fixes#30622.
<!-- START COPILOT CODING AGENT TIPS -->
---
💬 Share your feedback on Copilot coding agent for the chance to win a
$200 gift card! Click
[here](https://survey.alchemer.com/s3/8343779/Copilot-Coding-agent) to
start the survey.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
See https://docs.astral.sh/ruff/rules/#tryceratops-try
* TRY004 (replace by TypeError) in main code is escaped with `noqa` to
not break backward compatibility. The rule is still interesting for new
code.
* TRY301 ignored at the moment. This one is quite hard to fix and I'm
not sure it's very interesting to activate it.
Co-authored-by: Mason Daugherty <mason@langchain.dev>
* **Description:** Updated `parse_result` logic to handle cases where
`self.first_tool_only` is `True` and multiple matching keys share the
same function name. Instead of returning the first match prematurely,
the method now prioritizes filtering results by the specified key to
ensure correct selection.
* **Issue:** #32100
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
**Description:**
This PR makes argument parsing for Ollama tool calls more robust. Some
LLMs—including Ollama—may return arguments as Python-style dictionaries
with single quotes (e.g., `{'a': 1}`), which are not valid JSON and
previously caused parsing to fail.
The updated `_parse_json_string` method in
`langchain_ollama.chat_models` now attempts standard JSON parsing and,
if that fails, falls back to `ast.literal_eval` for safe evaluation of
Python-style dictionaries. This improves interoperability with LLMs and
fixes a common usability issue for tool-based agents.
**Issue:**
Closes#30910
**Dependencies:**
None
**Tests:**
- Added new unit tests for double-quoted JSON, single-quoted dicts,
mixed quoting, and malformed/failure cases.
- All tests pass locally, including new coverage for single-quoted
inputs.
**Notes:**
- No breaking changes.
- No new dependencies introduced.
- Code is formatted and linted (`ruff format`, `ruff check`).
- If maintainers have suggestions for further improvements, I’m happy to
revise!
Thank you for maintaining LangChain! Looking forward to your feedback.
Stricter JSON schema validation broke a test. Test was fixed in
https://github.com/langchain-ai/langchain/pull/32145. Core release runs
old tests (i.e., last released version of langchain-anthropic) against
new core. So we bypass anthropic for release. Will revert after.
Previously, we hit an index out of range error with empty variable names
(accessing tag[0]), now we through a slightly nicer error
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
## **Description:**
This PR updates the `link` values for the following integration metadata
entries:
1. **VertexAILLM**
- Changed from: `google_vertexai`
- To: `google_vertex_ai_palm`
2. **NVIDIA**
- Changed from: `NVIDIA`
- To: `nvidia_ai_endpoints`
These changes ensure that the documentation links correspond to the
correct integration paths, improving documentation navigation and
consistency with the integration structure.
## **Issue:** N/A
## **Dependencies:** None
## **Twitter handle:** N/A
Co-authored-by: Mason Daugherty <mason@langchain.dev>
- **Description:** This PR updates the `package` field for the VertexAI
integration in the documentation metadata. The original value was
`langchain-google_vertexai`, which has been corrected to
`langchain-google-vertexai` to reflect the actual package name used in
PyPI and LangChain integrations.
- **Issue:** N/A
- **Dependencies:** None
- **Twitter handle:** N/A
Fixes#32042
## Summary
Fixes a critical bug in JSON Schema reference resolution that prevented
correctly dereferencing numeric components in JSON pointer paths,
specifically for list indices in `anyOf`, `oneOf`, and `allOf` arrays.
## Changes
- Fixed `_retrieve_ref` function in
`libs/core/langchain_core/utils/json_schema.py` to properly handle
numeric components
- Added comprehensive test function `test_dereference_refs_list_index()`
in `libs/core/tests/unit_tests/utils/test_json_schema.py`
- Resolved line length formatting issues
- Improved type checking and index validation for list and dictionary
references
## Key Improvements
- Correctly handles list index references in JSON pointer paths
- Maintains backward compatibility with existing dictionary numeric key
functionality
- Adds robust error handling for out-of-bounds and invalid indices
- Passes all test cases covering various reference scenarios
## Test Coverage
- Verified fix for `#/properties/payload/anyOf/1/properties/startDate`
reference
- Tested edge cases including out-of-bounds and negative indices
- Ensured no regression in existing reference resolution functionality
Resolves the reported issue with JSON Schema reference dereferencing for
list indices.
---------
Co-authored-by: open-swe-dev[bot] <open-swe-dev@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Since #29963 BaseCache and Callbacks are imported in BaseLanguageModel
so there's no need to import them and rebuild the models.
Note: fix is available since `langchain-core==0.3.39` and the current
langchain dependency on core is `>=0.3.66` so the fix will always be
there.
- **Description:** Corrected the `link` path in the Google Gemini
integration entry from
`/docs/integrations/text_embedding/google-generative-ai` to
`/docs/integrations/text_embedding/google_generative_ai` to align with
actual directory structure and prevent broken documentation links.
- **Issue:** N/A
- **Dependencies:** None
- **Twitter handle:** N/A
The `num_gpu` parameter in `OllamaEmbeddings` was not being passed to
the Ollama client in the async embedding method, causing GPU
acceleration settings to be ignored when using async operations.
## Problem
The issue was in the `aembed_documents` method where the `options`
parameter (containing `num_gpu` and other configuration) was missing:
```python
# Sync method (working correctly)
return self._client.embed(
self.model, texts, options=self._default_params, keep_alive=self.keep_alive
)["embeddings"]
# Async method (missing options parameter)
return (
await self._async_client.embed(
self.model, texts, keep_alive=self.keep_alive # ❌ No options!
)
)["embeddings"]
```
This meant that when users specified `num_gpu=4` (or any other GPU
configuration), it would work with sync calls but be ignored with async
calls.
## Solution
Added the missing `options=self._default_params` parameter to the async
embed call to match the sync version:
```python
# Fixed async method
return (
await self._async_client.embed(
self.model,
texts,
options=self._default_params, # ✅ Now includes num_gpu!
keep_alive=self.keep_alive,
)
)["embeddings"]
```
## Validation
- ✅ Added unit test to verify options are correctly passed in both sync
and async methods
- ✅ All existing tests continue to pass
- ✅ Manual testing confirms `num_gpu` parameter now works correctly
- ✅ Code passes linting and formatting checks
The fix ensures that GPU configuration works consistently across both
synchronous and asynchronous embedding operations.
Fixes#32059.
<!-- START COPILOT CODING AGENT TIPS -->
---
💡 You can make Copilot smarter by setting up custom instructions,
customizing its development environment and configuring Model Context
Protocol (MCP) servers. Learn more [Copilot coding agent
tips](https://gh.io/copilot-coding-agent-tips) in the docs.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Description
The Perplexity chat model already returns a search_results field, but
LangChain dropped it when mapping Perplexity responses to
additional_kwargs.
This patch adds "search_results" to the allowed attribute lists in both
_stream and _generate, so downstream code can access it just like
images, citations, or related_questions.
Dependencies
None. The change is purely internal; no new imports or optional
dependencies required.
https://community.perplexity.ai/t/new-feature-search-results-field-with-richer-metadata/398
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
## Description
When ChatDeepSeek invokes a tool that returns a list, it results in an
openai.UnprocessableEntityError due to a failure in deserializing the
JSON body.
The root of the problem is that ChatDeepSeek uses BaseChatOpenAI
internally, but the APIs are not identical: OpenAI v1/chat/completions
accepts arrays as tool results, but Deepseek API does not.
As a solution added `_get_request_payload` method to ChatDeepSeek, which
inherits the behavior from BaseChatOpenAI but adds a step to stringify
tool message content in case the content is an array. I also add a unit
test for this.
From the linked issue you can find the full reproducible example the
reporter of the issue provided. After the changes it works as expected.
Source: [Deepseek
docs](https://api-docs.deepseek.com/api/create-chat-completion/)

Source: [OpenAI
docs](https://platform.openai.com/docs/api-reference/chat/create)

## Issue
Fixes#31394
## Dependencies:
No new dependencies.
## Twitter handle:
Don't have one.
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
- **Description:** Ensure that the tool description is an empty string
when creating a Structured Tool from a Pydantic class in case no
description is provided
- **Issue:** Fixes#31606
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
- **Description**: issues a warning if inf and nan are passed as inputs
to langchain_core.vectorstores.utils._cosine_similarity
- **Issue**: Fixes#31496
- **Dependencies**: no external dependencies added, only warnings module
imported
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Before jumping into tech implementation, I added a context for
linearization-config param, and explained what's linealization in this
context.
I also linked an AWS blog for more advanced use cases, as this single
example doesn't cover all use cases.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
**On this PR I am doing two things:**
1. Adding titles to the 4 example we have, to allow the reader to
capture the essence of the paragraph quickly
2. Replacing 'samples' with 'examples', for more clarity,
**Why 'examples' could be a better terminology over 'samples' here?**
1. On the page, we were using both 'samples' and 'examples'
interchangeably which lead to confusion, now 'examples' are the use
cases, while 'samples' are the the sample data being used
2. This is consistent with the rest of the docs, we typically use
'examples' for examples, for example
https://python.langchain.com/docs/integrations/callbacks/fiddler/
## Description
Currently when deserializing objects that contain non-deserializable
values, we throw an error. However, there are cases (e.g. proxies that
return response fields containing extra fields like Python datetimes),
where these values are not important and we just want to drop them.
Twitter handle: @hacubu
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Trying to unblock documentation build pipeline
* Bump langgraph dep in docs
* Update langgraph in lock file (resolves an issue in API reference
generation)
I am modifying two things:
1. "This sample demonstrates" with "The following samples demonstrate"
as we're talking about at least 4 samples
2. Bringing the sentence to after talking about the definition of
textract to keep the document organized (textract definition then
samples)
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
**PR title**:
add deprecation notice for PipelinePromptTemplate
**PR message**:
In the API documentation, PipelinePromptTemplate is marked as
deprecated, but this is not mentioned in the docs.
I'm submitting this PR to add a deprecation notice to the docs.
**Tests**:
N/A (documentation only)
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
This PR changes the return type hints of the `format_prompt` and
`aformat_prompt` methods in `BaseChatPromptTemplate` from `PromptValue`
to `ChatPromptValue`. Since both methods always return a
`ChatPromptValue`.
**Description:**
Added an explicit validation step in
`langchain_core.vectorstores.utils._cosine_similarity` to raise a
`ValueError` if the input query or any embedding contains `NaN` values.
This prevents silent failures or unstable behavior during similarity
calculations, especially when using maximal_marginal_relevance.
**Issue**:
Fixes#31806
**Dependencies:**
None
---------
Co-authored-by: Azhagammal S C <azhagammal@kofluence.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
* update model validation due to change in [Ollama
client](https://github.com/ollama/ollama) - ensure you are running the
latest version (0.9.6) to use `validate_model_on_init`
* add code example and fix formatting for ChatOllama reasoning
* ensure that setting `reasoning` in invocation kwargs overrides
class-level setting
* tests
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Harden the default implementation of the XML parser for the agent
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
**Description:**
I traced the kwargs starting at `.invoke()` and it was not clear where
they go. it was clarified to two layers down. so I changed it to make it
more documented for the next person.
**Issue:**
No related issue.
**Dependencies:**
No dependency changes.
**Twitter handle:**
Nah. We're good.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
This PR updates the doc on Hugging Face's inference offering from
'inference API' to 'inference providers'
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
* New `reasoning` (bool) param to support toggling [Ollama
thinking](https://ollama.com/blog/thinking) (#31573, #31700). If
`reasoning=True`, Ollama's `thinking` content will be placed in the
model responses' `additional_kwargs.reasoning_content`.
* Supported by:
* ChatOllama (class level, invocation level TODO)
* OllamaLLM (TODO)
* Added tests to ensure streaming tool calls is successful (#29129)
* Refactored tests that relied on `extract_reasoning()`
* Myriad docs additions and consistency/typo fixes
* Improved type safety in some spots
Closes#29129
Addresses #31573 and #31700
Supersedes #31701
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Integrate Bandit for security analysis, suppress warnings for specific issues, and address potential vulnerabilities such as hardcoded passwords and SQL injection risks. Adjust documentation and formatting for clarity.
* Ensure access to local model during `ChatOllama` instantiation
(#27720). This adds a new param `validate_model_on_init` (default:
`true`)
* Catch a few more errors from the Ollama client to assist users
## Summary
- Removes the `xslt_path` parameter from HTMLSectionSplitter to
eliminate XXE attack vector
- Hardens XML/HTML parsers with secure configurations to prevent XXE
attacks
- Adds comprehensive security tests to ensure the vulnerability is fixed
## Context
This PR addresses a critical XXE vulnerability discovered in the
HTMLSectionSplitter component. The vulnerability allowed attackers to:
- Read sensitive local files (SSH keys, passwords, configuration files)
- Perform Server-Side Request Forgery (SSRF) attacks
- Exfiltrate data to attacker-controlled servers
## Changes Made
1. **Removed `xslt_path` parameter** - This eliminates the primary
attack vector where users could supply malicious XSLT files
2. **Hardened XML parsers** - Added security configurations to prevent
XXE attacks even with the default XSLT:
- `no_network=True` - Blocks network access
- `resolve_entities=False` - Prevents entity expansion -
`load_dtd=False` - Disables DTD processing -
`XSLTAccessControl.DENY_ALL` - Blocks all file/network I/O in XSLT
transformations
3. **Added security tests** - New test file `test_html_security.py` with
comprehensive tests for various XXE attack vectors
4. **Updated existing tests** - Modified tests that were using the
removed `xslt_path` parameter
## Test Plan
- [x] All existing tests pass
- [x] New security tests verify XXE attacks are blocked
- [x] Code passes linting and formatting checks
- [x] Tested with both old and new versions of lxml
Twitter handle: @_colemurray
Recommend using context manager for FileCallbackHandler to avoid opening
too many file descriptors
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
- There was some ambiguous wording that has been updated to hopefully
clarify the functionality of `reasoning_format` in ChatGroq.
- Added support for `reasoning_effort`
- Added links to see models capable of `reasoning_format` and
`reasoning_effort`
- Other minor nits
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
- docs: for the Ollama notebooks, improve the specificity of some links,
add `homebrew` install info, update some wording
- tests: reduce number of local models needed to run in half from 4 → 2
(shedding 8gb of required installs)
- bump deps (non-breaking) in anticipation of upcoming "thinking" PR
Add additional hashing options to the indexing API, warn on SHA-1
Requires:
- Bumping langchain-core version
- bumping min langchain-core in langchain
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
**Description:** Updates ChatPerplexity documentation to replace
deprecated llama 3 model reference with the current sonar model in the
API key example code block.
**Issue:** N/A (maintenance update for deprecated model)
**Dependencies:** No new dependencies required
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
`Runnable`'s `Input` is contravariant so we need to enumerate all
possible inputs and it's not possible to put them in a `Union`.
Also, it's better to only require a runnable that
accepts`list[BaseMessage]` instead of a broader `Sequence[BaseMessage]`
as internally the runnable is only called with a list.
As part of core releases we run tests on the last released version of
some packages (including langchain-openai) using the new version of
langchain-core. We run langchain-openai's test suite as it was when it
was last released.
Our test for computer use started raising 500 error at some point during
the day today (test passed as part of scheduled test job in the
morning):
> InternalServerError: Error code: 500 - {'error': {'message': 'An error
occurred while processing your request. You can retry your request, or
contact us through our help center at help.openai.com if the error
persists.
Will revert this change after we release langchain-core.
**Description:**
Previously, when transitioning from a deeper Markdown header (e.g., ###)
to a shallower one (e.g., ##), the
ExperimentalMarkdownSyntaxTextSplitter retained the deeper header in the
metadata.
This commit updates the `_resolve_header_stack` method to remove headers
at the same or deeper levels before appending the current header. As a
result, each chunk now reflects only the active header context.
Fixes unexpected metadata leakage across sections in nested Markdown
documents.
Additionally, test cases have been updated to:
- Validate correct header resolution and metadata assignment.
- Cover edge cases with nested headers and horizontal rules.
**Issue:**
Fixes [#31596](https://github.com/langchain-ai/langchain/issues/31596)
**Dependencies:**
None
**Twitter handle:** -> [_RaghuKapur](https://twitter.com/_RaghuKapur)
**LinkedIn:** ->
[https://www.linkedin.com/in/raghukapur/](https://www.linkedin.com/in/raghukapur/)
## Description
<!-- What does this pull request accomplish? -->
- When parsing MistralAI chunk dicts to Langchain to `AIMessageChunk`
schemas via the `_convert_chunk_to_message_chunk` utility function, the
`finish_reason` was not being included in `response_metadata` as it is
for other providers.
- This PR adds a one-liner fix to include the finish reason.
- fixes: https://github.com/langchain-ai/langchain/issues/31666
* Simplified Pydantic handling since Pydantic v1 is not supported
anymore.
* Replace use of deprecated v1 methods by corresponding v2 methods.
* Remove use of other deprecated methods.
* Activate mypy errors on deprecated methods use.
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
**Description:**
This PR fixes an `IndexError` that occurs when `LLMListwiseRerank` is
called with an empty list of documents.
Earlier, the code assumed the presence of at least one document and
attempted to construct the context string based on `len(documents) - 1`,
which raises an error when documents is an empty list.
The fix works with gpt-4o-mini if I make the list empty, but fails
occasionally with gpt-3.5-turbo. In case of empty list, setting the
string to "empty list" seems to have the expected response.
**Issue:** #31192
Description:
This pull request corrects minor spelling mistakes in the comments
within the `chat_models.py` file of the MistralAI partner integration.
Specifically, it fixes the spelling of "equivalent" and "compatibility"
in two separate comments. These changes improve code readability and
maintain professional documentation standards. No functional code
changes are included.
`uv lock --upgrade-package langsmith
`
Original issue: The lock file (uv.lock) was constraining
langsmith>=0.1.125,<0.4, preventing LangSmith 0.4.1 installation. Even
though the pyproject.toml wasn't restricting langchain core.
Issue:
https://langchain.slack.com/archives/C050X0VTN56/p1750107176007629
This PR updates the tool runtime example notebook to replace the
deprecated `.schema()` method with `.model_json_schema()`, aligning it
with Pydantic V2.
### 🔧 Changes:
- Replaced:
```python
update_favorite_pets.get_input_schema().schema()
with
update_favorite_pets.get_input_schema().model_json_schema()
```
Fixes#31609
### Description
Add keep_separator arg to HTMLSemanticPreservingSplitter and pass value
to instance of RecursiveCharacterTextSplitter used under the hood.
### Issue
Documents returned by `HTMLSemanticPreservingSplitter.split_text(text)`
are defaulted to use separators at beginning of page_content. [See third
and fourth document in example output from how-to
guide](https://python.langchain.com/docs/how_to/split_html/#using-htmlsemanticpreservingsplitter):
```
[Document(metadata={'Header 1': 'Main Title'}, page_content='This is an introductory paragraph with some basic content.'),
Document(metadata={'Header 2': 'Section 1: Introduction'}, page_content='This section introduces the topic'),
Document(metadata={'Header 2': 'Section 1: Introduction'}, page_content='. Below is a list: First item Second item Third item with bold text and a link Subsection 1.1: Details This subsection provides additional details'),
Document(metadata={'Header 2': 'Section 1: Introduction'}, page_content=". Here's a table: Header 1 Header 2 Header 3 Row 1, Cell 1 Row 1, Cell 2 Row 1, Cell 3 Row 2, Cell 1 Row 2, Cell 2 Row 2, Cell 3"),
Document(metadata={'Header 2': 'Section 2: Media Content'}, page_content='This section contains an image and a video:  '),
Document(metadata={'Header 2': 'Section 3: Code Example'}, page_content='This section contains a code block: <code:html> <div> <p>This is a paragraph inside a div.</p> </div> </code>'),
Document(metadata={'Header 2': 'Conclusion'}, page_content='This is the conclusion of the document.')]
```
### Dependencies
None
@ttrumper3
docs: update multimodal PDF and image usage for gpt-4.1
**Description:**
This update revises the LangChain documentation to support the new
GPT-4.1 multimodal API format. It fixes the previous broken example for
PDF uploads (which returned a 400 error: "Missing required parameter:
'messages[0].content[1].file'") and adds clear instructions on how to
include base64-encoded images for OpenAI models.
**Issue:**
error appointed in foruns for pdf load into api ->
'''
@[Albaeld](https://github.com/Albaeld)
Albaeld
[8 days
ago](https://github.com/langchain-ai/langchain/discussions/27702#discussioncomment-13369460)
This simply does not work with openai:gpt-4.1. I get:
Error code: 400 - {'error': {'message': "Missing required parameter:
'messages[0].content[1].file'.", 'type': 'invalid_request_error',
'param': 'messages[0].content[1].file', 'code':
'missing_required_parameter'}}
'''
**Dependencies:**
None
**Twitter handle:**
N/A
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Description:
This pull request corrects minor typographical errors in the
documentation notebooks for vectorstore integrations. Specifically, it
fixes the spelling of "datastore" in `llm_rails.ipynb` and
"pre-existent" in `redis.ipynb`. These changes improve the clarity and
professionalism of the documentation. No functional code changes are
included.
Thank you for contributing to LangChain!
[x] PR title: langchain_ollama: support custom headers for Ollama
partner APIs
Where "package" is whichever of langchain, core, etc. is being modified.
Use "docs: ..." for purely docs changes, "infra: ..." for CI changes.
Example: "core: add foobar LLM"
[x] PR message:
**Description: This PR adds support for passing custom HTTP headers to
Ollama models when used as a LangChain integration. This is especially
useful for enterprise users or partners who need to send authentication
tokens, API keys, or custom tracking headers when querying secured
Ollama servers.
Issue: N/A (new enhancement)
**Dependencies: No external dependencies introduced.
Twitter handle: @arunkumar_offl
[x] Add tests and docs: If you're adding a new integration, please
include
1.Added a unit test in test_chat_models.py to validate headers are
passed correctly.
2. Added an example notebook:
docs/docs/integrations/llms/ollama_custom_headers.ipynb showing how to
use custom headers.
[x] Lint and test: Ran make format, make lint, and make test to ensure
the code is clean and passing all checks.
Additional guidelines:
Make sure optional dependencies are imported within a function.
Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
Most PRs should not touch more than one package.
Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
This MR is only for the docs. Added integration with Nebius AI Studio to
docs. The integration package is available at
[https://github.com/nebius/langchain-nebius](https://github.com/nebius/langchain-nebius).
---------
Co-authored-by: Akim Tsvigun <aktsvigun@nebius.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
- **Description:** Remove the outdated Gemini models and replace those
with the latest models.
- **Issue:** Earlier the code was not running, now the code runs.
- **Dependencies:** No
- **Twitter handle:** [soumendrak_](https://x.com/soumendrak_)
## Description
Updating Exa integration documentation to showcase the latest features
and best practices.
## Changes
- Added examples for `ExaSearchResults` tool with advanced search
options
- Added examples for `ExaFindSimilarResults` tool
- Updated agent example to use LangGraph
- Demonstrated text content options, summaries, and highlights
- Included examples of search type control and live crawling
## Additional Context
I'm from the Exa team updating our integration documentation to reflect
current capabilities and best practices.
Remove proxy imports to langchain_experimental.
Previously, these imports would work if a user manually installed
langchain_experimental. However, we want to drop support even for that
as langchain_experimental is generally not recommended to be run in
production.
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
**Description:**
Fixed a small grammatical error in the `retrievers.mdx` documentation.
Replaced "we can be built retrievers on top of search APIs..." with
"we can build retrievers on top of search APIs..." for clarity and
correctness.
**Issue:**
N/A
**Dependencies:**
None
**Twitter handle:**
@hassan_zameel
OpenAI changed their API to require the `partial_images` parameter when
using image generation + streaming.
As described in https://github.com/langchain-ai/langchain/pull/31424, we
are ignoring partial images. Here, we accept the `partial_images`
parameter (as required by OpenAI), but emit a warning and continue to
ignore partial images.
**Description:**
`langchain_huggingface` has a very large installation size of around 600
MB (on a Mac with Python 3.11). This is due to its dependency on
`sentence-transformers`, which in turn depends on `torch`, which is 320
MB all by itself. Similarly, the depedency on `transformers` adds
another set of heavy dependencies. With those dependencies removed, the
installation of `langchain_huggingface` only takes up ~26 MB. This is
only 5 % of the full installation!
These libraries are not necessary to use `langchain_huggingface`'s API
wrapper classes, only for local inferences/embeddings. All import
statements for those two libraries already have import guards in place
(try/catch with a helpful "please install x" message).
This PR therefore moves those two libraries to an optional dependency
group `full`. So a `pip install langchain_huggingface` will only install
the lightweight version, and a `pip install
"langchain_huggingface[full]"` will install all dependencies.
I know this may break existing code, because `sentence-transformers` and
`transformers` are now no longer installed by default. Given that users
will see helpful error messages when that happens, and the major impact
of this small change, I hope that you will still consider this PR.
**Dependencies:** No new dependencies, but new optional grouping.
- **Description:**
- In _infer_arg_descriptions, the annotations dictionary contains string
representations of types instead of actual typing objects. This causes
_is_annotated_type to fail, preventing the correct description from
being generated.
- This is a simple fix using the get_type_hints method, which resolves
the annotations properly and is supported across all Python versions.
- **Issue:** #31051
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
https://github.com/langchain-ai/langchain/pull/31286 included an update
to the return type for `BaseChatModel.(a)stream`, from
`Iterator[BaseMessageChunk]` to `Iterator[BaseMessage]`.
This change is correct, because when streaming is disabled, the stream
methods return an iterator of `BaseMessage`, and the inheritance is such
that an `BaseMessage` is not a `BaseMessageChunk` (but the reverse is
true).
However, LangChain includes a pattern throughout its docs of [summing
BaseMessageChunks](https://python.langchain.com/docs/how_to/streaming/#llms-and-chat-models)
to accumulate a chat model stream. This pattern is implemented in tests
for most integration packages and appears in application code. So
https://github.com/langchain-ai/langchain/pull/31286 introduces mypy
errors throughout the ecosystem (or maybe more accurately, it reveals
that this pattern does not account for use of the `.stream` method when
streaming is disabled).
Here we revert just the change to the stream return type to unblock
things. A fix for this should address docs + integration packages (or if
we elect to just force people to update code, be explicit about that).
**Description:**
This PR updates approximately 4' occurences of the deprecated
`initialize_agent` function in LangChain documentation and examples,
replacing it with the recommended `create_react_agent` and pattern. It
also refactors related examples to align with current best practices.
**Issue:**
Partially Fixes#29277
**Dependencies:**
None
**X handle:**
@TK1475
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
This PR adds documentation to our Microsft Provider page for LangChain
Azure AI. This PR does not add any extra dependencies or require any
tests besides passing CI.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
## PR Title
```
docs: enhance Salesforce Toolkit documentation
```
## PR Description
**Description:** Enhanced the Salesforce Toolkit documentation to
provide a more comprehensive overview of the `langchain-salesforce`
package. The updates include improved descriptions of the toolkit's
capabilities, detailed setup instructions for authentication using
environment variables, updated code snippets with consistent parameter
naming and improved readability, and additional resources with API
references for better user guidance.
**Issue:** N/A (documentation improvement)
**Dependencies:** None
**Twitter handle:** @colesmcintosh
---
### Changes Made:
- Improved description of the Salesforce Toolkit's capabilities and
features
- Added detailed setup instructions for authentication using environment
variables
- Updated code snippets to use consistent parameter naming and improved
readability
- Included additional resources and API references for better user
guidance
- Enhanced overall documentation structure and clarity
### Files Modified:
- `docs/docs/integrations/tools/salesforce.ipynb` (83 insertions, 36
deletions)
This is a documentation-only change that improves the user experience
for developers working with the Salesforce Toolkit. The changes are
backwards compatible and follow LangChain's documentation standards.
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Does not support partial images during generation at the moment. Before
doing that I'd like to figure out how to specify the aggregation logic
without requiring changes in core.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
change `--no-cache-dirclear` -> `--no-cache-dir`.
pip throws `no such option: --no-cache-dirclear` since its invalid.
`--no-cache-dir` is the correct one.
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
…cs/integrations/tools/ All tools section
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
…tegrations/tools/ All tools section
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Llama-3.1 started failing consistently with
> groq.BadRequestError: Error code: 400 - ***'error': ***'message':
"Failed to call a function. Please adjust your prompt. See
'failed_generation' for more details.", 'type': 'invalid_request_error',
'code': 'tool_use_failed', 'failed_generation':
'<function=brave_search>***"query": "Hello!"***</function>'***
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Co-authored-by: ccurme <chester.curme@gmail.com>
Release core 0.3.63
Small update just to expand the list of well known tools. This is
necessary while the logic lives in langchain-core.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Add image generation tool to the list of well known tools. This is needed for changes in the ChatOpenAI client.
TODO: Some of this logic needs to be moved from core directly into the client as changes in core should not be required to add a new tool to the openai chat client.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Deleted two outdated phrases that were reflecting the current versions
of packages at the time i.e.: 1-"langchain-community is currently on
version 0.2.x." 2-langchain-"experimental is currently on version 0.0.x"
docs: update Valyu integration notebooks to reflect current
langchain-valyu package implementation
Updated the Valyu integration documentation notebooks to align with the
current implementation of the langchain-valyu package. The changes
include:
- Updated ValyuContextRetriever to ValyuRetriever class name
- Changed parameter name from similarity_threshold to
relevance_threshold
- Removed query_rewrite parameter from search tool examples
- Added start_date and end_date parameters for time filtering
- Updated default values to match current implementation
(relevance_threshold: 0.5)
- Enhanced parameter documentation with proper descriptions and
constraints
- Updated section titles to reflect "Deep Search" functionality
…integrations/retrievers/ All retrievers section
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
…ngchain.com/docs/integrations/tools/ All tools section
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
…gchain.com/docs/integrations/document_loaders/ All document loaders
section
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
…cs/integrations/document_loaders/ All document loaders section
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
…chain.com/docs/integrations/document_loaders/ All document loaders
section
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
…integrations/document_loaders/ All document loaders section
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
…tionguard.ipynb
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
### Description
Added a note above the retriever overview table to clarify that the
descriptions are truncated for readability and how to view the full
version (via hover or click).
### Issue
Fixes#31311 — Users were confused by incomplete retriever descriptions
in the integration docs.
### Dependencies
None
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
…rations/chat/ All chat models section
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
…ntegrations/chat/ All chat models section
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
…egrations/chat/ All chat models section
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
…tegrations/chat/ All chat models section
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
As part of core releases we run tests on the last released version of
some packages (including langchain-openai) using the new version of
langchain-core. We run langchain-openai's test suite as it was when it
was last released.
OpenAI has since updated their API— relaxing constraints on what schemas
are supported when `strict=True`— causing these tests to break. They
have since been fixed. But the old tests will continue to fail.
Will revert this change after we release OpenAI today.
Added support for new Exa API features. Updated Exa docs and python
package (langchain-exa).
Description
Added support for new Exa API features in the langchain-exa package:
- Added max_characters option for text content
- Added support for summary and custom summary prompts
- Added livecrawl option with "always", "fallback", "never" settings
- Added "auto" option for search type
- Updated documentation and tests
Dependencies
- No new dependencies required. Using existing features from exa-py.
twitter: @theishangoswami
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
**Description:** ConversationChain has been deprecated, and the
documentation says to use RunnableWithMessageHistory in its place, but
the link at the top of the page to RunnableWithMessageHistory is broken
(it's rendering as "html()"). See here at the top of the page:
https://python.langchain.com/api_reference/langchain/chains/langchain.chains.conversation.base.ConversationChain.html.
This PR fixes the link.
**Issue**: N/A
**Dependencies**: N/A
**Twitter handle:**: If you're on Bluesky, I'm @vikramsaraph.com
Scheduled testing started failing today because the Responses API
stopped raising `BadRequestError` for a schema that was previously
invalid when `strict=True`.
Although docs still say that [some type-specific keywords are not yet
supported](https://platform.openai.com/docs/guides/structured-outputs#some-type-specific-keywords-are-not-yet-supported)
(including `minimum` and `maximum` for numbers), the below appears to
run and correctly respect the constraints:
```python
import json
import openai
maximums = list(range(1, 11))
arg_values = []
for maximum in maximums:
tool = {
"type": "function",
"name": "magic_function",
"description": "Applies a magic function to an input.",
"parameters": {
"properties": {
"input": {"maximum": maximum, "minimum": 0, "type": "integer"}
},
"required": ["input"],
"type": "object",
"additionalProperties": False
},
"strict": True
}
client = openai.OpenAI()
response = client.responses.create(
model="gpt-4.1",
input=[{"role": "user", "content": "What is the value of magic_function(3)? Use the tool."}],
tools=[tool],
)
function_call = next(item for item in response.output if item.type == "function_call")
args = json.loads(function_call.arguments)
arg_values.append(args["input"])
print(maximums)
print(arg_values)
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# [1, 2, 3, 3, 3, 3, 3, 3, 3, 3]
```
Until yesterday this raised BadRequestError.
The same is not true of Chat Completions, which appears to still raise
BadRequestError
```python
tool = {
"type": "function",
"function": {
"name": "magic_function",
"description": "Applies a magic function to an input.",
"parameters": {
"properties": {
"input": {"maximum": 5, "minimum": 0, "type": "integer"}
},
"required": ["input"],
"type": "object",
"additionalProperties": False
},
"strict": True
}
}
response = client.chat.completions.create(
model="gpt-4.1",
messages=[{"role": "user", "content": "What is the value of magic_function(3)? Use the tool."}],
tools=[tool],
)
response # raises BadRequestError
```
Here we update tests accordingly.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
**Description**:
This PR updates the documentation to address a potential issue when
using `hub.pull(...)` with non-US LangSmith endpoints (e.g.,
`https://eu.api.smith.langchain.com`).
By default, the `hub.pull` function assumes the non US-based API URL.
When the `LANGSMITH_ENDPOINT` environment variable is set to a non-US
region, this can lead to `LangSmithNotFoundError 404 not found` errors
when pulling public assets from the LangChain Hub.
Issue: #31191
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Update docs to add Featherless.ai Provider & Chat Model
- **Description:** Adding Featherless.ai as provider in teh
documentations giving access to over 4300+ open-source models
- **Twitter handle:** https://x.com/FeatherlessAI
DSPy removed their LangChain integration in version 2.6.6.
Here we remove the page and add a redirect to the LangChain v0.2 docs
for posterity.
We add an admonition to the v0.2 docs in
https://github.com/langchain-ai/langchain/pull/31277.
* It is possible to chain a `Runnable` with an `AsyncIterator` as seen
in `test_runnable.py`.
* Iterator and AsyncIterator Input/Output of Callables must be put
before `Callable[[Other], Any]` otherwise the pattern matching picks the
latter.
**PR message**: Not sure if I put the check at the right spot, but I
thought throwing the error before the loop made sense to me.
**Description:** Checks if there are only system messages using
AnthropicChat model and throws an error if it's the case. Check Issue
for more details
**Issue:** #30764
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
**Issue:**[
#309070](https://github.com/langchain-ai/langchain/issues/30970)
**Cause**
Arg type in python code
```
arg: Union[SubSchema1, SubSchema2]
```
is translated to `anyOf` in **json schema**
```
"anyOf" : [{sub schema 1 ...}, {sub schema 1 ...}]
```
The value of anyOf is a list sub schemas.
The bug is caused since the sub schemas inside `anyOf` list is not taken
care of.
The location where the issue happens is `convert_to_openai_function`
function -> `_recursive_set_additional_properties_false` function, that
recursively adds `"additionalProperties": false` to json schema which is
[required by OpenAI's strict function
calling](https://platform.openai.com/docs/guides/structured-outputs?api-mode=responses#additionalproperties-false-must-always-be-set-in-objects)
**Solution:**
This PR fixes this issue by iterating each sub schema inside `anyOf`
list.
A unit test is added.
**Twitter handle:** shengboma
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
`aindex` function should check not only `adelete` method, but `delete`
method too
**PR title**: "core: fix async indexing issue with adelete/delete
checking"
**PR message**: Currently `langchain.indexes.aindex` checks if vector
store has overrided adelete method. But due to `adelete` default
implementation store can have just `delete` overrided to make `adelete`
working.
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Description: This document change concerns the document-loader
integration, specifically `Confluence`.
I am trying to use the ConfluenceLoader and came across deprecations
when I followed the instructions in the documentation. So I updated the
code blocks with the latest changes made to langchain, and also updated
the documentation for better readability
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
- **Description:** The file ```docs/api_reference/create_api_rst.py```
uses a pyproject.toml check to remove partners which don't have a valid
pyproject.toml. This PR extends that check to ```/langchain/libs/*```
sub-directories as well. Without this the ```make api_docs_build```
command fails (see error).
- **Issue:** #31109
- **Dependencies:** none
- **Error Traceback:**
uv run --no-group test python docs/api_reference/create_api_rst.py
Starting to build API reference files.
Building package: community
pyproject.toml not found in /langchain/libs/community.
You are either attempting to build a directory which is not a package or
the package is missing a pyproject.toml file which should be
added.Aborting the build.
make: *** [Makefile:35: api_docs_build] Error 1
**Description**:
Add a `async_client_kwargs` field to ollama chat/llm/embeddings adapters
that is passed to async httpx client constructor.
**Motivation:**
In my use-case:
- chat/embedding model adapters may be created frequently, sometimes to
be called just once or to never be called at all
- they may be used in bots sunc and async mode (not known at the moment
they are created)
So, I want to keep a static transport instance maintaining connection
pool, so model adapters can be created and destroyed freely. But that
doesn't work when both sync and async functions are in use as I can only
pass one transport instance for both sync and async client, while
transport types must be different for them. So I can't make both sync
and async calls use shared transport with current model adapter
interfaces.
In this PR I add a separate `async_client_kwargs` that gets passed to
async client constructor, so it will be possible to pass a separate
transport instance. For sake of backwards compatibility, it is merged
with `client_kwargs`, so nothing changes when it is not set.
I am unable to run linter right now, but the changes look ok.
Bumps
[Ana06/get-changed-files](https://github.com/ana06/get-changed-files)
from 2.2.0 to 2.3.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/ana06/get-changed-files/releases">Ana06/get-changed-files's
releases</a>.</em></p>
<blockquote>
<h2>v2.3.0</h2>
<p>This project is a fork of <a
href="https://github.com/jitterbit/get-changed-files">jitterbit/get-changed-files</a>,
which:</p>
<ul>
<li>Supports <code>pull_request_target</code></li>
<li>Allows to filter files using regular expressions</li>
<li>Removes the ahead check</li>
<li>Considers renamed modified files as modified</li>
<li>Adds <code>added_modified_renamed</code> that includes renamed
non-modified files and all files in <code>added_modified</code></li>
<li>Uses Node 20</li>
</ul>
<h2>Changes</h2>
<ul>
<li>Update to Node 20</li>
<li>Update dependencies</li>
</ul>
<h2>Raw diff</h2>
<p><a
href="https://github.com/Ana06/get-changed-files/compare/v2.2.0...v2.3.0">https://github.com/Ana06/get-changed-files/compare/v2.2.0...v2.3.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="25f79e676e"><code>25f79e6</code></a>
Update version in package.json</li>
<li><a
href="6f5373eb01"><code>6f5373e</code></a>
Merge pull request <a
href="https://redirect.github.com/ana06/get-changed-files/issues/42">#42</a>
from Ana06/release2-3</li>
<li><a
href="64dffb46a4"><code>64dffb4</code></a>
Prepare 2.3.0 release</li>
<li><a
href="791f7645b7"><code>791f764</code></a>
[CI] Update actions/checkout</li>
<li><a
href="5a4a136e91"><code>5a4a136</code></a>
[CI] Ensure GH action uses node version 20</li>
<li><a
href="38bdb2e498"><code>38bdb2e</code></a>
Update to Node 20</li>
<li><a
href="5558be5781"><code>5558be5</code></a>
Merge pull request <a
href="https://redirect.github.com/ana06/get-changed-files/issues/30">#30</a>
from Ana06/dependabot/npm_and_yarn/decode-uri-componen...</li>
<li><a
href="6a376fdbb3"><code>6a376fd</code></a>
Merge pull request <a
href="https://redirect.github.com/ana06/get-changed-files/issues/31">#31</a>
from Ana06/dependabot/npm_and_yarn/qs-6.5.3</li>
<li><a
href="ace6e7bcbb"><code>ace6e7b</code></a>
Merge pull request <a
href="https://redirect.github.com/ana06/get-changed-files/issues/32">#32</a>
from brtrick/main</li>
<li><a
href="a102fae9bf"><code>a102fae</code></a>
Merge pull request <a
href="https://redirect.github.com/ana06/get-changed-files/issues/33">#33</a>
from Ana06/dependabot/npm_and_yarn/json5-2.2.3</li>
<li>Additional commits viewable in <a
href="https://github.com/ana06/get-changed-files/compare/v2.2.0...v2.3.0">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Updates dependencies to Chroma to integrate the major release of Chroma
with improved performance, and to fix issues users have been seeing
using the latest chroma docker image with langchain-chroma
https://github.com/langchain-ai/langchain/issues/31047#issuecomment-2850790841
Updates chromadb dependency to >=1.0.9
This also removes the dependency of chroma-hnswlib, meaning it can run
against python 3.13 runners for tests as well.
Tested this by pulling the latest Chroma docker image, running
langchain-chroma using client mode
```
httpClient = chromadb.HttpClient(host="localhost", port=8000)
vector_store = Chroma(
client=httpClient,
collection_name="test",
embedding_function=embeddings,
)
```
"Alanis Morissette" spelling error
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
- Example: "core: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from 5
to 6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/setup-uv/releases">astral-sh/setup-uv's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0 🌈 activate-environment and working-directory</h2>
<h2>Changes</h2>
<p>This version contains some breaking changes which have been gathering
up for a while. Lets dive into them:</p>
<ul>
<li><a
href="https://github.com/astral-sh/setup-uv/blob/HEAD/#activate-environment">Activate
environment</a></li>
<li><a
href="https://github.com/astral-sh/setup-uv/blob/HEAD/#working-directory">Working
Directory</a></li>
<li><a
href="https://github.com/astral-sh/setup-uv/blob/HEAD/#default-cache-dependency-glob">Default
<code>cache-dependency-glob</code></a></li>
<li><a
href="https://github.com/astral-sh/setup-uv/blob/HEAD/#use-default-cache-dir-on-self-hosted-runners">Use
default cache dir on self hosted runners</a></li>
</ul>
<h3>Activate environment</h3>
<p>In previous versions using the input <code>python-version</code>
automatically activated a venv at the repository root.
This led to some unwanted side-effects, was sometimes unexpected and not
flexible enough.</p>
<p>The venv activation is now explicitly controlled with the new input
<code>activate-environment</code> (false by default):</p>
<pre lang="yaml"><code>- name: Install the latest version of uv and
activate the environment
uses: astral-sh/setup-uv@v6
with:
activate-environment: true
- run: uv pip install pip
</code></pre>
<p>The venv gets created by the <a
href="https://docs.astral.sh/uv/pip/environments/"><code>uv
venv</code></a> command so the python version is controlled by the
<code>python-version</code> input or the files
<code>pyproject.toml</code>, <code>uv.toml</code>,
<code>.python-version</code> in the <code>working-directory</code>.</p>
<h3>Working Directory</h3>
<p>The new input <code>working-directory</code> controls where we look
for <code>pyproject.toml</code>, <code>uv.toml</code> and
<code>.python-version</code> files
which are used to determine the version of uv and python to install.</p>
<p>It can also be used to control where the venv gets created.</p>
<pre lang="yaml"><code>- name: Install uv based on the config files in
the working-directory
uses: astral-sh/setup-uv@v6
with:
working-directory: my/subproject/dir
</code></pre>
<blockquote>
<p>[!CAUTION]</p>
<p>The inputs <code>pyproject-file</code> and <code>uv-file</code> have
been removed.</p>
</blockquote>
<h3>Default <code>cache-dependency-glob</code></h3>
<p><a href="https://github.com/ssbarnea"><code>@ssbarnea</code></a>
found out that the default <code>cache-dependency-glob</code> was not
suitable for a lot of users.</p>
<p>The old default</p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="6b9c6063ab"><code>6b9c606</code></a>
Bump dependencies (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/389">#389</a>)</li>
<li><a
href="ef6bcdff59"><code>ef6bcdf</code></a>
Fix default cache dependency glob (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/388">#388</a>)</li>
<li><a
href="9a311713f4"><code>9a31171</code></a>
chore: update known checksums for 0.6.17 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/384">#384</a>)</li>
<li><a
href="c7f87aa956"><code>c7f87aa</code></a>
bump to v6 in README (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/382">#382</a>)</li>
<li><a
href="aadfaf08d6"><code>aadfaf0</code></a>
Change default cache-dependency-glob (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/352">#352</a>)</li>
<li><a
href="a0f9da6273"><code>a0f9da6</code></a>
No default UV_CACHE_DIR on selfhosted runners (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/380">#380</a>)</li>
<li><a
href="ec4c691628"><code>ec4c691</code></a>
new inputs activate-environment and working-directory (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/381">#381</a>)</li>
<li><a
href="aa1290542e"><code>aa12905</code></a>
chore: update known checksums for 0.6.16 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/378">#378</a>)</li>
<li><a
href="fcaddda076"><code>fcaddda</code></a>
chore: update known checksums for 0.6.15 (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/377">#377</a>)</li>
<li><a
href="fb3a0a97fa"><code>fb3a0a9</code></a>
log info on venv activation (<a
href="https://redirect.github.com/astral-sh/setup-uv/issues/375">#375</a>)</li>
<li>See full diff in <a
href="https://github.com/astral-sh/setup-uv/compare/v5...v6">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
**Description:** This is a document change regarding integration with
package `langchain-anthropic` for newly released websearch tool ([Claude
doc](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/web-search-tool)).
Issue 1: The sample in [Web Search
section](https://python.langchain.com/docs/integrations/chat/anthropic/#web-search)
did not run. You would get an error as below:
```
File "my_file.py", line 170, in call
model_with_tools = model.bind_tools([websearch_tool])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/langchain_anthropic/chat_models.py", line 1363, in bind_tools
tool if _is_builtin_tool(tool) else convert_to_anthropic_tool(tool)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/langchain_anthropic/chat_models.py", line 1645, in convert_to_anthropic_tool
input_schema=oai_formatted["parameters"],
~~~~~~~~~~~~~^^^^^^^^^^^^^^
KeyError: 'parameters'
```
This is because websearch tool is only recently supported in
langchain-anthropic==0.3.13`, in [0.3.13
release](https://github.com/langchain-ai/langchain/releases?q=tag%3A%22langchain-anthropic%3D%3D0%22&expanded=true)
mentioning:
> anthropic[patch]: support web search
(https://github.com/langchain-ai/langchain/pull/31157)
Issue 2: The current doc has outdated package requirements for Websearch
tool: "This guide requires langchain-anthropic>=0.3.10".
Changes:
- Updated the required `langchain-anthropic` package version (0.3.10 ->
0.3.13).
- Added notes to user when using websearch sample.
I believe this will help avoid future confusion from readers.
**Issue:** N/A
**Dependencies:** N/A
**Twitter handle:** N/A
* Remove unnecessary cast of id -> str (can do with a field setting)
* Remove unnecessary `set_text` model validator (can be done with a
computed field - though we had to make some changes to the `Generation`
class to make this possible
Before: ~2.4s
Blue circles represent time spent in custom validators :(
<img width="1337" alt="Screenshot 2025-05-14 at 10 10 12 AM"
src="https://github.com/user-attachments/assets/bb4f477f-4ee3-4870-ae93-14ca7f197d55"
/>
After: ~2.2s
<img width="1344" alt="Screenshot 2025-05-14 at 10 11 03 AM"
src="https://github.com/user-attachments/assets/99f97d80-49de-462f-856f-9e7e8662adbc"
/>
We still want to optimize the backwards compatible tool calls model
validator, though I think this might involve breaking changes, so wanted
to separate that into a different PR. This is circled in green.
**Description:** Before this commit, if one record is batched in more
than 32k rows for sqlite3 >= 3.32 or more than 999 rows for sqlite3 <
3.31, the `record_manager.delete_keys()` will fail, as we are creating a
query with too many variables.
This commit ensures that we are batching the delete operation leveraging
the `cleanup_batch_size` as it is already done for `full` cleanup.
Added unit tests for incremental mode as well on different deleting
batch size.
2025-05-14 11:38:21 -04:00
2323 changed files with 135183 additions and 63831 deletions
@@ -5,26 +5,31 @@ This project includes a [dev container](https://containers.dev/), which lets you
You can use the dev container configuration in this folder to build and run the app without needing to install any of its tools locally! You can use it in [GitHub Codespaces](https://github.com/features/codespaces) or the [VS Code Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers).
## GitHub Codespaces
[](https://codespaces.new/langchain-ai/langchain)
You may use the button above, or follow these steps to open this repo in a Codespace:
1. Click the **Code** drop-down menu at the top of https://github.com/langchain-ai/langchain.
1. Click the **Code** drop-down menu at the top of <https://github.com/langchain-ai/langchain>.
1. Click on the **Codespaces** tab.
1. Click **Create codespace on master**.
For more info, check out the [GitHub documentation](https://docs.github.com/en/free-pro-team@latest/github/developing-online-with-codespaces/creating-a-codespace#creating-a-codespace).
## VS Code Dev Containers
[](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain)
Note: If you click the link above you will open the main repo (langchain-ai/langchain) and not your local cloned repo. This is fine if you only want to run and test the library, but if you want to contribute you can use the link below and replace with your username and cloned repo name:
> If you click the link above you will open the main repo (`langchain-ai/langchain`) and *not* your local cloned repo. This is fine if you only want to run and test the library, but if you want to contribute you can use the link below and replace with your username and cloned repo name:
Then you will have a local cloned repo where you can contribute and then create pull requests.
If you already have VS Code and Docker installed, you can use the button above to get started. This will cause VSCode to automatically install the Dev Containers extension if needed, clone the source code into a container volume, and spin up a dev container for use.
If you already have VS Code and Docker installed, you can use the button above to get started. This will use VSCode to automatically install the Dev Containers extension if needed, clone the source code into a container volume, and spin up a dev container for use.
Alternatively you can also follow these steps to open this repo in a container using the VS Code Dev Containers extension:
@@ -40,5 +45,5 @@ You can learn more in the [Dev Containers documentation](https://code.visualstud
## Tips and tricks
* If you are working with the same repository folder in a container and Windows, you'll want consistent line endings (otherwise you may see hundreds of changes in the SCM view). The `.gitattributes` file in the root of this repo will disable line ending conversion and should prevent this. See [tips and tricks](https://code.visualstudio.com/docs/devcontainers/tips-and-tricks#_resolving-git-line-ending-issues-in-containers-resulting-in-many-modified-files) for more info.
* If you'd like to review the contents of the image used in this dev container, you can check it out in the [devcontainers/images](https://github.com/devcontainers/images/tree/main/src/python) repo.
- If you are working with the same repository folder in a container and Windows, you'll want consistent line endings (otherwise you may see hundreds of changes in the SCM view). The `.gitattributes` file in the root of this repo will disable line ending conversion and should prevent this. See [tips and tricks](https://code.visualstudio.com/docs/devcontainers/tips-and-tricks#_resolving-git-line-ending-issues-in-containers-resulting-in-many-modified-files) for more info.
- If you'd like to review the contents of the image used in this dev container, you can check it out in the [devcontainers/images](https://github.com/devcontainers/images/tree/main/src/python) repo.
Hi there! Thank you for even being interested in contributing to LangChain.
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether they involve new features, improved infrastructure, better documentation, or bug fixes.
To learn how to contribute to LangChain, please follow the [contribution guide here](https://python.langchain.com/docs/contributing/).
To learn how to contribute to LangChain, please follow the [contribution guide here](https://docs.langchain.com/oss/python/contributing).
description:Please confirm and check all the following options.
options:
- label:I searched existing ideas and did not find a similar one
required:true
- label:I added a very descriptive title
required:true
- label:I've clearly described the feature request and motivation for it
required:true
- type:textarea
id:feature-request
validations:
required:true
attributes:
label:Feature request
description:|
A clear and concise description of the feature proposal. Please provide links to any relevant GitHub repos, papers, or other resources if relevant.
- type:textarea
id:motivation
validations:
required:true
attributes:
label:Motivation
description:|
Please outline the motivation for the proposal. Is your feature request related to a problem? e.g., I'm always frustrated when [...]. If this is related to another GitHub issue, please link here too.
- type:textarea
id:proposal
validations:
required:false
attributes:
label:Proposal (If applicable)
description:|
If you would like to propose a solution, please describe it here.
Please follow these instructions, fill every question, and do every step. 🙏
We're asking for this because answering questions and solving problems in GitHub takes a lot of time --
this is time that we cannot spend on adding new features, fixing bugs, writing documentation or reviewing pull requests.
By asking questions in a structured way (following this) it will be much easier for us to help you.
There's a high chance that by following this process, you'll find the solution on your own, eliminating the need to submit a question and wait for an answer. 😎
As there are many questions submitted every day, we will **DISCARD** and close the incomplete ones.
That will allow us (and others) to focus on helping people like you that follow the whole process. 🤓
Relevant links to check before opening a question to see if your question has already been answered, fixed or
if there's another way to solve your problem:
[LangChain documentation with the integrated search](https://python.langchain.com/docs/get_started/introduction),
description:Please confirm and check all the following options.
options:
- label:I added a very descriptive title to this question.
required:true
- label:I searched the LangChain documentation with the integrated search.
required:true
- label:I used the GitHub search to find a similar question and didn't find it.
required:true
- type:checkboxes
id:help
attributes:
label:Commit to Help
description:|
After submitting this, I commit to one of:
* Read open questions until I find 2 where I can help someone and add a comment to help there.
* I already hit the "watch" button in this repository to receive notifications and I commit to help at least 2 people that ask questions in the future.
* Once my question is answered, I will mark the answer as "accepted".
options:
- label:I commit to help with one of those options 👆
required:true
- type:textarea
id:example
attributes:
label:Example Code
description:|
Please add a self-contained, [minimal, reproducible, example](https://stackoverflow.com/help/minimal-reproducible-example) with your use case.
If a maintainer can copy it, run it, and see it right away, there's a much higher chance that you'll be able to get help.
**Important!**
* Use code tags (e.g., ```python ... ```) to correctly [format your code](https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting).
* INCLUDE the language label (e.g. `python`) after the first three backticks to enable syntax highlighting. (e.g., ```python rather than ```).
* Reduce your code to the minimum required to reproduce the issue if possible. This makes it much easier for others to help you.
* Avoid screenshots when possible, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.
placeholder:|
from langchain_core.runnables import RunnableLambda
def bad_code(inputs) -> int:
raise NotImplementedError('For demo purpose')
chain = RunnableLambda(bad_code)
chain.invoke('Hello!')
render:python
validations:
required:true
- type:textarea
id:description
attributes:
label:Description
description:|
What is the problem, question, or error?
Write a short description explaining what you are doing, what you expect to happen, and what is currently happening.
placeholder:|
* I'm trying to use the `langchain` library to do X.
* I expect to see Y.
* Instead, it does Z.
validations:
required:true
- type:textarea
id:system-info
attributes:
label:System Info
description:|
Please share your system info with us.
"pip freeze | grep langchain"
platform (windows / linux / mac)
python version
OR if you're on a recent version of langchain-core you can paste the output of:
python -m langchain_core.sys_info
placeholder:|
"pip freeze | grep langchain"
platform
python version
Alternatively, if you're on a recent version of langchain-core you can paste the output of:
python -m langchain_core.sys_info
These will only surface LangChain packages, don't forget to include any other relevant
packages you're using (if you're not sure what's relevant, you can paste the entire output of `pip freeze`).
description:Report a bug in LangChain. To report a security issue, please instead use the security option below. For questions, please use the GitHub Discussions.
labels:["02 Bug Report"]
description:Report a bug in LangChain. To report a security issue, please instead use the security option below. For questions, please use the LangChain forum.
labels:["bug"]
type:bug
body:
- type:markdown
attributes:
value:>
Thank you for taking the time to file a bug report.
Use this to report bugs in LangChain.
If you're not certain that your issue is due to a bug in LangChain, please use [GitHub Discussions](https://github.com/langchain-ai/langchain/discussions)
to ask for help with your issue.
value:|
Thank you for taking the time to file a bug report.
Use this to report BUGS in LangChain. For usage questions, feature requests and general design questions, please use the [LangChain Forum](https://forum.langchain.com/).
Relevant links to check before filing a bug report to see if your issue has already been reported, fixed or
if there's another way to solve your problem:
[LangChain documentation with the integrated search](https://python.langchain.com/docs/get_started/introduction),
description:Please confirm and check all the following options.
options:
- label:I added a very descriptive title to this issue.
- label:This is a bug, not a usage question.
required:true
- label:I added a clear and descriptive title that summarizes this issue.
required:true
- label:I used the GitHub search to find a similar question and didn't find it.
required:true
@@ -35,6 +34,10 @@ body:
required:true
- label:The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
required:true
- label:This is not related to the langchain-community package.
required:true
- label:I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
required:true
- label:I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.
required:true
- type:textarea
@@ -45,25 +48,25 @@ body:
label:Example Code
description:|
Please add a self-contained, [minimal, reproducible, example](https://stackoverflow.com/help/minimal-reproducible-example) with your use case.
If a maintainer can copy it, run it, and see it right away, there's a much higher chance that you'll be able to get help.
**Important!**
**Important!**
* Avoid screenshots when possible, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.
* Reduce your code to the minimum required to reproduce the issue if possible. This makes it much easier for others to help you.
* Use code tags (e.g., ```python ... ```) to correctly [format your code](https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting).
* INCLUDE the language label (e.g. `python`) after the first three backticks to enable syntax highlighting. (e.g., ```python rather than ```).
* Reduce your code to the minimum required to reproduce the issue if possible. This makes it much easier for others to help you.
* Avoid screenshots when possible, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.
placeholder:|
The following code:
The following code:
```python
from langchain_core.runnables import RunnableLambda
def bad_code(inputs) -> int:
raise NotImplementedError('For demo purpose')
chain = RunnableLambda(bad_code)
chain.invoke('Hello!')
```
@@ -99,16 +102,18 @@ body:
Please share your system info with us. Do NOT skip this step and please don't trim
the output. Most users don't include enough information here and it makes it harder
for us to help you.
Run the following command in your terminal and paste the output here:
python -m langchain_core.sys_info
`python -m langchain_core.sys_info`
or if you have an existing python interpreter running:
```python
from langchain_core import sys_info
sys_info.print_sys_info()
```
alternatively, put the entire output of `pip freeze` here.
description:Request a new feature or enhancement for LangChain. For questions, please use the LangChain forum.
labels:["feature request"]
type:feature
body:
- type:markdown
attributes:
value:|
Thank you for taking the time to request a new feature.
Use this to request NEW FEATURES or ENHANCEMENTS in LangChain. For bug reports, please use the bug report template. For usage questions and general design questions, please use the [LangChain Forum](https://forum.langchain.com/).
Relevant links to check before filing a feature request to see if your request has already been made or
If you are not a LangChain maintainer or were not asked directly by a maintainer to create an issue, then please start the conversation in a [Question in GitHub Discussions](https://github.com/langchain-ai/langchain/discussions/categories/q-a) instead.
You are a LangChain maintainer if you maintain any of the packages inside of the LangChain repository
or are a regular contributor to LangChain with previous merged pull requests.
If you are not a LangChain maintainer, employee, or were not asked directly by a maintainer to create an issue, then please start the conversation on the [LangChain Forum](https://forum.langchain.com/) instead.
description:Create a task for project management and tracking by LangChain maintainers. If you are not a maintainer, please use other templates or the forum.
labels:["task"]
type:task
body:
- type:markdown
attributes:
value:|
Thanks for creating a task to help organize LangChain development.
This template is for **maintainer tasks** such as project management, development planning, refactoring, documentation updates, and other organizational work.
If you are not a LangChain maintainer or were not asked directly by a maintainer to create a task, then please start the conversation on the [LangChain Forum](https://forum.langchain.com/) instead or use the appropriate bug report or feature request templates on the previous page.
- type:checkboxes
id:maintainer
attributes:
label:Maintainer task
description:Confirm that you are allowed to create a task here.
options:
- label:I am a LangChain maintainer, or was asked directly by a LangChain maintainer to create a task here.
required:true
- type:textarea
id:task-description
attributes:
label:Task Description
description:|
Provide a clear and detailed description of the task.
What needs to be done? Be specific about the scope and requirements.
placeholder:|
This task involves...
The goal is to...
Specific requirements:
- ...
- ...
validations:
required:true
- type:textarea
id:acceptance-criteria
attributes:
label:Acceptance Criteria
description:|
Define the criteria that must be met for this task to be considered complete.
What are the specific deliverables or outcomes expected?
placeholder:|
This task will be complete when:
- [ ] ...
- [ ] ...
- [ ] ...
validations:
required:true
- type:textarea
id:context
attributes:
label:Context and Background
description:|
Provide any relevant context, background information, or links to related issues/PRs.
Why is this task needed? What problem does it solve?
placeholder:|
Background:
- ...
Related issues/PRs:
- #...
Additional context:
- ...
validations:
required:false
- type:textarea
id:dependencies
attributes:
label:Dependencies
description:|
List any dependencies or blockers for this task.
Are there other tasks, issues, or external factors that need to be completed first?
- Where "package" is whichever of langchain, core, etc. is being modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI changes.
- Example: "core: add foobar LLM"
Thank you for contributing to LangChain! Follow these steps to mark your pull request as ready for review. **If any of these steps are not completed, your PR will not be considered for review.**
- [ ]**PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
- *Note:* the `{DESCRIPTION}` must not start with an uppercase letter.
- Once you've written the title, please delete this checklist item; do not include it in the PR.
- [ ]**PR message**: ***Delete this entire checklist*** and replace with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a mention, we'll gladly shout you out!
- **Description:** a description of the change. Include a [closing keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword) if applicable to a relevant issue.
- **Issue:** the issue # it fixes, if applicable (e.g. Fixes #123)
- **Dependencies:** any dependencies required for this change
- [ ]**Add tests and docs**: If you're adding a new integration, you must include:
1. A test for the integration, preferably unit tests that do not rely on network access,
2. An example notebook showing its use. It lives in `docs/docs/integrations` directory.
- [ ]**Add tests and docs**: If you're adding a new integration, please include
1. a test for the integration, preferably unit tests that do not rely on network access,
2. an example notebook showing its use. It lives in `docs/docs/integrations` directory.
- [ ]**Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/
- [ ]**Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. **We will not consider a PR unless these three are passing in CI.** See [contribution guidelines](https://python.langchain.com/docs/contributing/) for more.
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
- Most PRs should not touch more than one package.
- Please do not add dependencies to `pyproject.toml` files (even optional ones) unless they are **required** for unit tests.
- Changes should be backwards compatible.
- Make sure optional dependencies are imported within a function.
@@ -7,5 +7,5 @@ Please see the following guides for migrating LangChain code:
* Migrating from [LangChain 0.0.x Chains](https://python.langchain.com/docs/versions/migrating_chains/)
* Upgrade to [LangGraph Memory](https://python.langchain.com/docs/versions/migrating_memory/)
The [LangChain CLI](https://python.langchain.com/docs/versions/v0_3/#migrate-using-langchain-cli) can help you automatically upgrade your code to use non-deprecated imports.
The [LangChain CLI](https://python.langchain.com/docs/versions/v0_3/#migrate-using-langchain-cli) can help you automatically upgrade your code to use non-deprecated imports.
This will be especially helpful if you're still on either version 0.0.x or 0.1.x of LangChain.
[](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain)
[<img src="https://github.com/codespaces/badge.svg" title="Open in Github Codespace" width="150" height="20">](https://codespaces.new/langchain-ai/langchain)
<img src="https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode&style=flat-square" alt="Open in Dev Containers">
> Looking for the JS/TS library? Check out [LangChain.js](https://github.com/langchain-ai/langchainjs).
LangChain is a framework for building LLM-powered applications. It helps you chain
together interoperable components and third-party integrations to simplify AI
application development — all while future-proofing decisions as the underlying
technology evolves.
LangChain is a framework for building LLM-powered applications. It helps you chain together interoperable components and third-party integrations to simplify AI application development — all while future-proofing decisions as the underlying technology evolves.
```bash
pip install -U langchain
```
To learn more about LangChain, check out
[the docs](https://python.langchain.com/docs/introduction/). If you’re looking for more
advanced customization or agent orchestration, check out
[LangGraph](https://langchain-ai.github.io/langgraph/), our framework for building
controllable agent workflows.
---
**Documentation**: To learn more about LangChain, check out [the docs](https://python.langchain.com/docs/introduction/).
If you're looking for more advanced customization or agent orchestration, check out [LangGraph](https://langchain-ai.github.io/langgraph/), our framework for building controllable agent workflows.
> [!NOTE]
> Looking for the JS/TS library? Check out [LangChain.js](https://github.com/langchain-ai/langchainjs).
## Why use LangChain?
LangChain helps developers build applications powered by LLMs through a standard
interface for models, embeddings, vector stores, and more.
LangChain helps developers build applications powered by LLMs through a standard interface for models, embeddings, vector stores, and more.
Use LangChain for:
- **Real-time data augmentation**. Easily connect LLMs to diverse data sources and
external / internal systems, drawing from LangChain’s vast library of integrations with
model providers, tools, vector stores, retrievers, and more.
- **Model interoperability**. Swap models in and out as your engineering team
experiments to find the best choice for your application’s needs. As the industry
frontier evolves, adapt quickly — LangChain’s abstractions keep you moving without
losing momentum.
- **Real-time data augmentation**. Easily connect LLMs to diverse data sources and external/internal systems, drawing from LangChain’s vast library of integrations with model providers, tools, vector stores, retrievers, and more.
- **Model interoperability**. Swap models in and out as your engineering team experiments to find the best choice for your application’s needs. As the industry frontier evolves, adapt quickly — LangChain’s abstractions keep you moving without losing momentum.
## LangChain’s ecosystem
While the LangChain framework can be used standalone, it also integrates seamlessly
with any LangChain product, giving developers a full suite of tools when building LLM
applications.
While the LangChain framework can be used standalone, it also integrates seamlessly with any LangChain product, giving developers a full suite of tools when building LLM applications.
To improve your LLM application development, pair LangChain with:
- [LangSmith](http://www.langchain.com/langsmith) - Helpful for agent evals and
observability. Debug poor-performing LLM app runs, evaluate agent trajectories, gain
visibility in production, and improve performance over time.
- [LangGraph](https://langchain-ai.github.io/langgraph/) - Build agents that can
reliably handle complex tasks with LangGraph, our low-level agent orchestration
framework. LangGraph offers customizable architecture, long-term memory, and
human-in-the-loop workflows — and is trusted in production by companies like LinkedIn,
- [LangSmith](https://www.langchain.com/langsmith) - Helpful for agent evals and observability. Debug poor-performing LLM app runs, evaluate agent trajectories, gain visibility in production, and improve performance over time.
- [LangGraph](https://langchain-ai.github.io/langgraph/) - Build agents that can reliably handle complex tasks with LangGraph, our low-level agent orchestration framework. LangGraph offers customizable architecture, long-term memory, and human-in-the-loop workflows — and is trusted in production by companies like LinkedIn, Uber, Klarna, and GitLab.
- [LangGraph Platform](https://docs.langchain.com/langgraph-platform) - Deploy and scale agents effortlessly with a purpose-built deployment platform for long-running, stateful workflows. Discover, reuse, configure, and share agents across teams — and iterate quickly with visual prototyping in [LangGraph Studio](https://langchain-ai.github.io/langgraph/concepts/langgraph_studio/).
## Additional resources
- [Tutorials](https://python.langchain.com/docs/tutorials/): Simple walkthroughs with
guided examples on getting started with LangChain.
snippets for topics such as tool calling, RAG use cases, and more.
- [Conceptual Guides](https://python.langchain.com/docs/concepts/): Explanations of key
concepts behind the LangChain framework.
- [API Reference](https://python.langchain.com/api_reference/): Detailed reference on
navigating base packages and integrations for LangChain.
- [Tutorials](https://python.langchain.com/docs/tutorials/): Simple walkthroughs with guided examples on getting started with LangChain.
- [How-to Guides](https://python.langchain.com/docs/how_to/): Quick, actionable code snippets for topics such as tool calling, RAG use cases, and more.
- [Conceptual Guides](https://python.langchain.com/docs/concepts/): Explanations of key concepts behind the LangChain framework.
- [LangChain Forum](https://forum.langchain.com/): Connect with the community and share all of your technical questions, ideas, and feedback.
- [API Reference](https://python.langchain.com/api_reference/): Detailed reference on navigating base packages and integrations for LangChain.
- [Chat LangChain](https://chat.langchain.com/): Ask questions & chat with our documentation.
@@ -4,13 +4,14 @@ LangChain has a large ecosystem of integrations with various external resources
## Best practices
When building such applications developers should remember to follow good security practices:
When building such applications, developers should remember to follow good security practices:
* [**Limit Permissions**](https://en.wikipedia.org/wiki/Principle_of_least_privilege): Scope permissions specifically to the application's need. Granting broad or excessive permissions can introduce significant security vulnerabilities. To avoid such vulnerabilities, consider using read-only credentials, disallowing access to sensitive resources, using sandboxing techniques (such as running inside a container), specifying proxy configurations to control external requests, etc. as appropriate for your application.
* **Anticipate Potential Misuse**: Just as humans can err, so can Large Language Models (LLMs). Always assume that any system access or credentials may be used in any way allowed by the permissions they are assigned. For example, if a pair of database credentials allows deleting data, it’s safest to assume that any LLM able to use those credentials may in fact delete data.
* [**Defense in Depth**](https://en.wikipedia.org/wiki/Defense_in_depth_(computing)): No security technique is perfect. Fine-tuning and good chain design can reduce, but not eliminate, the odds that a Large Language Model (LLM) may make a mistake. It’s best to combine multiple layered security approaches rather than relying on any single layer of defense to ensure security. For example: use both read-only permissions and sandboxing to ensure that LLMs are only able to access data that is explicitly meant for them to use.
* [**Limit Permissions**](https://en.wikipedia.org/wiki/Principle_of_least_privilege): Scope permissions specifically to the application's need. Granting broad or excessive permissions can introduce significant security vulnerabilities. To avoid such vulnerabilities, consider using read-only credentials, disallowing access to sensitive resources, using sandboxing techniques (such as running inside a container), specifying proxy configurations to control external requests, etc., as appropriate for your application.
* **Anticipate Potential Misuse**: Just as humans can err, so can Large Language Models (LLMs). Always assume that any system access or credentials may be used in any way allowed by the permissions they are assigned. For example, if a pair of database credentials allows deleting data, it's safest to assume that any LLM able to use those credentials may in fact delete data.
* [**Defense in Depth**](https://en.wikipedia.org/wiki/Defense_in_depth_(computing)): No security technique is perfect. Fine-tuning and good chain design can reduce, but not eliminate, the odds that a Large Language Model (LLM) may make a mistake. It's best to combine multiple layered security approaches rather than relying on any single layer of defense to ensure security. For example: use both read-only permissions and sandboxing to ensure that LLMs are only able to access data that is explicitly meant for them to use.
Risks of not doing so include, but are not limited to:
* Data corruption or loss.
* Unauthorized access to confidential information.
* Compromised performance or availability of critical resources.
@@ -21,65 +22,58 @@ Example scenarios with mitigation strategies:
* A user may ask an agent with write access to an external API to write malicious data to the API, or delete data from that API. To mitigate, give the agent read-only API keys, or limit it to only use endpoints that are already resistant to such misuse.
* A user may ask an agent with access to a database to drop a table or mutate the schema. To mitigate, scope the credentials to only the tables that the agent needs to access and consider issuing READ-ONLY credentials.
If you're building applications that access external resources like file systems, APIs
or databases, consider speaking with your company's security team to determine how to best
design and secure your applications.
If you're building applications that access external resources like file systems, APIs or databases, consider speaking with your company's security team to determine how to best design and secure your applications.
## Reporting OSS Vulnerabilities
LangChain is partnered with [huntr by Protect AI](https://huntr.com/) to provide
a bounty program for our open source projects.
LangChain is partnered with [huntr by Protect AI](https://huntr.com/) to provide
a bounty program for our open source projects.
Please report security vulnerabilities associated with the LangChain
open source projects by visiting the following link:
Please report security vulnerabilities associated with the LangChain
open source projects at [huntr](https://huntr.com/bounties/disclose/?target=https%3A%2F%2Fgithub.com%2Flangchain-ai%2Flangchain&validSearch=true).
Before reporting a vulnerability, please review:
1) In-Scope Targets and Out-of-Scope Targets below.
2) The [langchain-ai/langchain](https://python.langchain.com/docs/contributing/repo_structure) monorepo structure.
3) The [Best practicies](#best-practices) above to
understand what we consider to be a security vulnerability vs. developer
responsibility.
2) The [langchain-ai/langchain](https://docs.langchain.com/oss/python/contributing/code#supporting-packages) monorepo structure.
3) The [Best Practices](#best-practices) above to understand what we consider to be a security vulnerability vs. developer responsibility.
### In-Scope Targets
The following packages and repositories are eligible for bug bounties:
- langchain-core
- langchain (see exceptions)
- langchain-community (see exceptions)
- langgraph
- langserve
* langchain-core
* langchain (see exceptions)
* langchain-community (see exceptions)
* langgraph
* langserve
### Out of Scope Targets
All out of scope targets defined by huntr as well as:
- **langchain-experimental**: This repository is for experimental code and is not
* **langchain-experimental**: This repository is for experimental code and is not
eligible for bug bounties (see [package warning](https://pypi.org/project/langchain-experimental/)), bug reports to it will be marked as interesting or waste of
time and published with no bounty attached.
- **tools**: Tools in either langchain or langchain-community are not eligible for bug
***tools**: Tools in either langchain or langchain-community are not eligible for bug
bounties. This includes the following directories
- libs/langchain/langchain/tools
- libs/community/langchain_community/tools
- Please review the [best practices](#best-practices)
* libs/langchain/langchain/tools
* libs/community/langchain_community/tools
* Please review the [Best Practices](#best-practices)
for more details, but generally tools interact with the real world. Developers are
expected to understand the security implications of their code and are responsible
for the security of their tools.
- Code documented with security notices. This will be decided done on a case by
case basis, but likely will not be eligible for a bounty as the code is already
* Code documented with security notices. This will be decided on a case-by-case basis, but likely will not be eligible for a bounty as the code is already
documented with guidelines for developers that should be followed for making their
application secure.
- Any LangSmith related repositories or APIs (see [Reporting LangSmith Vulnerabilities](#reporting-langsmith-vulnerabilities)).
* Any LangSmith related repositories or APIs (see [Reporting LangSmith Vulnerabilities](#reporting-langsmith-vulnerabilities)).
## Reporting LangSmith Vulnerabilities
Please report security vulnerabilities associated with LangSmith by email to `security@langchain.dev`.
[rag-locally-on-intel-cpu.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/rag-locally-on-intel-cpu.ipynb) | Perform Retrieval-Augmented-Generation (RAG) on locally downloaded open-source models using langchain and open source tools and execute it on Intel Xeon CPU. We showed an example of how to apply RAG on Llama 2 model and enable it to answer the queries related to Intel Q1 2024 earnings release.
[visual_RAG_vdms.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/visual_RAG_vdms.ipynb) | Performs Visual Retrieval-Augmented-Generation (RAG) using videos and scene descriptions generated by open source models.
[contextual_rag.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/contextual_rag.ipynb) | Performs contextual retrieval-augmented generation (RAG) prepending chunk-specific explanatory context to each chunk before embedding.
[rag-agents-locally-on-intel-cpu.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/local_rag_agents_intel_cpu.ipynb) | Build a RAG agent locally with open source models that routes questions through one of two paths to find answers. The agent generates answers based on documents retrieved from either the vector database or retrieved from web search. If the vector database lacks relevant information, the agent opts for web search. Open-source models for LLM and embeddings are used locally on an Intel Xeon CPU to execute this pipeline.
[rag-agents-locally-on-intel-cpu.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/local_rag_agents_intel_cpu.ipynb) | Build a RAG agent locally with open source models that routes questions through one of two paths to find answers. The agent generates answers based on documents retrieved from either the vector database or retrieved from web search. If the vector database lacks relevant information, the agent opts for web search. Open-source models for LLM and embeddings are used locally on an Intel Xeon CPU to execute this pipeline.
[rag_mlflow_tracking_evaluation.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/rag_mlflow_tracking_evaluation.ipynb) | Guide on how to create a RAG pipeline and track + evaluate it with MLflow.
"source": "##### You can embed text in the same VectorDB space as images, and retrieve text and images as well based on input text or image.\n##### Following link demonstrates that.\n<a> https://python.langchain.com/v0.2/docs/integrations/text_embedding/open_clip/ </a>"
},
{
"cell_type": "markdown",
@@ -389,7 +385,7 @@
" ax = axs[idx]\n",
" ax.imshow(img)\n",
" # Assuming similarity is not available in the new data, removed sim_score\n",
"This guide demonstrates how Oracle AI Vector Search can be used with Langchain to serve an end-to-end RAG pipeline. This guide goes through examples of:\n",
"This guide demonstrates how Oracle AI Vector Search can be used with LangChain to serve an end-to-end RAG pipeline. This guide goes through examples of:\n",
"\n",
" * Loading the documents from various sources using OracleDocLoader\n",
" * Summarizing them within/outside the database using OracleSummary\n",
@@ -47,7 +47,21 @@
"source": [
"### Prerequisites\n",
"\n",
"Please install Oracle Python Client driver to use Langchain with Oracle AI Vector Search. "
"You'll need to install `langchain-oracledb` with `python -m pip install -U langchain-oracledb` to use this integration.\n",
"\n",
"The `python-oracledb` driver is installed automatically as a dependency of langchain-oracledb.\n",
"First, connect as a privileged user to create a demo user with all the required privileges. Change the credentials for your environment. Also set the DEMO_PY_DIR path to a directory on the database host where your model file is located:"
]
},
{
@@ -56,65 +70,30 @@
"metadata": {},
"outputs": [],
"source": [
"# pip install oracledb"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create Demo User\n",
"First, create a demo user with all the required privileges. "
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Connection successful!\n",
"User setup done!\n"
]
}
],
"source": [
"import sys\n",
"\n",
"import oracledb\n",
"\n",
"# Update with your username, password, hostname, and service_name\n",
"username = \"\"\n",
"# Please update with your SYSTEM (or privileged user) username, password, and database connection string\n",
" execute immediate 'drop user if exists testuser cascade';\n",
"\n",
" -- Create user and grant privileges\n",
" execute immediate 'create user testuser identified by testuser';\n",
" execute immediate 'grant connect, unlimited tablespace, create credential, create procedure, create any index to testuser';\n",
" execute immediate 'create or replace directory DEMO_PY_DIR as ''/scratch/hroy/view_storage/hroy_devstorage/demo/orachain''';\n",
" execute immediate 'create or replace directory DEMO_PY_DIR as ''/home/yourname/demo/orachain''';\n",
" execute immediate 'grant read, write on directory DEMO_PY_DIR to public';\n",
" execute immediate 'grant create mining model to testuser';\n",
"\n",
"\n",
" -- Network access\n",
" begin\n",
" DBMS_NETWORK_ACL_ADMIN.APPEND_HOST_ACE(\n",
@@ -127,15 +106,7 @@
" end;\n",
" \"\"\"\n",
" )\n",
" print(\"User setup done!\")\n",
" except Exception as e:\n",
" print(f\"User setup failed with error: {e}\")\n",
" finally:\n",
" cursor.close()\n",
" conn.close()\n",
"except Exception as e:\n",
" print(f\"Connection failed with error: {e}\")\n",
" sys.exit(1)"
" print(\"User setup done!\")"
]
},
{
@@ -143,13 +114,13 @@
"metadata": {},
"source": [
"## Process Documents using Oracle AI\n",
"Consider the following scenario: users possess documents stored either in an Oracle Database or a file system and intend to utilize this data with Oracle AI Vector Search powered by Langchain.\n",
"Consider the following scenario: users possess documents stored either in an Oracle Database or a file system and intend to utilize this data with Oracle AI Vector Search powered by LangChain.\n",
"\n",
"To prepare the documents for analysis, a comprehensive preprocessing workflow is necessary. Initially, the documents must be retrieved, summarized (if required), and chunked as needed. Subsequent steps involve generating embeddings for these chunks and integrating them into the Oracle AI Vector Store. Users can then conduct semantic searches on this data.\n",
"\n",
"The Oracle AI Vector Search Langchain library encompasses a suite of document processing tools that facilitate document loading, chunking, summary generation, and embedding creation.\n",
"The Oracle AI Vector Search LangChain library encompasses a suite of document processing tools that facilitate document loading, chunking, summary generation, and embedding creation.\n",
"\n",
"In the sections that follow, we will detail the utilization of Oracle AI Langchain APIs to effectively implement each of these processes."
"In the sections that follow, we will detail the utilization of Oracle AI LangChain APIs to effectively implement each of these processes."
]
},
{
@@ -157,38 +128,24 @@
"metadata": {},
"source": [
"### Connect to Demo User\n",
"The following sample code will show how to connect to Oracle Database. By default, python-oracledb runs in a ‘Thin’ mode which connects directly to Oracle Database. This mode does not need Oracle Client libraries. However, some additional functionality is available when python-oracledb uses them. Python-oracledb is said to be in ‘Thick’ mode when Oracle Client libraries are used. Both modes have comprehensive functionality supporting the Python Database API v2.0 Specification. See the following [guide](https://python-oracledb.readthedocs.io/en/latest/user_guide/appendix_a.html#featuresummary) that talks about features supported in each mode. You might want to switch to thick-mode if you are unable to use thin-mode."
"The following sample code shows how to connect to Oracle Database using the python-oracledb driver. By default, python-oracledb runs in a ‘Thin’ mode which connects directly to Oracle Database. This mode does not need Oracle Client libraries. However, some additional functionality is available when python-oracledb uses them. Python-oracledb is said to be in ‘Thick’ mode when Oracle Client libraries are used. Both modes have comprehensive functionality supporting the Python Database API v2.0 Specification. See the following [guide](https://python-oracledb.readthedocs.io/en/latest/user_guide/appendix_a.html#featuresummary) that talks about features supported in each mode. You can switch to Thickmode if you are unable to use Thinmode."
]
},
{
"cell_type": "code",
"execution_count": 45,
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Connection successful!\n"
]
}
],
"outputs": [],
"source": [
"import sys\n",
"\n",
"import oracledb\n",
"\n",
"# please update with your username, password, hostname and service_name\n",
"username = \"\"\n",
"# please update with your username, password, and database connection string\n",
"Oracle accommodates a variety of embedding providers, enabling users to choose between proprietary database solutions and third-party services such as OCIGENAI and HuggingFace. This selection dictates the methodology for generating and managing embeddings.\n",
"Oracle accommodates a variety of embedding providers, enabling you to choose between proprietary database solutions and third-party services such as Oracle Generative AI Service and HuggingFace. This selection dictates the methodology for generating and managing embeddings.\n",
"\n",
"***Important*** : Should users opt for the database option, they must upload an ONNX model into the Oracle Database. Conversely, if a third-party provider is selected for embedding generation, uploading an ONNX model to Oracle Database is not required.\n",
"***Important*** : Should you opt for the database option, you must upload an ONNX model into the Oracle Database. Conversely, if a third-party provider is selected for embedding generation, uploading an ONNX model to Oracle Database is not required.\n",
"\n",
"A significant advantage of utilizing an ONNX model directly within Oracle is the enhanced security and performance it offers by eliminating the need to transmit data to external parties. Additionally, this method avoids the latency typically associated with network or REST API calls.\n",
"A significant advantage of utilizing an ONNX model directly within Oracle Database is the enhanced security and performance it offers by eliminating the need to transmit data to external parties. Additionally, this method avoids the latency typically associated with network or REST API calls.\n",
"\n",
"Below is the example code to upload an ONNX model into Oracle Database:"
"Users have the flexibility to load documents from either the Oracle Database, a file system, or both, by appropriately configuring the loader parameters. For comprehensive details on these parameters, please consult the [Oracle AI Vector Search Guide](https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/dbms_vector_chain1.html#GUID-73397E89-92FB-48ED-94BB-1AD960C4EA1F).\n",
"You have the flexibility to load documents from either the Oracle Database, a file system, or both, by appropriately configuring the loader parameters. For comprehensive details on these parameters, please consult the [Oracle AI Vector Search Guide](https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/dbms_vector_chain1.html#GUID-73397E89-92FB-48ED-94BB-1AD960C4EA1F).\n",
"\n",
"A significant advantage of utilizing OracleDocLoader is its capability to process over 150 distinct file formats, eliminating the need for multiple loaders for different document types. For a complete list of the supported formats, please refer to the [Oracle Text Supported Document Formats](https://docs.oracle.com/en/database/oracle/oracle-database/23/ccref/oracle-text-supported-document-formats.html).\n",
"\n",
"Below is a sample code snippet that demonstrates how to use OracleDocLoader"
"Below is a sample code snippet that demonstrates how to use OracleDocLoader:"
"Now that the user loaded the documents, they may want to generate a summary for each document. The Oracle AI Vector Search Langchain library offers a suite of APIs designed for document summarization. It supports multiple summarization providers such as Database, OCIGENAI, HuggingFace, among others, allowing users to select the provider that best meets their needs. To utilize these capabilities, users must configure the summary parameters as specified. For detailed information on these parameters, please consult the [Oracle AI Vector Search Guide book](https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/dbms_vector_chain1.html#GUID-EC9DDB58-6A15-4B36-BA66-ECBA20D2CE57)."
"Now that you have loaded the documents, you may want to generate a summary for each document. The Oracle AI Vector Search LangChain library offers a suite of APIs designed for document summarization. It supports multiple summarization providers such as Database, Oracle Generative AI Service, HuggingFace, among others, allowing you to select the provider that best meets their needs. To utilize these capabilities, you must configure the summary parameters as specified. For detailed information on these parameters, please consult the [Oracle AI Vector Search Guide book](https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/dbms_vector_chain1.html#GUID-EC9DDB58-6A15-4B36-BA66-ECBA20D2CE57)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"***Note:*** The users may need to set proxy if they want to use some 3rd party summary generation providers other than Oracle's in-house and default provider: 'database'. If you don't have proxy, please remove the proxy parameter when you instantiate the OracleSummary."
"***Note:*** You may need to set proxy if you want to use some 3rd party summary generation providers other than Oracle's in-house and default provider: 'database'. If you don't have proxy, please remove the proxy parameter when you instantiate the OracleSummary."
]
},
{
"cell_type": "code",
"execution_count": 22,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# proxy to be used when we instantiate summary and embedder object\n",
"# proxy to be used when we instantiate summary and embedder objects\n",
"proxy = \"\""
]
},
@@ -433,24 +347,16 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The following sample code will show how to generate summary:"
"The following sample code shows how to generate a summary:"
"Now that the documents are chunked as per requirements, the users may want to generate embeddings for these chunks. Oracle AI Vector Search provides multiple methods for generating embeddings, utilizing either locally hosted ONNX models or third-party APIs. For comprehensive instructions on configuring these alternatives, please refer to the [Oracle AI Vector Search Guide](https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/dbms_vector_chain1.html#GUID-C6439E94-4E86-4ECD-954E-4B73D53579DE)."
"Now that the documents are chunked as per requirements, you may want to generate embeddings for these chunks. Oracle AI Vector Search provides multiple methods for generating embeddings, utilizing either locally hosted ONNX models or third-party APIs. For comprehensive instructions on configuring these alternatives, please refer to the [Oracle AI Vector Search Guide](https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/dbms_vector_chain1.html#GUID-C6439E94-4E86-4ECD-954E-4B73D53579DE)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"***Note:*** Users may need to configure a proxy to utilize third-party embedding generation providers, excluding the 'database' provider that utilizes an ONNX model."
"***Note:*** You may need to configure a proxy to utilize third-party embedding generation providers, excluding the 'database' provider that utilizes an ONNX model."
]
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
@@ -547,24 +445,16 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The following sample code will show how to generate embeddings:"
"The following sample code shows how to generate embeddings:"
"Now that you know how to use Oracle AI Langchain library APIs individually to process the documents, let us show how to integrate with Oracle AI Vector Store to facilitate the semantic searches."
"Now that you know how to use Oracle AI LangChain library APIs individually to process the documents, let us show how to integrate with Oracle AI Vector Store to facilitate the semantic searches."
"This example demonstrates the creation of a default HNSW index on embeddings within the 'oravs' table. Users may adjust various parameters according to their specific needs. For detailed information on these parameters, please consult the [Oracle AI Vector Search Guide book](https://docs.oracle.com/en/database/oracle/oracle-database/23/vecse/manage-different-categories-vector-indexes.html).\n",
"This example demonstrates the creation of a default HNSW index on embeddings within the 'oravs' table. You may adjust various parameters according to your specific needs. For detailed information on these parameters, please consult the [Oracle AI Vector Search Guide book](https://docs.oracle.com/en/database/oracle/oracle-database/23/vecse/manage-different-categories-vector-indexes.html).\n",
"\n",
"Additionally, various types of vector indices can be created to meet diverse requirements. More details can be found in our [comprehensive guide](https://python.langchain.com/v0.1/docs/integrations/vectorstores/oracle/).\n"
]
@@ -805,44 +667,31 @@
"## Perform Semantic Search\n",
"All set!\n",
"\n",
"We have successfully processed the documents and stored them in the vector store, followed by the creation of an index to enhance query performance. We are now prepared to proceed with semantic searches.\n",
"You have successfully processed the documents and stored them in the vector store, followed by the creation of an index to enhance query performance. You are now prepared to proceed with semantic searches.\n",
"\n",
"Below is the sample code for this process:"
]
},
{
"cell_type": "code",
"execution_count": 58,
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[Document(page_content='The database stores LOBs differently from other data types. Creating a LOB column implicitly creates a LOB segment and a LOB index. The tablespace containing the LOB segment and LOB index, which are always stored together, may be different from the tablespace containing the table. Sometimes the database can store small amounts of LOB data in the table itself rather than in a separate LOB segment.', metadata={'_oid': '662f2f257677f3c2311a8ff999fd34e5', '_rowid': 'AAAR/xAAEAAAAAnAAC', 'id': '662f2f257677f3c2311a8ff999fd34e5$3$1', 'document_id': '3', 'document_summary': 'Sometimes the database can store small amounts of LOB data in the table itself rather than in a separate LOB segment.\\n\\n'})]\n",
"[]\n",
"[(Document(page_content='The database stores LOBs differently from other data types. Creating a LOB column implicitly creates a LOB segment and a LOB index. The tablespace containing the LOB segment and LOB index, which are always stored together, may be different from the tablespace containing the table. Sometimes the database can store small amounts of LOB data in the table itself rather than in a separate LOB segment.', metadata={'_oid': '662f2f257677f3c2311a8ff999fd34e5', '_rowid': 'AAAR/xAAEAAAAAnAAC', 'id': '662f2f257677f3c2311a8ff999fd34e5$3$1', 'document_id': '3', 'document_summary': 'Sometimes the database can store small amounts of LOB data in the table itself rather than in a separate LOB segment.\\n\\n'}), 0.055675752460956573)]\n",
"[]\n",
"[Document(page_content='If the answer to any preceding questions is yes, then the database stops the search and allocates space from the specified tablespace; otherwise, space is allocated from the database default shared temporary tablespace.', metadata={'_oid': '662f2f253acf96b33b430b88699490a2', '_rowid': 'AAAR/xAAEAAAAAnAAA', 'id': '662f2f253acf96b33b430b88699490a2$1$1', 'document_id': '1', 'document_summary': 'If the answer to any preceding questions is yes, then the database stops the search and allocates space from the specified tablespace; otherwise, space is allocated from the database default shared temporary tablespace.\\n\\n'})]\n",
"[Document(page_content='If the answer to any preceding questions is yes, then the database stops the search and allocates space from the specified tablespace; otherwise, space is allocated from the database default shared temporary tablespace.', metadata={'_oid': '662f2f253acf96b33b430b88699490a2', '_rowid': 'AAAR/xAAEAAAAAnAAA', 'id': '662f2f253acf96b33b430b88699490a2$1$1', 'document_id': '1', 'document_summary': 'If the answer to any preceding questions is yes, then the database stops the search and allocates space from the specified tablespace; otherwise, space is allocated from the database default shared temporary tablespace.\\n\\n'})]\n"
"# RAG Pipeline with MLflow Tracking, Tracing & Evaluation\n",
"\n",
"This notebook demonstrates how to build a complete Retrieval-Augmented Generation (RAG) pipeline using LangChain and integrate it with MLflow for experiment tracking, tracing, and evaluation.\n",
"\n",
"\n",
"- **RAG Pipeline Construction**: Build a complete RAG system using LangChain components\n",
"- **MLflow Integration**: Track experiments, parameters, and artifacts\n",
" \"system_prompt\": \"You are a helpful assistant. Use the following context to answer the question. Use three sentences maximum and keep the answer concise.\",\n",
" \"llm\": \"gpt-5-nano\",\n",
" \"temperature\": 0,\n",
"}"
]
},
{
"cell_type": "markdown",
"id": "8a2985f1",
"metadata": {},
"source": [
"#### ArXiv Dcoument Loading and Processing"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "1f32aa36",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'Published': '2023-08-02', 'Title': 'Attention Is All You Need', 'Authors': 'Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin', 'Summary': 'The dominant sequence transduction models are based on complex recurrent or\\nconvolutional neural networks in an encoder-decoder configuration. The best\\nperforming models also connect the encoder and decoder through an attention\\nmechanism. We propose a new simple network architecture, the Transformer, based\\nsolely on attention mechanisms, dispensing with recurrence and convolutions\\nentirely. Experiments on two machine translation tasks show these models to be\\nsuperior in quality while being more parallelizable and requiring significantly\\nless time to train. Our model achieves 28.4 BLEU on the WMT 2014\\nEnglish-to-German translation task, improving over the existing best results,\\nincluding ensembles by over 2 BLEU. On the WMT 2014 English-to-French\\ntranslation task, our model establishes a new single-model state-of-the-art\\nBLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction\\nof the training costs of the best models from the literature. We show that the\\nTransformer generalizes well to other tasks by applying it successfully to\\nEnglish constituency parsing both with large and limited training data.'}\n"
]
}
],
"source": [
"# Load documents from ArXiv\n",
"loader = ArxivLoader(\n",
" query=\"1706.03762\",\n",
" load_max_docs=1,\n",
")\n",
"docs = loader.load()\n",
"print(docs[0].metadata)\n",
"\n",
"# Split documents into chunks\n",
"splitter = RecursiveCharacterTextSplitter(\n",
" chunk_size=CONFIG[\"chunk_size\"],\n",
" chunk_overlap=CONFIG[\"chunk_overlap\"],\n",
")\n",
"chunks = splitter.split_documents(docs)\n",
"\n",
"\n",
"# Join chunks into a single string\n",
"def join_chunks(chunks):\n",
" return \"\\n\\n\".join([chunk.page_content for chunk in chunks])"
"Create a prediction function decorated with `@mlflow.trace` to automatically log:\n",
"- Input queries\n",
"- Retrieved documents\n",
"- Generated responses\n",
"- Execution time\n",
"- Chain intermediate steps"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "7b45fc04",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Question: What is the main idea of the paper?\n",
"Response: The main idea is to replace recurrent/convolutional sequence models with a pure attention-based architecture called the Transformer. It uses self-attention to model dependencies between all positions in the input and output, enabling full parallelization and better handling of long-range relations. This approach achieves strong results on translation and can extend to other modalities.\n"
]
}
],
"source": [
"@mlflow.trace\n",
"def predict_fn(question: str) -> str:\n",
" return rag_chain.invoke(question)\n",
"\n",
"\n",
"# Test the prediction function\n",
"sample_question = \"What is the main idea of the paper?\"\n",
"response = predict_fn(sample_question)\n",
"print(f\"Question: {sample_question}\")\n",
"print(f\"Response: {response}\")"
]
},
{
"cell_type": "markdown",
"id": "421469de",
"metadata": {},
"source": [
"#### Evaluation Dataset and Scoring\n",
"\n",
"Define an evaluation dataset and run systematic evaluation using [MLflow's built-in scorers](https://mlflow.org/docs/latest/genai/eval-monitor/scorers/llm-judge/predefined/#available-scorers):\n",
"\n",
"<u>Evaluation Components:</u>\n",
"- **Dataset**: Questions with expected concepts and facts\n",
"- **Scorers**: \n",
" - `RelevanceToQuery`: Measures how relevant the response is to the question\n",
" - `Correctness`: Evaluates factual accuracy of the response\n",
" - `ExpectationsGuidelines`: Checks that output matches expectation guidelines\n",
"\n",
"<u>Best Practices:</u>\n",
"- Create diverse test cases covering different query types\n",
"- Include expected concepts to guide evaluation\n",
"- Use multiple scoring metrics for comprehensive assessment"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "5c1dc4f2",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2025/08/23 20:14:39 INFO mlflow.models.evaluation.utils.trace: Auto tracing is temporarily enabled during the model evaluation for computing some metrics and debugging. To disable tracing, call `mlflow.autolog(disable=True)`.\n",
"2025/08/23 20:14:39 INFO mlflow.genai.utils.data_validation: Testing model prediction with the first sample in the dataset.\n"
" AIMessage(content='The result of \\\\(3 + 5^{2.743}\\\\) is approximately 300.04, and the result of \\\\(17.24 - 918.1241\\\\) is approximately -900.88.', response_metadata={'token_usage': {'completion_tokens': 44, 'prompt_tokens': 251, 'total_tokens': 295}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': 'fp_b28b39ffa8', 'finish_reason': 'stop', 'logprobs': None}, id='run-d1161669-ed09-4b18-94bd-6d8530df5aa8-0')]}"
"{'messages': [HumanMessage(content=\"what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241\", additional_kwargs={}, response_metadata={}),\n",
"{'messages': [HumanMessage(content=\"what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241\", additional_kwargs={}, response_metadata={}),\n",
" AIMessage(content=[{'text': \"I'll solve these calculations for you.\\n\\nFor the first part, I need to calculate 3 plus 5 raised to the power of 2.743.\\n\\nLet me break this down:\\n1) First, I'll calculate 5 raised to the power of 2.743\\n2) Then add 3 to the result\", 'type': 'text'}, {'id': 'toolu_01L1mXysBQtpPLQ2AZTaCGmE', 'input': {'x': 5, 'y': 2.743}, 'name': 'exponentiate', 'type': 'tool_use'}], additional_kwargs={}, response_metadata={'id': 'msg_01HCbDmuzdg9ATMyKbnecbEE', 'model': 'claude-3-7-sonnet-20250219', 'stop_reason': 'tool_use', 'stop_sequence': None, 'usage': {'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'input_tokens': 563, 'output_tokens': 146, 'server_tool_use': None, 'service_tier': 'standard'}, 'model_name': 'claude-3-7-sonnet-20250219'}, id='run--9f6469fb-bcbb-4c1c-9eec-79f6979c38e6-0', tool_calls=[{'name': 'exponentiate', 'args': {'x': 5, 'y': 2.743}, 'id': 'toolu_01L1mXysBQtpPLQ2AZTaCGmE', 'type': 'tool_call'}], usage_metadata={'input_tokens': 563, 'output_tokens': 146, 'total_tokens': 709, 'input_token_details': {'cache_read': 0, 'cache_creation': 0}}),\n",
For more information on contributing to our documentation, see the [Documentation Contributing Guide](https://python.langchain.com/docs/contributing/how_to/documentation)
For more information on contributing to our documentation, see the [Documentation Contributing Guide](https://python.langchain.com/docs/contributing/how_to/documentation).
## Structure
The primary documentation is located in the `docs/` directory. This directory contains
both the source files for the main documentation as well as the API reference doc
build process.
### API Reference
API reference documentation is located in `docs/api_reference/` and is generated from
the codebase using Sphinx.
The API reference have additional build steps that differ from the main documentation.
#### Deployment Process
Currently, the build process roughly follows these steps:
1. Using the `api_doc_build.yml` GitHub workflow, the API reference docs are
[built](#build-technical-details) and copied to the `langchain-api-docs-html`
repository. This workflow is triggered either (1) on a cron routine interval or (2)
triggered manually.
In short, the workflow extracts all `langchain-ai`-org-owned repos defined in
`langchain/libs/packages.yml`, clones them locally (in the workflow runner's file
system), and then builds the API reference RST files (using `create_api_rst.py`).
Following post-processing, the HTML files are pushed to the
`langchain-api-docs-html` repository.
2. After the HTML files are in the `langchain-api-docs-html` repository, they are **not**
automatically published to the [live docs site](https://python.langchain.com/api_reference/).
The docs site is served by Vercel. The Vercel deployment process copies the HTML
files from the `langchain-api-docs-html` repository and deploys them to the live
site. Deployments are triggered on each new commit pushed to `master`.
#### Build Technical Details
The build process creates a virtual monorepo by syncing multiple repositories, then generates comprehensive API documentation:
1.**Repository Sync Phase:**
-`.github/scripts/prep_api_docs_build.py` - Clones external partner repos and organizes them into the `libs/partners/` structure to create a virtual monorepo for documentation building
2.**RST Generation Phase:**
-`docs/api_reference/create_api_rst.py` - Main script that **generates RST files** from Python source code
- Scans `libs/` directories and extracts classes/functions from each module (using `inspect`)
- Creates `.rst` files using specialized templates for different object types
- Templates in `docs/api_reference/templates/` (`pydantic.rst`, `runnable_pydantic.rst`, etc.)
3.**HTML Build Phase:**
- Sphinx-based, uses `sphinx.ext.autodoc` (auto-extracts docstrings from the codebase)
-`docs/api_reference/conf.py` (sphinx config) configures `autodoc` and other extensions
-`sphinx-build` processes the generated `.rst` files into HTML using autodoc
-`docs/api_reference/scripts/custom_formatter.py` - Post-processes the generated HTML
- Copies `reference.html` to `index.html` to create the default landing page (artifact? might not need to do this - just put everyhing in index directly?)
4.**Deployment:**
-`.github/workflows/api_doc_build.yml` - Workflow responsible for orchestrating the entire build and deployment process
- Built HTML files are committed and pushed to the `langchain-api-docs-html` repository
#### Local Build
For local development and testing of API documentation, use the Makefile targets in the repository root:
```bash
# Full build
make api_docs_build
```
Like the CI process, this target:
- Installs the CLI package in editable mode
- Generates RST files for all packages using `create_api_rst.py`
- Builds HTML documentation with Sphinx
- Post-processes the HTML with `custom_formatter.py`
- Opens the built documentation (`reference.html`) in your browser
**Quick Preview:**
```bash
make api_docs_quick_preview API_PKG=openai
```
- Generates RST files for only the specified package (default: `text-splitters`)
- Builds and post-processes HTML documentation
- Opens the preview in your browser
Both targets automatically clean previous builds and handle the complete build pipeline locally, mirroring the CI process but for faster iteration during development.
#### Documentation Standards
**Docstring Format:**
The API reference uses **Google-style docstrings** with reStructuredText markup. Sphinx processes these through the `sphinx.ext.napoleon` extension to generate documentation.
File diff suppressed because one or more lines are too long
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.