- `retry_model_request` hook lets a middleware decide to retry a failed
model request, with the ability to modify as much or as little of the
request as it wants before doing so
- `ModelFallbackMiddleware` tries each fallback model in order until one
succeeds or the fallback list is exhausted (see the sketch below)
Co-authored-by: Sydney Runkle <54324534+sydney-runkle@users.noreply.github.com>
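A minimal sketch of wiring the fallback middleware into an agent; the positional-fallback constructor and the model names here are assumptions, not confirmed by this changelog:

```python
from langchain.agents import create_agent
from langchain.agents.middleware import ModelFallbackMiddleware

# Fallbacks are tried in order once the primary model's request fails.
fallback = ModelFallbackMiddleware(
    "openai:gpt-4o-mini",  # first fallback (illustrative)
    "anthropic:claude-sonnet-4-5",  # second fallback (illustrative)
)
agent = create_agent(model="openai:gpt-4o", middleware=[fallback])
```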
* Add LLM-based tool selection middleware.
* Note that we might want some form of caching for when the agent is
inside an active tool-calling loop, since the tool selection isn't
expected to change during that time.
API:
````python
class LLMToolSelectorMiddleware(AgentMiddleware):
    """Uses an LLM to select relevant tools before calling the main model.

    When an agent has many tools available, this middleware filters them down
    to only the most relevant ones for the user's query. This reduces token
    usage and helps the main model focus on the right tools.

    Examples:
        Limit to 3 tools:

        ```python
        from langchain.agents.middleware import LLMToolSelectorMiddleware

        middleware = LLMToolSelectorMiddleware(max_tools=3)

        agent = create_agent(
            model="openai:gpt-4o",
            tools=[tool1, tool2, tool3, tool4, tool5],
            middleware=[middleware],
        )
        ```

        Use a smaller model for selection:

        ```python
        middleware = LLMToolSelectorMiddleware(model="openai:gpt-4o-mini", max_tools=2)
        ```
    """

    def __init__(
        self,
        *,
        model: str | BaseChatModel | None = None,
        system_prompt: str = DEFAULT_SYSTEM_PROMPT,
        max_tools: int | None = None,
        always_include: list[str] | None = None,
    ) -> None:
        """Initialize the tool selector.

        Args:
            model: Model to use for selection. If not provided, uses the
                agent's main model. Can be a model identifier string or a
                BaseChatModel instance.
            system_prompt: Instructions for the selection model.
            max_tools: Maximum number of tools to select. If the model selects
                more, only the first max_tools will be used. No limit if not
                specified.
            always_include: Tool names to always include regardless of
                selection. These do not count against the max_tools limit.
        """
````
```python
"""Test script for LLM tool selection middleware."""

from langchain.agents import create_agent
from langchain.agents.middleware import LLMToolSelectorMiddleware
from langchain_core.tools import tool


@tool
def get_weather(location: str) -> str:
    """Get current weather for a location."""
    return f"Weather in {location}: 72°F, sunny"


@tool
def search_web(query: str) -> str:
    """Search the web for information."""
    return f"Search results for: {query}"


@tool
def calculate(expression: str) -> str:
    """Perform mathematical calculations."""
    return f"Result of {expression}: 42"


@tool
def send_email(to: str, subject: str) -> str:
    """Send an email to someone."""
    return f"Email sent to {to} with subject: {subject}"


@tool
def get_stock_price(symbol: str) -> str:
    """Get current stock price for a symbol."""
    return f"Stock price for {symbol}: $150.25"


@tool
def translate_text(text: str, target_language: str) -> str:
    """Translate text to another language."""
    return f"Translated '{text}' to {target_language}"


@tool
def set_reminder(task: str, time: str) -> str:
    """Set a reminder for a task."""
    return f"Reminder set: {task} at {time}"


@tool
def get_news(topic: str) -> str:
    """Get latest news about a topic."""
    return f"Latest news about {topic}"


@tool
def book_flight(destination: str, date: str) -> str:
    """Book a flight to a destination."""
    return f"Flight booked to {destination} on {date}"


@tool
def get_restaurant_recommendations(city: str, cuisine: str) -> str:
    """Get restaurant recommendations."""
    return f"Top {cuisine} restaurants in {city}"


# Create agent with tool selection middleware
middleware = LLMToolSelectorMiddleware(
    model="openai:gpt-4o-mini",
    max_tools=3,
)

agent = create_agent(
    model="openai:gpt-4o",
    tools=[
        get_weather,
        search_web,
        calculate,
        send_email,
        get_stock_price,
        translate_text,
        set_reminder,
        get_news,
        book_flight,
        get_restaurant_recommendations,
    ],
    middleware=[middleware],
)

# Test with a query that should select specific tools
response = agent.invoke(
    {"messages": [{"role": "user", "content": "I need to find restaurants"}]}
)
print(response)
```
* Add server-side tools to modifyModelRequest (represented as dicts)
* Update some of the logic around which tools are bound to ToolNode
* We still have a constraint on changing the response format dynamically
when using the tool strategy: structured_output_tools are being used in
some of the edges. The code now raises an exception explaining that
this is a limitation of the implementation. (We can add support later.)
- supports 5 well-known PII types (email, credit_card, ip, mac_address,
url)
- 4 handling strategies (block, redact, mask, hash)
- supports custom PII types with detector functions or regexes
- the built-in types were chosen because they are common and their
detection can be reliably implemented with the stdlib (see the sketch
below)
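A minimal usage sketch, assuming one middleware instance per PII type with a `strategy` keyword and an optional regex `detector` for custom types (the `employee_id` type and its pattern are illustrative):

```python
from langchain.agents import create_agent
from langchain.agents.middleware import PIIMiddleware

agent = create_agent(
    model="openai:gpt-4o",
    middleware=[
        # Built-in type handled with one of the four strategies.
        PIIMiddleware("email", strategy="redact"),
        # Custom type detected with a regex (name and pattern assumed).
        PIIMiddleware("employee_id", detector=r"EMP-\d{6}", strategy="hash"),
    ],
)
```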
* Preserve Auto type for the response format. cc @sydney-runkle Creating
an extra type was the nicest devx I could find for this (it makes it
easy to do `isinstance(thingy, AutoStrategy)`).
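A sketch of the dispatch this type enables; `response_format` as the field on the request is an assumption about the PR's internals:

```python
def resolve_strategy(request: ModelRequest) -> None:
    # Type-based dispatch instead of comparing sentinel strings.
    if isinstance(request.response_format, AutoStrategy):
        ...  # let the model choose between plain text and structured output
    else:
        ...  # e.g. an explicit tool strategy: bind structured-output tools
```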
Remaining issue to address:
* Sort out why we're including tools in the ToolNode
Change response format strategy dynamically based on model.
After this PR there are two remaining issues:
- [ ] Review binding of tools used for output to ToolNode (shouldn't be
required)
- [ ] Update ModelRequest to also support the original schema provided
by the user (to correctly support auto mode)
Adding a `dynamic_prompt` decorator to support smoother devx for dynamic
system prompts
```py
from dataclasses import dataclass

from langchain.agents.middleware.types import dynamic_prompt, ModelRequest, AgentState
from langchain.agents.middleware_agent import create_agent
from langchain_core.messages import HumanMessage
from langgraph.runtime import Runtime


@dataclass
class Context:
    user_name: str


@dynamic_prompt
def my_prompt(request: ModelRequest, state: AgentState, runtime: Runtime[Context]) -> str:
    user_name = runtime.context.user_name
    return (
        f"You are a helpful assistant helping {user_name}. Please refer to the user as {user_name}."
    )


agent = create_agent(model="openai:gpt-4o", middleware=[my_prompt]).compile()

result = agent.invoke({"messages": [HumanMessage("Hello")]}, context=Context(user_name="Sydney"))
for msg in result["messages"]:
    msg.pretty_print()

"""
================================ Human Message =================================

Hello

================================== Ai Message ==================================

Hello Sydney! How can I assist you today?
"""
```
Need to decide: what information should we feed to this description
factory? Right now, feeding:
* state
* runtime
* tool call (so the developer doesn't have to search through the state's
messages for the corresponding tool call)

I can see a case for just passing the tool call. But again, this
abstraction is semi-bound to interrupts for tools... though we pretend
it's more abstract than that.
Right now:
```py
def custom_description(state: AgentState, runtime: Runtime, tool_call: ToolCall) -> str:
    """Generate a custom description."""
    return f"Custom: {tool_call['name']} with args {tool_call['args']}"


middleware = HumanInTheLoopMiddleware(
    interrupt_on={
        "tool_with_callable": {"allow_accept": True, "description": custom_description},
        "tool_with_string": {"allow_accept": True, "description": "Static description"},
    }
)
```
This PR adds a model call limit middleware that helps to manage:
* the number of model calls during a run (helps w/ avoiding
tool-calling loops) - implemented w/ `UntrackedValue`
* the number of model calls on a thread (helps w/ avoiding lengthy
convos) - standard state

The concern here is w/ other middlewares overwriting the model call
count... we could use a `_`-prefixed field?
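A minimal sketch under assumed parameter names (`thread_limit` and `run_limit` are illustrative, not confirmed by this changelog):

```python
from langchain.agents import create_agent
from langchain.agents.middleware import ModelCallLimitMiddleware

limiter = ModelCallLimitMiddleware(
    thread_limit=20,  # cap model calls across the whole thread (long conversations)
    run_limit=5,  # cap model calls within one run (guards against tool-calling loops)
)
agent = create_agent(model="openai:gpt-4o", middleware=[limiter])
```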
Removed:
- `libs/core/langchain_core/chat_history.py`: `add_user_message` and
`add_ai_message` in favor of `add_messages` and `aadd_messages`
- `libs/core/langchain_core/language_models/base.py`: `predict`,
`predict_messages`, and async versions in favor of `invoke`. removed
`_all_required_field_names` since it was a wrapper on
`get_pydantic_field_names`
- `libs/core/langchain_core/language_models/chat_models.py`:
`callback_manager` param in favor of `callbacks`. `__call__` and
`call_as_llm` methods in favor of `invoke`
- `libs/core/langchain_core/language_models/llms.py`: `callback_manager`
param in favor of `callbacks`. `__call__`, `predict`, `apredict`, and
`apredict_messages` methods in favor of `invoke`
- `libs/core/langchain_core/prompts/chat.py`: `from_role_strings` and
`from_strings` in favor of `from_messages`
- `libs/core/langchain_core/prompts/pipeline.py`: removed
`PipelinePromptTemplate`
- `libs/core/langchain_core/prompts/prompt.py`: `input_variables` param
on `from_file` as it wasn't used
- `libs/core/langchain_core/tools/base.py`: `callback_manager` param in
favor of `callbacks`
- `libs/core/langchain_core/tracers/context.py`: `tracing_enabled` in
favor of `tracing_enabled_v2`
- `libs/core/langchain_core/tracers/langchain_v1.py`: entire module
- `libs/core/langchain_core/utils/loading.py`: entire module,
`try_load_from_hub`
- `libs/core/langchain_core/vectorstores/in_memory.py`: `upsert` in
favor of `add_documents`
- `libs/standard-tests/langchain_tests/integration_tests/chat_models.py`
and `libs/standard-tests/langchain_tests/unit_tests/chat_models.py`:
`tool_choice_value` as models should accept `tool_choice="any"`
- `langchain` will consequently no longer expose these items where it
previously did
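Two representative migrations for the removals above (a sketch; `chat_model` stands in for any chat model instance):

```python
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.messages import AIMessage, HumanMessage

history = InMemoryChatMessageHistory()
# Before (removed): history.add_user_message("hi"); history.add_ai_message("hello")
history.add_messages([HumanMessage("hi"), AIMessage("hello")])

# Before (removed): chat_model.predict("What is 2 + 2?") or chat_model("What is 2 + 2?")
# After: chat_model.invoke("What is 2 + 2?")
```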
---------
Co-authored-by: Mohammad Mohtashim <45242107+keenborder786@users.noreply.github.com>
Co-authored-by: Caspar Broekhuizen <caspar@langchain.dev>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Sadra Barikbin <sadraqazvin1@yahoo.com>
Co-authored-by: Vadym Barda <vadim.barda@gmail.com>