This adds tool emulation middleware. It helps when testing an agent whose tools either take a long time to run or are expensive to set up: instead of executing the real tool, the middleware can simulate the tool's behavior and return a stand-in response.
**Description:**
Currently, `mustache_schema("{{x.y}} {{x}}")` raises an error when a template references both a variable and one of its dotted sub-paths. This PR fixes that.
**Issue:** n/a
**Dependencies:** n/a
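A quick repro of the fixed case; the import path is an assumption:
```py
from langchain_core.prompts.string import mustache_schema

# Previously this raised because `x` appears both as a bare variable and as
# a dotted path; after the fix both references resolve into a single schema.
schema = mustache_schema("{{x.y}} {{x}}")
print(schema)
```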
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Easiest to review side by side (not inline).
* Adding `dict`-typed requests and responses so that we can ship config with
interrupts; also more extensible.
* Keeping things generic in terms of `interrupt_on` rather than
`tool_config`.
* Renaming the allowed decisions to approve, edit, reject.
* Drawing a distinction between actions (requested and performed by the
agent; in this case tool calls, though we generalize beyond that) and
decisions (human feedback on those actions).
New request structure
```py
from typing import Any, Literal
from typing_extensions import NotRequired, TypedDict

class Action(TypedDict):
    """Represents an action with a name and arguments."""
    name: str
    """The type or name of action being requested (e.g., "add_numbers")."""
    arguments: dict[str, Any]
    """Key-value pairs of arguments needed for the action (e.g., {"a": 1, "b": 2})."""

DecisionType = Literal["approve", "edit", "reject"]

class ReviewConfig(TypedDict):
    """Policy for reviewing a HITL request."""
    action_name: str
    """Name of the action associated with this review configuration."""
    allowed_decisions: list[DecisionType]
    """The decisions that are allowed for this request."""
    description: NotRequired[str]
    """The description of the action to be reviewed."""
    arguments_schema: NotRequired[dict[str, Any]]
    """JSON schema for the arguments associated with the action, if edits are allowed."""

class HITLRequest(TypedDict):
    """Request for human feedback on a sequence of actions requested by a model."""
    action_requests: list[Action]
    """A list of agent actions for human review."""
    review_configs: list[ReviewConfig]
    """Review configuration for all possible actions."""
```
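For illustration, a request covering a single `send_email` tool call might look like this (values are invented):
```py
request = HITLRequest(
    action_requests=[
        Action(name="send_email", arguments={"to": "a@example.com", "body": "hi"}),
    ],
    review_configs=[
        ReviewConfig(action_name="send_email", allowed_decisions=["approve", "reject"]),
    ],
)
```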
New response structure
```py
class ApproveDecision(TypedDict):
    """Response when a human approves the action."""
    type: Literal["approve"]
    """The type of response when a human approves the action."""

class EditDecision(TypedDict):
    """Response when a human edits the action."""
    type: Literal["edit"]
    """The type of response when a human edits the action."""
    edited_action: Action
    """Edited action for the agent to perform.

    Ex: for a tool call, a human reviewer can edit the tool name and args.
    """

class RejectDecision(TypedDict):
    """Response when a human rejects the action."""
    type: Literal["reject"]
    """The type of response when a human rejects the action."""
    message: NotRequired[str]
    """The message sent to the model explaining why the action was rejected."""

Decision = ApproveDecision | EditDecision | RejectDecision

class HITLResponse(TypedDict):
    """Response payload for a HITLRequest."""
    decisions: list[Decision]
    """The decisions made by the human."""
```
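And a matching response a human might send back, again with invented values:
```py
response = HITLResponse(
    decisions=[
        RejectDecision(type="reject", message="wrong recipient"),
    ],
)
```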
User-facing API:
NEW
```py
HumanInTheLoopMiddleware(interrupt_on={
    'send_email': True,
    # can also use a callable for description that takes tool call, state, and runtime
    'execute_sql': {
        'allowed_decisions': ['approve', 'edit', 'reject'],
        'description': 'please review sensitive tool execution',
    },
})
Command(resume={"decisions": [{"type": "approve"}, {"type": "reject", "message": "db down"}]})
```
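For the callable `description` mentioned in the comment above, a hedged sketch; the (tool call, state, runtime) parameters come from that comment, but their exact shapes are assumptions:
```py
def describe_sql(tool_call, state, runtime) -> str:
    # Assumed tool-call shape; adjust to the real structure.
    return f"Please review this SQL before it runs: {tool_call['args'].get('query', '')}"

HumanInTheLoopMiddleware(interrupt_on={
    'execute_sql': {
        'allowed_decisions': ['approve', 'reject'],
        'description': describe_sql,
    },
})
```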
OLD
```py
HumanInTheLoopMiddleware(interrupt_on={
    'send_email': True,
    'execute_sql': {
        'allow_accept': True,
        'allow_edit': True,
        'allow_respond': True,
        'description': 'please review sensitive tool execution',
    },
})
Command(resume=[{"type": "approve"}, {"type": "reject", "message": "db down"}])
```
Refactor tool call middleware from generator-based to handler-based
pattern
This simplifies `on_tool_call` middleware by replacing the complex generator
protocol with a straightforward handler pattern. Instead of yielding
requests and receiving results via `.send()`,
handlers now receive an execute callable that can be invoked multiple
times for retry logic.
Before vs. After
Before (Generator):
```python
class RetryMiddleware(AgentMiddleware):
    def on_tool_call(self, request, state, runtime):
        for attempt in range(3):
            response = yield request  # Yield request, receive result via .send()
            if is_valid(response) or attempt == 2:
                return  # Final result is last value sent to generator
```
After (Handler):
```python
class RetryMiddleware(AgentMiddleware):
    def on_tool_call(self, request, handler):
        for attempt in range(3):
            result = handler(request)  # Direct function call
            if is_valid(result):
                return result
        return result
```
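Because the handler is a plain callable, ordinary control flow like `try`/`except` also works; a hedged sketch (the `ToolMessage` fallback and the request's tool-call shape are assumptions):
```python
from langchain_core.messages import ToolMessage

class FallbackMiddleware(AgentMiddleware):
    def on_tool_call(self, request, handler):
        try:
            return handler(request)  # attempt real execution
        except Exception as exc:
            # Surface the failure to the model instead of crashing the run.
            return ToolMessage(
                content=f"Tool failed: {exc}",
                tool_call_id=request.tool_call["id"],  # assumed request shape
            )
```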
Follow-ups after this PR:
* Rename the interceptor to wrap_tool_call
* Fix the async path for the ToolNode
1. Main fix: when we don't have a response format or middleware, don't
draw a conditional edge back to the loop entrypoint (a self-loop on the model)
2. Supplementary fix: when we jump to `end` and there is an
`after_agent` hook, jump there instead of `__end__`
Other improvements -- I can remove these if they're more harmful than
helpful:
1. Use keyword-only arguments for edge generator functions for clarity
(see the sketch below)
2. Rename the args to `model_destination` and `end_destination` for clarity
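A hypothetical sketch of what the keyword-only edge-generator signature could look like; the function name and routing logic here are illustrative assumptions, not the actual implementation:
```python
def make_model_edge(
    *,  # keyword-only: callers must name each destination explicitly
    model_destination: str,
    end_destination: str,
):
    def route(state: dict) -> str:
        # Illustrative: loop back to the model while tool calls are pending,
        # otherwise route to the end destination (which may be `after_agent`).
        return model_destination if state.get("pending_tool_calls") else end_destination
    return route
```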