ToolCall from the LLM response. Before
the handler ever runs, the runner evaluates a policy gate that can allow, deny, or defer
the call for human approval. The outcome of that gate determines whether the tool
executes at all, and the events emitted along the way tell you exactly what happened and
why.
Event timeline
Event reference table
Every event is an AgentRunEvent published on the run handle. The type field
identifies the event, and the data dict carries event-specific fields.
| Event type | When it fires | Key data fields | Description |
|---|---|---|---|
| tool_batch_started | After LLM response contains tool calls | tool_call_count | Signals the start of a batch of one or more tool calls from a single LLM turn. |
| policy_decision | After policy evaluation for each tool | tool_name, action, reason | Records the policy engine’s decision for one tool call. Emitted for allow, deny, defer, and request_user_input actions. |
| run_paused | When a tool call requires human approval | tool_name, reason, payload | The run is suspended waiting for external input. Only emitted when policy returns defer or request_approval. |
| run_resumed | After human approval or denial is received | approved | The run resumes after the approval decision. |
| tool_completed | After each tool call resolves (success or failure) | tool_name, success, output, error, tool_call_id | The terminal event for one tool call. Always emitted regardless of whether the tool ran, was denied, or timed out. |
| warning | When non-fatal issues occur | message | Advisory events such as sandbox violations handled under a continue policy. |
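
The event types and data fields in the table can be consumed with a simple dispatch on the type field. The following is a minimal sketch, not the library's API: the AgentRunEvent dataclass here is a hypothetical stand-in with only the type and data attributes described above, and summarize_tool_events is an illustrative helper name.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class AgentRunEvent:
    """Hypothetical stand-in for the runner's event object (type + data only)."""
    type: str
    data: dict[str, Any] = field(default_factory=dict)


def summarize_tool_events(events: list[AgentRunEvent]) -> dict[str, str]:
    """Map each tool name to its latest known outcome."""
    outcomes: dict[str, str] = {}
    for event in events:
        if event.type == "policy_decision":
            # Provisional entry until the terminal tool_completed event arrives.
            outcomes[event.data["tool_name"]] = f"policy:{event.data['action']}"
        elif event.type == "tool_completed":
            name = event.data["tool_name"]
            if event.data["success"]:
                outcomes[name] = "succeeded"
            else:
                outcomes[name] = f"failed: {event.data.get('error')}"
    return outcomes


# Sample events built by hand for illustration; field names follow the table.
events = [
    AgentRunEvent("tool_batch_started", {"tool_call_count": 2}),
    AgentRunEvent("policy_decision", {"tool_name": "search", "action": "allow", "reason": "no rule matched"}),
    AgentRunEvent("tool_completed", {"tool_name": "search", "success": True, "output": "ok", "error": None, "tool_call_id": "c1"}),
    AgentRunEvent("policy_decision", {"tool_name": "delete_file", "action": "deny", "reason": "blocked"}),
    AgentRunEvent("tool_completed", {"tool_name": "delete_file", "success": False, "output": None, "error": "Denied by policy", "tool_call_id": "c2"}),
]
summary = summarize_tool_events(events)
```

Because tool_completed is always emitted, a consumer like this can treat it as the single source of truth for the final outcome and use policy_decision only for context.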
Policy decisions: allow, deny, defer, and request_user_input
The PolicyEngine evaluates a PolicyEvent with event_type="tool_before_execute" for
every tool call. The decision drives the rest of the lifecycle:
- allow: The tool executes immediately. No human interaction is required. This is the
  default when no policy rules match.
- deny: The tool is blocked. A ToolExecutionRecord with success=False and
  error="Denied by policy" is recorded. The denied result is fed back to the model as a
  tool message so it can adjust its behavior. After a denial, the runner consults
  fail_safe.approval_denial_policy to decide the run-level outcome:
  - continue (default): The run proceeds. The model sees the denial and may try a different approach.
  - degrade: The run transitions to a degraded state and terminates with a degradation message.
  - fail: The run raises an AgentExecutionError and transitions to failed.
- defer: The runner pauses the run, emits a run_paused event,
  and waits for the InteractionProvider to deliver an approval decision. If the provider
  is HeadlessInteractionProvider, the configured approval_fallback (default: deny)
  is used. The approval has a configurable timeout (approval_timeout_s, default 300s).
- request_user_input: Similar to defer, but instead of a yes/no approval, the runner
  asks the user for a value (such as a confirmation code or corrected parameter). The
  policy’s request_payload can specify a target_arg field, and the user’s input is
  injected into the tool arguments before execution.
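
The four decisions can be pictured as a single gate in front of the handler. The sketch below is illustrative only and does not reproduce the runner's implementation: PolicyDecision, gate_tool_call, approve, and ask_user are hypothetical names, the "Approval denied" error text is an assumption, and the default approve callback returns False to mirror a headless fallback of deny.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable

KNOWN_ACTIONS = {"allow", "deny", "defer", "request_user_input"}


@dataclass
class PolicyDecision:
    """Hypothetical result of the policy gate for one tool call."""
    action: str = "allow"  # allow is the default when no rule matches
    reason: str = ""
    request_payload: dict[str, Any] = field(default_factory=dict)


def gate_tool_call(
    decision: PolicyDecision,
    args: dict[str, Any],
    handler: Callable[..., Any],
    approve: Callable[[PolicyDecision], bool] = lambda d: False,
    ask_user: Callable[[PolicyDecision], Any] = lambda d: None,
) -> dict[str, Any]:
    """Apply one policy decision and return a record-like dict."""
    if decision.action not in KNOWN_ACTIONS:
        raise ValueError(f"unknown policy action: {decision.action}")
    if decision.action == "deny":
        return {"success": False, "output": None, "error": "Denied by policy"}
    if decision.action == "defer" and not approve(decision):
        # Assumed error text; the source documents only the policy-denial message.
        return {"success": False, "output": None, "error": "Approval denied"}
    if decision.action == "request_user_input":
        target_arg = decision.request_payload.get("target_arg")
        value = ask_user(decision)
        if target_arg is not None:
            # Inject the user's value into the arguments before execution.
            args = {**args, target_arg: value}
    return {"success": True, "output": handler(**args), "error": None}


def transfer(amount: int, code: str = "") -> str:
    """Toy tool handler used only for this example."""
    return f"sent {amount} with code {code!r}"


denied = gate_tool_call(PolicyDecision("deny", "blocked"), {"amount": 5}, transfer)
prompted = gate_tool_call(
    PolicyDecision("request_user_input", "need code", {"target_arg": "code"}),
    {"amount": 5},
    transfer,
    ask_user=lambda d: "1234",
)
```

In this sketch the handler is reached only on the final line, so every blocked path returns before any tool code runs, which matches the guarantee that the gate is evaluated before the handler.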
Timing and latency tracking
Every tool execution is timed from the moment registry.call() is invoked to the moment
it returns. The latency is captured in milliseconds and recorded in two places:
- ToolExecutionRecord.latency_ms: Available on the AgentResult.tool_executions list after the run completes.
- Telemetry histogram: The agent.tool_call.latency_ms metric is emitted with attributes tool_name, result (success/error), and source (execute/replay).

In addition, the agent.tool_batch.latency_ms histogram captures the
wall-clock duration of the entire batch (all concurrent tool calls), with attributes for
call_count and failure_count.
Denied tool calls have no execution latency since the handler never runs.
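
As a rough illustration of how a per-call latency in milliseconds can be measured around a single invocation, here is a minimal sketch using the standard library's monotonic clock. The timed_call helper and the returned dict shape are assumptions made for this example; they are not the runner's actual ToolExecutionRecord.

```python
from __future__ import annotations

import time
from typing import Any, Callable


def timed_call(call: Callable[..., Any], *args: Any, **kwargs: Any) -> dict[str, Any]:
    """Invoke call and record success, output, error, and latency in ms."""
    start = time.perf_counter()  # monotonic, suitable for measuring intervals
    try:
        output, success, error = call(*args, **kwargs), True, None
    except Exception as exc:
        # Failures are timed too, so the latency covers the whole attempt.
        output, success, error = None, False, str(exc)
    latency_ms = (time.perf_counter() - start) * 1000.0
    return {"success": success, "output": output, "error": error, "latency_ms": latency_ms}


def boom() -> None:
    """Toy handler that always fails."""
    raise RuntimeError("boom")


ok = timed_call(lambda x: x * 2, 21)
bad = timed_call(boom)
```

The clock starts immediately before the call and stops immediately after it returns or raises, so policy evaluation and approval waits would fall outside the measured interval in a design like this.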
How denied tools affect the agent run
When a tool is denied, the model still receives feedback. The runner appends a tool message to the conversation with the denial reason. What happens to the run after that depends on fail_safe.approval_denial_policy:
| Policy | Behavior after denial |
|---|---|
| continue | Run proceeds normally. Model sees the denial and may self-correct. |
| degrade | Run transitions to degraded state and exits the step loop with a degradation message. |
| fail | Run raises AgentExecutionError and transitions to failed state. |
Under a fail or degrade policy, the remaining tool calls in the batch are skipped.
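
The effect of each denial policy on the rest of a batch can be sketched as follows. This is a simplified model under stated assumptions: run_batch is a hypothetical helper, the AgentExecutionError class is a local stand-in for the real exception, and calls are processed sequentially here for clarity even though the real runner may execute a batch concurrently.

```python
from __future__ import annotations


class AgentExecutionError(RuntimeError):
    """Local stand-in for the runner's exception of the same name."""


def run_batch(
    tool_names: list[str],
    denied: set[str],
    denial_policy: str = "continue",
) -> tuple[str, list[str], list[str]]:
    """Return (run_state, executed, skipped) for one batch of tool calls."""
    executed: list[str] = []
    for index, name in enumerate(tool_names):
        if name not in denied:
            executed.append(name)
            continue
        if denial_policy == "continue":
            # The model sees the denial; the rest of the batch still runs.
            continue
        remaining = tool_names[index + 1:]
        if denial_policy == "degrade":
            return "degraded", executed, remaining
        if denial_policy == "fail":
            raise AgentExecutionError(f"{name} denied; skipped {remaining}")
        raise ValueError(f"unknown denial policy: {denial_policy}")
    return "running", executed, []


continued = run_batch(["read", "delete", "write"], {"delete"})
degraded = run_batch(["read", "delete", "write"], {"delete"}, "degrade")
```

With the default policy the denied call is simply omitted from the executed list, while degrade stops at the first denial and reports the calls that never ran.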