ToolRegistry,
has its JSON Schema exposed to the LLM, and then participates in a tightly controlled
loop: the model proposes a call, AFK validates the arguments, evaluates policy gates,
executes the handler, sanitizes the output, and feeds the result back into the
conversation for the model’s next turn. This page walks through every stage of that
pipeline end-to-end.
## Registration-to-execution map
### Step-by-step breakdown
1. **Define `@tool`** — You write a Python function and decorate it with `@tool`. The decorator extracts the function name, docstring, and a Pydantic model for its arguments to produce a `ToolSpec` (name, description, parameters_schema).
2. **Tool registry** — When the agent starts, all tools (declared on the agent plus runtime-injected extras such as MCP tools and skill tools) are collected into a `ToolRegistry`. The registry is the single source of truth for tool lookup, schema export, and call dispatch.
3. **Expose tool schema to model** — The registry converts every registered tool into the OpenAI function-calling format via `registry.to_openai_function_tools()`. This list is attached to the `LLMRequest.tools` field so the model knows what tools are available.
4. **Model emits tool call** — The LLM response (`LLMResponse`) may contain one or more `ToolCall` objects. Each carries an `id`, a `tool_name`, and a JSON `arguments` dict. AFK processes these as a batch.
5. **Schema validation** — Before execution, each tool call's raw arguments are validated against the tool's Pydantic `args_model` via `BaseTool.validate()`. If validation fails, a `ToolResult(success=False)` is returned immediately and the tool handler is never invoked.
6. **Policy gate** — The runner evaluates the `PolicyEngine` for each tool call via a `tool_before_execute` policy event. The policy can `allow`, `deny`, `defer` (require human approval), or `request_user_input`. See the Tool Call Lifecycle page for the full decision matrix.
7. **Execute handler** — If the policy allows execution, AFK runs the tool through the full hook/middleware chain: PreHooks (argument transforms) -> middleware chain -> core handler -> PostHooks (output transforms). Timeout enforcement applies at every layer.
8. **Sanitize output** — The raw `ToolResult.output` is passed through `apply_tool_output_limits()`, which enforces `max_output_chars` and sandbox output policies. The runner then wraps the result in an untrusted-data content envelope via `render_untrusted_tool_message()` to prevent prompt injection from tool output.
9. **Emit events** — A `tool_completed` event is emitted on the run handle, carrying the tool name, success flag, output, and any error. Telemetry counters and histograms are recorded for latency and success/failure counts.
10. **Tool result fed back to model** — The sanitized output is appended to the conversation as a `Message(role="tool", name=tool_name, content=...)`. The loop returns to step 4 for the next LLM turn.
## Full example: tool definition through result inspection
### Inspecting tool executions
Every completed run exposes a `tool_executions` list on `AgentResult`. Each entry is a
`ToolExecutionRecord`:
| Field | Type | Description |
|---|---|---|
| `tool_name` | `str` | Name of the executed tool. |
| `tool_call_id` | `str \| None` | Provider/LLM tool-call identifier. |
| `success` | `bool` | Whether execution succeeded. |
| `output` | `JSONValue \| None` | JSON-safe tool output payload. |
| `error` | `str \| None` | Error message when execution failed. |
| `latency_ms` | `float \| None` | Execution latency in milliseconds. |
## Hook and middleware pipeline
Tools support three extension points that wrap the core handler:

| Extension | When it runs | Signature | Purpose |
|---|---|---|---|
| PreHook | Before core handler | `(args)` or `(args, ctx)` | Transform or validate arguments. Must return a dict compatible with the tool's args model. |
| Middleware | Wraps core handler | `(call_next, args, ctx)` | Cross-cutting concerns (logging, timing, caching). Calls `call_next(args, ctx)` to proceed. |
| PostHook | After core handler | `({"output": ..., "tool_name": ...})` | Transform or audit the output before it reaches the model. |
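The composition of these three layers can be sketched as follows. This is an illustrative chain using the signatures from the table, not AFK's internal wiring; the tool name `double` and the context keys are made up.

```python
import time
from typing import Any, Callable

Args = dict[str, Any]
Handler = Callable[[Args, dict], Any]


def core_handler(args: Args, ctx: dict) -> Any:
    return args["x"] * 2


def pre_hook(args: Args, ctx: dict) -> Args:
    # PreHook: transform arguments; must return a dict compatible with the args model.
    return {**args, "x": int(args["x"])}


def timing_middleware(call_next: Handler, args: Args, ctx: dict) -> Any:
    # Middleware: wraps the handler and must call call_next(args, ctx) to proceed.
    start = time.perf_counter()
    result = call_next(args, ctx)
    ctx["elapsed_ms"] = (time.perf_counter() - start) * 1000.0
    return result


def post_hook(payload: dict) -> dict:
    # PostHook: transform the output before it reaches the model.
    payload["output"] = {"value": payload["output"]}
    return payload


def run_tool(args: Args) -> dict:
    ctx: dict = {}
    args = pre_hook(args, ctx)
    output = timing_middleware(core_handler, args, ctx)
    return post_hook({"output": output, "tool_name": "double"})


result = run_tool({"x": "21"})
```

The key design point is that middleware owns the decision to proceed: because it receives `call_next` rather than the result, it can short-circuit, retry, or cache around the core handler.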
## Error handling
The tool system is designed to be non-throwing by default. `BaseTool.call()` catches all
exceptions and wraps them in a `ToolResult(success=False, error_message=...)` unless
`raise_on_error=True` is set on the tool.
**Validation errors** — When the model passes arguments that do not match the Pydantic
schema, a `ToolValidationError` is caught and the error message is returned to the model
so it can self-correct on the next turn.
**Timeout errors** — Each tool can set a `default_timeout` in seconds. If execution
exceeds this limit, the `asyncio.TimeoutError` is caught and surfaced as a
`ToolTimeoutError` in the result.
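The underlying mechanism can be sketched with `asyncio.wait_for`; this is a simplified stand-in for AFK's timeout layer, returning a plain dict instead of a `ToolResult`, with made-up tool and helper names.

```python
import asyncio


async def slow_tool() -> str:
    # Simulates a handler that would exceed its timeout budget.
    await asyncio.sleep(10)
    return "done"


async def call_with_timeout(coro, timeout_s: float) -> dict:
    """Run a tool coroutine, converting a timeout into a non-throwing failure result."""
    try:
        output = await asyncio.wait_for(coro, timeout=timeout_s)
        return {"success": True, "output": output}
    except asyncio.TimeoutError:
        return {"success": False, "error": f"tool timed out after {timeout_s}s"}


result = asyncio.run(call_with_timeout(slow_tool(), timeout_s=0.05))
```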
**Execution errors** — Any unhandled exception from the handler, a PreHook, or a PostHook
is caught and wrapped in a `ToolExecutionError` result.
**Hook/middleware failures** — If a PreHook returns a non-dict value or produces arguments
that fail re-validation against the tool's args model, execution stops and a failure
result is returned. PostHook failures similarly short-circuit the chain.
**Policy-driven behavior** — After a tool failure, the runner consults the agent's
`fail_safe.tool_failure_policy` to decide whether to continue (the default), degrade the
run, or fail the entire run. This is configurable per agent.