How to Use AFK

Use AFK incrementally. Start with one narrow agent, add tools only when the task needs action, and add production controls before real users depend on the system.

Phase 1: narrow agent

Build one agent that solves one task without tools.

from afk.agents import Agent
from afk.core import Runner

agent = Agent(
    name="ticket-classifier",
    model="gpt-5.5",
    instructions="""
    Classify the support ticket as exactly one of:
    billing, technical, account, or other.
    Return only the category.
    """,
)

result = Runner().run_sync(
    agent,
    user_message="I cannot log into my account.",
)
print(result.final_text)

Move on when the agent is reliable on real examples for the narrow task.
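
One way to make "reliable on real examples" concrete is a small accuracy check over labelled tickets. The sketch below is not part of AFK: `accuracy`, `LABELS`, and the stand-in `fake_classify` are hypothetical names. In practice the classifier callable would presumably wrap the `Runner().run_sync(...)` call shown above and return `result.final_text`.

```python
from typing import Callable, Iterable

# Categories taken from the agent's instructions above.
LABELS = {"billing", "technical", "account", "other"}


def accuracy(
    classify: Callable[[str], str],
    examples: Iterable[tuple[str, str]],
) -> tuple[float, list[tuple[str, str, str]]]:
    """Return (accuracy, failures) for labelled (ticket, expected) pairs."""
    total = 0
    failures = []
    for ticket, expected in examples:
        total += 1
        # Normalize whitespace and case; anything outside LABELS counts as wrong.
        predicted = classify(ticket).strip().lower()
        if predicted not in LABELS or predicted != expected:
            failures.append((ticket, expected, predicted))
    if total == 0:
        raise ValueError("no examples provided")
    return (total - len(failures)) / total, failures


def fake_classify(ticket: str) -> str:
    """Hypothetical stand-in for the agent call so the sketch runs offline."""
    return "account" if "log into" in ticket else "other"


examples = [
    ("I cannot log into my account.", "account"),
    ("I was charged twice this month.", "billing"),
]
score, failures = accuracy(fake_classify, examples)
print(f"accuracy={score:.2f}, failures={len(failures)}")
```

Reviewing the `failures` list by hand is usually more informative than the score alone, since it shows which categories the instructions fail to separate.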

Phase 2: tools and safety

Add typed tools and hard limits.

from pydantic import BaseModel

from afk.agents import Agent, FailSafeConfig
from afk.core import Runner, RunnerConfig
from afk.tools import tool


class TicketArgs(BaseModel):
    ticket_id: str


@tool(args_model=TicketArgs, name="lookup_ticket", description="Fetch ticket details.")
def lookup_ticket(args: TicketArgs) -> dict:
    # Stub that returns fixed data; replace with a real, read-only ticket lookup.
    return {"ticket_id": args.ticket_id, "status": "open", "priority": "high"}


agent = Agent(
    name="ticket-agent",
    model="gpt-5.5",
    instructions="Look up tickets before answering. Never modify data.",
    tools=[lookup_ticket],
    fail_safe=FailSafeConfig(
        max_steps=8,
        max_tool_calls=4,
        max_total_cost_usd=0.10,
    ),
)

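runner = Runner(
    config=RunnerConfig(
        sanitize_tool_output=True,
        tool_output_max_chars=8_000,
    )
)

# Run the agent through the configured runner, as in Phase 1.
result = runner.run_sync(
    agent,
    user_message="What is the status of ticket 1234?",
)
print(result.final_text)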

Move on when all tools have Pydantic argument models, cost limits are set, and mutating operations are gated or absent.
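
The "gated" part of that checklist can be sketched in plain Python. This is a hypothetical illustration of a default-deny gate, not AFK's policy-gate API: `require_approval`, `ApprovalDenied`, `deny_all`, and `close_ticket` are invented names, and the approver is just a callable.

```python
import functools
from typing import Any, Callable


class ApprovalDenied(Exception):
    """Raised when a mutating operation is blocked by the gate."""


def require_approval(approver: Callable[[str, dict], bool]):
    """Wrap a mutating function so it only runs when the approver returns True."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(**kwargs: Any) -> Any:
            # Ask the approver before the side effect happens, never after.
            if not approver(func.__name__, kwargs):
                raise ApprovalDenied(f"{func.__name__} was not approved")
            return func(**kwargs)

        return wrapper

    return decorator


def deny_all(operation: str, arguments: dict) -> bool:
    """Default-deny approver: nothing mutating runs until someone opts in."""
    return False


@require_approval(deny_all)
def close_ticket(ticket_id: str) -> dict:
    # Hypothetical mutating operation used only for illustration.
    return {"ticket_id": ticket_id, "status": "closed"}


try:
    close_ticket(ticket_id="T-1")
except ApprovalDenied as exc:
    print(f"blocked: {exc}")
```

Starting from a default-deny approver keeps read-only tools such as `lookup_ticket` unaffected while making every mutating path an explicit decision.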

Phase 3: production controls

Before shipping, add the controls that make failures diagnosable:

evals for expected behavior;
telemetry for latency, usage, errors, and tool calls;
persistent memory or queues if runs must survive process restarts;
security controls for sandboxing, secret scope, and tool output limits;
troubleshooting docs for on-call and operators.
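
As a rough illustration of the telemetry item, the sketch below wraps any run in a timer and stores a small record per run. It uses only the standard library; `RunRecord`, `record_run`, and `failure_rate` are hypothetical names, and a real deployment would presumably use the exporters described on the Observability page instead of an in-memory list.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RunRecord:
    name: str
    latency_s: float
    ok: bool
    error: Optional[str] = None


def record_run(name: str, run: Callable[[], object], sink: list) -> object:
    """Execute `run`, append a RunRecord to `sink`, and re-raise any error."""
    start = time.perf_counter()
    try:
        result = run()
    except Exception as exc:
        sink.append(RunRecord(name, time.perf_counter() - start, False, repr(exc)))
        raise
    sink.append(RunRecord(name, time.perf_counter() - start, True))
    return result


def failure_rate(records: list) -> float:
    """Fraction of recorded runs that raised; 0.0 when there are no records."""
    return sum(not r.ok for r in records) / len(records) if records else 0.0


records: list = []
record_run("classify", lambda: "account", records)
try:
    # Simulated failing run so the error path is exercised.
    record_run("classify", lambda: 1 / 0, records)
except ZeroDivisionError:
    pass
print(f"runs={len(records)}, failure_rate={failure_rate(records):.2f}")
```

Recording the failure before re-raising means the caller still sees the original exception while the run record survives for later diagnosis.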

Useful pages:

Evals: test agent behavior and enforce budgets.
Observability: export metrics, traces, and run records.
Security Model: understand policy gates, sandboxing, and secret isolation.
Task Queues: run agents through distributed workers.

Phase 4: release discipline

Once the agent is in production:

run evals in CI before prompt, tool, or model changes;
compare behavior across releases with golden traces where appropriate;
monitor cost per run and failure rate;
version system prompts in files;
document operator actions for approval, resume, and rollback flows.
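
Versioning system prompts in files could look like the sketch below: the prompt is read from a file under version control, and a short content hash is computed so each run record can say which prompt produced it. The file name, `load_prompt`, and the 12-character hash length are assumptions for illustration; the temporary directory only exists so the example runs on its own.

```python
import hashlib
import tempfile
from pathlib import Path


def load_prompt(path: Path) -> tuple[str, str]:
    """Return (prompt_text, short_sha256) for a prompt file."""
    text = path.read_text(encoding="utf-8")
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
    return text, digest


with tempfile.TemporaryDirectory() as tmp:
    # In a real project this file would live in the repository, not a temp dir.
    prompt_file = Path(tmp) / "ticket_classifier.txt"
    prompt_file.write_text("Classify the support ticket.\n", encoding="utf-8")
    instructions, version = load_prompt(prompt_file)
    print(f"prompt version: {version}")
```

The loaded text would then be passed as `instructions=` when constructing the agent, and the hash logged next to eval results so behavior changes can be traced to a specific prompt revision.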

Common mistakes

Mistake	Better approach
Starting with a multi-agent system	Start with one narrow agent and split only when roles are genuinely different
Writing untyped tools	Use Pydantic argument models for every tool
Treating prompts as the only safety layer	Add `FailSafeConfig`, policy gates, sandboxing, and evals
Hiding internals in public docs	Keep builder docs behavior-first and maintainer docs internals-first
Shipping without run records	Export telemetry and inspect `AgentResult` fields

Next steps

Read Building with AI for production design patterns, then Troubleshooting for common operational failures.
