This page explains how agents connect to the LLM layer — from model resolution to request construction, streaming, and error handling.

Model resolution

Agents can specify their model in two ways. The simplest is to pass a model name string, which AFK resolves to an LLM client using the default provider:

```python
agent = Agent(name="demo", model="gpt-5.2-mini", ...)
```
The resolution order:
  1. Check agent.model_resolver (custom function)
  2. Check registered adapters for matching provider prefix
  3. Default to OpenAI adapter
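
The resolution order above can be sketched as a small function. This is illustrative only: `resolve_model`, `ADAPTERS`, and the tuple return values are assumptions, not AFK's actual API.

```python
from typing import Callable, Optional

# Hypothetical registry of adapters keyed by provider prefix.
ADAPTERS = {
    "claude": lambda m: ("anthropic", m),
    "gpt": lambda m: ("openai", m),
}

def resolve_model(model: str, model_resolver: Optional[Callable] = None):
    # 1. A custom resolver on the agent wins outright.
    if model_resolver is not None:
        return model_resolver(model)
    # 2. Otherwise match a registered adapter by provider prefix.
    for prefix, factory in ADAPTERS.items():
        if model.startswith(prefix):
            return factory(model)
    # 3. Fall back to the default OpenAI adapter.
    return ("openai", model)
```

A custom `model_resolver` short-circuits the other two steps, which is why it sits first in the order.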

How the runner uses the LLM

On each step of the agent loop, the runner constructs an LLM request from the agent's configuration and the conversation so far, sends it to the resolved client, and processes the response: text is streamed back to the caller, while tool calls and subagent transfers are executed before the next step.

Request construction

The runner builds an LLMRequest from multiple sources:
| Source | Contributes | Priority |
| --- | --- | --- |
| Agent.instructions | System message | Highest |
| Thread history | Previous messages | |
| user_message | User message | |
| Agent.tools | Tool schemas | |
| Agent.subagents | Transfer tool schemas | |
| RunnerConfig | Temperature, max_tokens | Lowest |
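
A minimal sketch of how those sources could be merged into one request, assuming hypothetical field names and helper (`LLMRequest`'s real fields, `build_request`, and `transfer_schema` are not AFK's actual API):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class LLMRequest:
    messages: list = field(default_factory=list)
    tools: list = field(default_factory=list)
    temperature: float = 1.0
    max_tokens: Optional[int] = None

def transfer_schema(subagent: str) -> dict:
    # Transfer tools let the LLM hand off the conversation to a subagent.
    return {"type": "function", "name": f"transfer_to_{subagent}"}

def build_request(agent: dict, thread: list, user_message: str, config: dict) -> LLMRequest:
    messages = [{"role": "system", "content": agent["instructions"]}]  # highest priority
    messages += thread                                                 # previous messages
    messages.append({"role": "user", "content": user_message})
    tools = list(agent["tools"]) + [transfer_schema(s) for s in agent["subagents"]]
    return LLMRequest(messages=messages, tools=tools,
                      temperature=config.get("temperature", 1.0),
                      max_tokens=config.get("max_tokens"))
```

The priority column maps to message ordering: instructions become the system message at the front, and the user message lands last.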

Streaming integration

When using run_stream(), the runner passes through streaming events from the LLM:
```python
handle = await runner.run_stream(agent, user_message="Explain DNS")

async for event in handle:
    match event.type:
        case "text_delta":
            # ← Comes from LLM streaming
            print(event.text_delta, end="")
        case "tool_started":
            # ← Comes from the runner
            print(f"\n[TOOL] {event.tool_name}")
```
text_delta events come from two paths:
  • Provider streaming when the adapter supports stream deltas.
  • Runner fallback chunking of final text for non-streaming providers.
Tool and step events are generated by the runner.
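
The fallback path can be pictured as slicing the provider's final text into delta-sized pieces, so consumers see the same stream shape either way. The helper name and chunk size here are assumptions, not AFK's actual values:

```python
def chunk_text(text: str, size: int = 16):
    """Emit text_delta events from a complete response, for providers
    that return the whole text at once instead of streaming it."""
    for i in range(0, len(text), size):
        yield {"type": "text_delta", "text_delta": text[i:i + size]}
```

Because both paths emit the same event shape, consumer code like the loop above does not need to know which provider is behind the agent.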

Error handling

LLM errors are classified and handled automatically:
| Error | Classification | Runner behavior |
| --- | --- | --- |
| Rate limit (429) | Retryable | Retry with backoff |
| Server error (500, 502, 503) | Retryable | Retry with backoff |
| Auth error (401, 403) | Terminal | Fail the run |
| Invalid request (400) | Terminal | Fail the run |
| Timeout | Retryable | Retry (if attempts remain) |
| Circuit breaker open | Terminal | Try fallback model or fail |
| All retries exhausted | Terminal | Try fallback model or fail |
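
The classify-then-retry logic can be sketched as follows; `classify` and `call_with_backoff` are illustrative names, not AFK's actual internals:

```python
import time

RETRYABLE_STATUSES = {429, 500, 502, 503}

def classify(status: int) -> str:
    # Rate limits and server errors are retryable; auth (401/403) and
    # invalid-request (400) errors are terminal.
    return "retryable" if status in RETRYABLE_STATUSES else "terminal"

def call_with_backoff(send, max_attempts: int = 3, base_delay: float = 0.0):
    for attempt in range(max_attempts):
        status, body = send()
        if status < 400:
            return body
        if classify(status) == "terminal":
            raise RuntimeError(f"terminal LLM error {status}")
        time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    # Exhausting retries is terminal too; a caller could try a fallback model here.
    raise RuntimeError("all retries exhausted")
```

Terminal errors fail fast rather than burning retry attempts on a request that cannot succeed.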

Model selection guide

| Task | Recommended model | Why |
| --- | --- | --- |
| Simple Q&A, classification | gpt-5.2-nano | Fast, cheap, good enough |
| General purpose with tools | gpt-5.2-mini | Best balance of cost and capability |
| Complex reasoning, coding | gpt-5.2 or claude-opus-4-5 | Better at multi-step reasoning |
| Cost-sensitive batch | gpt-5.2-nano | Lowest cost per token |
| Maximum quality | gpt-5.2 + temperature=0.0 | Deterministic, highest quality |
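
The guide above collapses to a simple lookup. The task labels and helper name here are illustrative, not part of AFK:

```python
MODEL_BY_TASK = {
    "simple_qa": "gpt-5.2-nano",
    "general_tools": "gpt-5.2-mini",
    "complex_reasoning": "gpt-5.2",
    "cost_sensitive_batch": "gpt-5.2-nano",
    "max_quality": "gpt-5.2",
}

def pick_model(task: str) -> str:
    # Unknown tasks default to the balanced general-purpose model.
    return MODEL_BY_TASK.get(task, "gpt-5.2-mini")
```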

Next steps