The five levels
Level 1: Single Agent
One agent, one model, no tools. The simplest possible setup.

AFK features used: Agent, Runner, run_sync

Good for: Text classification, summarization, translation, simple Q&A

Move up when: The agent needs to take actions or access external data

Level 2: Agent + Tools
Add typed tool functions for the agent to call.

AFK features added: @tool, Pydantic models, FailSafeConfig

Good for: RAG, data lookup, calculations, API integrations

Move up when: Different parts of the task need different expertise or models

Level 3: Multi-Agent Delegation
A coordinator agent delegates to specialist subagents.

AFK features added: subagents, join policies, backpressure

Good for: Complex tasks, parallel work, specialist expertise, consensus

Move up when: Tasks take minutes, need async processing, or need queue-based reliability

Level 4: Queue-Backed Async
Decouple producers and consumers with task queues. Long-running jobs execute asynchronously.

AFK features added: TaskQueue, TaskItem, workers, dead-letter handling

Good for: Batch processing, background jobs, retryable pipelines, high-throughput workloads

Move up when: You need cross-system communication or external agent interop

Level 5: Cross-System A2A
Agents communicate across systems using the A2A protocol with authenticated endpoints.

AFK features added: A2AClient, A2AServer, auth providers, external adapters

Good for: Microservice agent meshes, third-party integrations, federated AI systems

This is the ceiling. Most applications don’t need Level 5.

Capability comparison
| Capability | L1 | L2 | L3 | L4 | L5 |
|---|---|---|---|---|---|
| Text generation | ✓ | ✓ | ✓ | ✓ | ✓ |
| Tool calling | — | ✓ | ✓ | ✓ | ✓ |
| Multi-agent delegation | — | — | ✓ | ✓ | ✓ |
| Async processing | — | — | — | ✓ | ✓ |
| Cross-system communication | — | — | — | — | ✓ |
| Policy engine | ✓ | ✓ | ✓ | ✓ | ✓ |
| Observability | ✓ | ✓ | ✓ | ✓ | ✓ |
| Evals | ✓ | ✓ | ✓ | ✓ | ✓ |
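
One way to read the comparison table is as a lookup from each capability to the first level that offers it. The sketch below is plain Python, not AFK API: `FIRST_LEVEL` and `minimum_level` are hypothetical names introduced for illustration, and the cross-cutting rows (policy engine, observability, evals) are assumed to be available at every level because the table shows no dash for them.

```python
# Illustrative only: plain Python, not part of AFK.
# First level at which each capability from the table becomes available.
FIRST_LEVEL = {
    "text generation": 1,
    "tool calling": 2,
    "multi-agent delegation": 3,
    "async processing": 4,
    "cross-system communication": 5,
    # Assumption: these rows carry no dash in the table, so treat them as Level 1.
    "policy engine": 1,
    "observability": 1,
    "evals": 1,
}


def minimum_level(required):
    """Return the lowest level that covers every required capability."""
    unknown = [c for c in required if c.lower() not in FIRST_LEVEL]
    if unknown:
        raise ValueError(f"unknown capabilities: {unknown}")
    # Levels are cumulative, so the most demanding capability decides.
    return max((FIRST_LEVEL[c.lower()] for c in required), default=1)
```

For example, `minimum_level(["Tool calling", "Observability"])` returns `2`, and an empty list returns `1`, which matches the advice to start with the simplest level that works.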
Decision guide
> [!TIP]
> Start at Level 1. The simplest system that works is the best system. Premature complexity is the most common mistake in agent design.
Signals to level up
| Signal | Current Level | Move to |
|---|---|---|
| Agent needs external data | 1 | 2 — Add tools |
| One prompt can’t cover all expertise | 2 | 3 — Split into specialists |
| Users waiting too long for results | 3 | 4 — Queue for async |
| Tasks fail and need to be retried automatically | 3 | 4 — Queue with DLQ |
| Other systems need to invoke your agents | 4 | 5 — Expose A2A |
| Third-party agents need to use your tools | 4 | 5 — Use MCP server |
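
The signals table can also be read as a small rule set. The following is a minimal sketch in plain Python rather than AFK API: the signal keys and the `next_level` helper are hypothetical names chosen for illustration, and the rule that a signal only counts at its listed current level is an assumption drawn from the table's "Current Level" column.

```python
# Illustrative only: plain Python, not part of AFK.
# Each signal maps to (current level, level to move to, suggested action).
SIGNALS = {
    "needs_external_data": (1, 2, "Add tools"),
    "one_prompt_insufficient": (2, 3, "Split into specialists"),
    "users_wait_too_long": (3, 4, "Queue for async"),
    "needs_automatic_retries": (3, 4, "Queue with DLQ"),
    "other_systems_invoke_agents": (4, 5, "Expose A2A"),
    "third_party_agents_use_tools": (4, 5, "Use MCP server"),
}


def next_level(current_level, observed_signals):
    """Return (level, actions) given the signals observed at the current level."""
    # Keep only signals whose "current level" matches where the system is now.
    matches = [
        SIGNALS[name]
        for name in observed_signals
        if SIGNALS[name][0] == current_level
    ]
    if not matches:
        # No applicable signal: stay put rather than add complexity early.
        return current_level, []
    target = max(move_to for _, move_to, _ in matches)
    return target, [action for _, _, action in matches]
```

Calling `next_level(3, ["users_wait_too_long", "needs_automatic_retries"])` returns `(4, ["Queue for async", "Queue with DLQ"])`. A signal that belongs to a higher level, such as `"users_wait_too_long"` observed at Level 1, leaves the level unchanged, so the sketch moves up at most one step at a time.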