Not every use case needs the full agent loop. Sometimes you want to call an LLM directly with a specific prompt and get back a structured, schema-validated response. AFK’s LLMBuilder provides a fluent API for constructing LLM clients that can return Pydantic-validated objects directly, without the overhead of the agent run lifecycle. Use this pattern for classification, extraction, summarization, and any scenario where you want a single LLM call with a guaranteed output schema.

Example

import asyncio

from pydantic import BaseModel
from afk.llms import LLMBuilder
from afk.llms.types import LLMRequest, Message

# Define the output schema as a Pydantic model
class Summary(BaseModel):
    title: str
    bullets: list[str]

# Build an LLM client using the fluent builder
client = LLMBuilder().provider("openai").model("gpt-4.1-mini").profile("production").build()

async def main() -> None:
    # Make a structured request; client.chat() is awaited, so it must run inside a coroutine
    resp = await client.chat(
        LLMRequest(messages=[Message(role="user", content="Summarize incident timeline")]),
        response_model=Summary,
    )
    print(resp.structured_response)  # {"title": "...", "bullets": ["...", "..."]}
    print(resp.text)                 # The raw text response

asyncio.run(main())

The builder pattern

LLMBuilder uses a fluent (method-chaining) API to construct an LLM client with the exact configuration you need:

client = (
    LLMBuilder()
    .provider("openai")          # Which LLM provider to use
    .model("gpt-4.1-mini")       # Which model
    .profile("production")       # Apply a preset profile (retry, timeout, etc.)
    .build()                     # Return the configured LLMClient
)

Each method returns the builder instance, so calls can be chained. The .build() call at the end constructs the final LLMClient with all specified settings. Available builder methods:

| Method | Purpose |
| --- | --- |
| .provider(name) | Set the LLM provider ("openai", "litellm", "anthropic_agent"). |
| .model(name) | Set the model identifier. |
| .profile(name) | Apply a named configuration profile ("production", "development", etc.). |
| .settings(settings) | Replace the loaded LLMSettings. |
| .with_middlewares(stack) | Attach chat, stream, or embedding middleware. |
| .with_observers(observers) | Attach LLM lifecycle observers. |
| .with_cache(cache_backend) | Attach a cache backend instance or registered backend id. |
| .with_router(router) | Attach a router instance or registered router id. |
| .build() | Construct and return the LLMClient. |

Sampling controls are request fields, not builder methods. Set them on LLMRequest, for example LLMRequest(..., temperature=0.0, max_tokens=1000).
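
To make the chaining mechanics concrete, here is a minimal sketch of how a fluent builder can work in general. It is not AFK's implementation: SketchBuilder, SketchClient, and SketchRequest are hypothetical names invented for illustration, and the build-time validation is an assumption. It shows each setter returning the builder itself, and per-call sampling fields living on the request rather than on the builder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchClient:
    """Hypothetical stand-in for a built client (not AFK's LLMClient)."""
    provider: str
    model: str
    profile: Optional[str]


class SketchBuilder:
    """Hypothetical fluent builder; illustrates method chaining only."""

    def __init__(self) -> None:
        self._provider: Optional[str] = None
        self._model: Optional[str] = None
        self._profile: Optional[str] = None

    def provider(self, name: str) -> "SketchBuilder":
        self._provider = name
        return self  # returning self is what makes chaining possible

    def model(self, name: str) -> "SketchBuilder":
        self._model = name
        return self

    def profile(self, name: str) -> "SketchBuilder":
        self._profile = name
        return self

    def build(self) -> SketchClient:
        # Assumption for this sketch: reject a half-configured client at build time.
        if self._provider is None or self._model is None:
            raise ValueError("provider and model must be set before build()")
        return SketchClient(self._provider, self._model, self._profile)


@dataclass(frozen=True)
class SketchRequest:
    """Hypothetical request: per-call sampling lives here, not on the builder."""
    prompt: str
    temperature: float = 1.0
    max_tokens: Optional[int] = None


client = SketchBuilder().provider("openai").model("gpt-4.1-mini").profile("production").build()
request = SketchRequest(prompt="Summarize incident timeline", temperature=0.0, max_tokens=1000)
```

Keeping sampling on the request means one built client can serve calls with different temperatures or token limits without being rebuilt.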

Structured output with Pydantic

When you pass response_model=YourModel to client.chat(), the client instructs the LLM to return output that conforms to the model’s JSON schema. The response is parsed and validated against the Pydantic model:
  • If the LLM returns valid structured output, resp.structured_response contains the parsed dictionary and resp.text contains the raw response.
  • If the LLM returns output that does not match the schema, an LLMInvalidResponseError is raised.
This is powered by the LLM provider’s native structured output support (e.g., OpenAI’s response_format parameter) when available, with a fallback to prompt-based JSON extraction.
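
The fallback path can be pictured as "find a JSON object in free-form text, then check it against the schema". The following is a rough sketch of that idea using only the standard library. It is an illustration under stated assumptions, not AFK's actual extraction logic: extract_json_object, validate_summary, and SketchInvalidResponse are hypothetical names, and the hand-written field checks stand in for Pydantic validation of the Summary model from the example.

```python
import json
from typing import Any


class SketchInvalidResponse(Exception):
    """Hypothetical error standing in for a schema-mismatch failure."""


def extract_json_object(raw: str) -> dict[str, Any]:
    """Return the first JSON object embedded in free-form model text."""
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at index `start`
            value, _end = decoder.raw_decode(raw, start)
            return value
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next candidate
            start = raw.find("{", start + 1)
    raise SketchInvalidResponse("no JSON object found in response text")


def validate_summary(data: dict[str, Any]) -> dict[str, Any]:
    """Check the two fields of the Summary schema: title and bullets."""
    if not isinstance(data.get("title"), str):
        raise SketchInvalidResponse("'title' must be a string")
    bullets = data.get("bullets")
    if not isinstance(bullets, list) or not all(isinstance(b, str) for b in bullets):
        raise SketchInvalidResponse("'bullets' must be a list of strings")
    return data


raw_text = 'Here is the summary:\n{"title": "Outage", "bullets": ["DB failover", "Cache warm-up"]}\nDone.'
parsed = validate_summary(extract_json_object(raw_text))
```

In real use the validation step would be the Pydantic model itself; the manual checks here only keep the sketch dependency-free.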

When to use LLMBuilder vs Runner

| Use case | Approach |
| --- | --- |
| Single LLM call, no tools, no memory | LLMBuilder: simpler, faster, no lifecycle overhead. |
| Structured extraction or classification | LLMBuilder with response_model. |
| Multi-turn conversation with tools | Runner: provides the full agent loop with tool execution, policy, and memory. |
| Subagent delegation | Runner: only the runner supports subagent dispatch. |
| Event streaming to a UI | Runner with run_stream(). |
| Eval-driven development | Runner: evals require the full AgentResult lifecycle. |

Use LLMBuilder when you want precision and control over a single LLM interaction. Use Runner when you need the full agentic lifecycle.