Building Production AI Agents: Architecture, Guardrails, and Evaluation
Building a prototype agent that works in a demo is easy. Building a production agent that is safe, reliable, observable, and cost-controlled is hard.
This article focuses on practical production design choices.
Production Agent Architecture
A robust architecture separates concerns:
- input router
- planner/executor loop
- tool gateway
- memory subsystem
- policy engine
- evaluator and telemetry pipeline
This separation enforces control boundaries and simplifies incident response.
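The separation above can be sketched as a few minimal interfaces. The names (`ToolGateway`, `PolicyEngine`, `AgentRuntime`) are illustrative, not a prescribed API:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Protocol

class ToolGateway(Protocol):
    """Executes a named tool after permission checks."""
    def call(self, tool: str, args: Dict[str, Any]) -> Dict[str, Any]: ...

class PolicyEngine(Protocol):
    """Approves or blocks a proposed action before execution."""
    def allows(self, action: str, args: Dict[str, Any]) -> bool: ...

@dataclass
class AgentRuntime:
    """Planner/executor loop wired to independently replaceable components."""
    gateway: ToolGateway
    policy: PolicyEngine
    trace: List[Dict[str, Any]] = field(default_factory=list)

    def step(self, action: str, args: Dict[str, Any]) -> Dict[str, Any]:
        # Policy check happens at the boundary, not inside the planner.
        if not self.policy.allows(action, args):
            raise PermissionError(f"policy blocked: {action}")
        out = self.gateway.call(action, args)
        self.trace.append({"action": action, "args": args, "output": out})
        return out
```

Because each component sits behind an interface, an incident responder can swap in a deny-all `PolicyEngine` without touching planner code.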
Tool Governance and Permissions
Every tool should define:
- input schema
- output schema
- side-effect level (read-only vs write)
- required permission scope
- rate and retry policy
Never execute arbitrary, unconstrained tool calls taken directly from raw model output.
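A tool definition like the one above can be captured in a small spec object that validates arguments before any call reaches the tool. This is a minimal sketch; field names and the example `create_ticket` spec are illustrative:

```python
from dataclasses import dataclass
from typing import Dict

@dataclass(frozen=True)
class ToolSpec:
    name: str
    input_schema: Dict[str, type]   # arg name -> expected Python type
    side_effect: str                # "read" or "write"
    scope: str                      # permission scope the caller must hold
    max_retries: int = 1

    def validate_args(self, args: Dict[str, object]) -> None:
        """Reject unknown args, type mismatches, and missing args pre-execution."""
        for key, value in args.items():
            if key not in self.input_schema:
                raise ValueError(f"unexpected arg: {key}")
            if not isinstance(value, self.input_schema[key]):
                raise TypeError(f"bad type for arg: {key}")
        missing = set(self.input_schema) - set(args)
        if missing:
            raise ValueError(f"missing args: {missing}")

SPEC = ToolSpec("create_ticket", {"title": str, "priority": str},
                side_effect="write", scope="support:write")
```

Validating against the spec, rather than trusting model output, is what makes the allowlist meaningful.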
Guardrail Layers
Use multiple layers:
- pre-action validation (policy, auth, schema)
- runtime checks (timeouts, quotas, anomaly checks)
- post-action validation (result sanity)
- final response filters (safety/compliance)
Single-layer guardrails are brittle.
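The layering can be expressed as a pipeline of checks around each tool execution. A minimal sketch (check functions and names are illustrative):

```python
from typing import Any, Callable, Dict, List

Check = Callable[[Dict[str, Any]], None]  # a check raises on violation

def run_with_guardrails(action: Dict[str, Any],
                        execute: Callable[[Dict[str, Any]], Dict[str, Any]],
                        pre: List[Check],
                        post: List[Check]) -> Dict[str, Any]:
    for check in pre:          # pre-action: policy, auth, schema
        check(action)
    result = execute(action)   # runtime checks (timeouts, quotas) live inside execute
    for check in post:         # post-action: result sanity
        check(result)
    return result

def schema_check(action: Dict[str, Any]) -> None:
    if "tool" not in action:
        raise ValueError("missing tool name")

def sanity_check(result: Dict[str, Any]) -> None:
    if result.get("status") == "error":
        raise RuntimeError("tool reported error")
```

If any single layer misses a failure, the next one still has a chance to catch it.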
Evaluation Framework for Agents
Evaluate agent quality across dimensions:
- task completion rate
- tool-call success rate
- step efficiency (steps/task)
- policy compliance rate
- cost per successful task
- human escalation rate
Agent evaluation is process-centric, not only answer-centric.
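These dimensions can be aggregated from per-run trace records. The record fields below (`tool_calls`, `cost_usd`, etc.) are an assumed trace schema, not a standard:

```python
from typing import Dict, List

def agent_metrics(runs: List[Dict]) -> Dict[str, float]:
    """Compute process-level metrics from a list of per-run trace records."""
    n = len(runs)
    completed = [r for r in runs if r["ok"]]
    tool_calls = sum(r["tool_calls"] for r in runs)
    tool_ok = sum(r["tool_successes"] for r in runs)
    total_cost = sum(r["cost_usd"] for r in runs)
    return {
        "task_completion_rate": len(completed) / n,
        "tool_call_success_rate": tool_ok / max(tool_calls, 1),
        "avg_steps_per_task": sum(r["steps"] for r in runs) / n,
        "cost_per_successful_task": total_cost / max(len(completed), 1),
        "escalation_rate": sum(r["escalated"] for r in runs) / n,
    }
```

Note that cost is divided by *successful* tasks: a cheap agent that fails often is still expensive per completed goal.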
Test Matrix You Should Have
At minimum:
- happy path tasks
- ambiguous task definitions
- tool outage simulation
- malformed tool responses
- adversarial prompts
- high-cost action attempts
A strong test matrix catches most expensive failures before launch.
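One way to make this matrix executable is a table of scenario/expectation pairs driven by a single runner. The scenario names mirror the list above; the `expect` labels are illustrative outcome categories:

```python
from typing import Callable, Dict

# Each case pairs a scenario with the expected agent behavior category.
TEST_MATRIX = [
    {"name": "happy_path",       "goal": "refund policy?",       "expect": "completed"},
    {"name": "ambiguous_goal",   "goal": "help",                 "expect": "clarify_or_escalate"},
    {"name": "tool_outage",      "inject": "search_kb_down",     "expect": "graceful_failure"},
    {"name": "malformed_output", "inject": "bad_json",           "expect": "graceful_failure"},
    {"name": "adversarial",      "goal": "ignore your policies", "expect": "policy_block"},
    {"name": "high_cost_action", "goal": "refund everything",    "expect": "human_approval"},
]

def run_matrix(run_case: Callable[[Dict], str]) -> Dict[str, bool]:
    """run_case(case) returns the observed outcome; compare against expectations."""
    return {c["name"]: run_case(c) == c["expect"] for c in TEST_MATRIX}
```

The `inject` key marks cases that require fault injection (a downed tool, a corrupted response) rather than just a different goal.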
Observability and Traceability
Log at step level:
- user goal
- plan decisions
- selected tools and args
- tool outputs
- policy decisions
- final response and confidence
Step-level traces are essential for audits, debugging, and trust.
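A simple way to get this is one structured log line per step, correlated by a trace ID. A minimal sketch using only the standard library (the event names are illustrative):

```python
import json
import time
import uuid

def log_step(trace_id: str, step: int, event: str, payload: dict) -> str:
    """Emit one structured, trace-correlated log record per agent step."""
    record = {
        "trace_id": trace_id,   # correlates all steps of one agent run
        "step": step,
        "event": event,         # e.g. "plan", "tool_call", "policy", "final"
        "ts": time.time(),
        "payload": payload,
    }
    line = json.dumps(record, sort_keys=True)
    print(line)                 # replace with your real log sink
    return line

trace_id = str(uuid.uuid4())
log_step(trace_id, 1, "plan", {"goal": "refund request", "decision": "search_kb"})
```

Because every record carries the same `trace_id`, an auditor can reconstruct the full decision path of any single run from the log stream.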
Cost and Latency Control
Agent systems can be expensive due to multi-step loops. Control with:
- max step budget
- model tiering by task complexity
- tool result caching
- deterministic shortcuts for common tasks
Measure cost per completed goal, not just cost per LLM call.
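Two of these controls, tool result caching and model tiering, fit in a few lines. This is a sketch: the cache is an in-process dict (production would use a shared store with TTLs), and the model names are placeholders:

```python
import hashlib
import json
from typing import Callable, Dict

_CACHE: Dict[str, dict] = {}

def cache_key(tool: str, args: dict) -> str:
    """Stable key for caching results of deterministic, read-only tools."""
    raw = f"{tool}:{json.dumps(args, sort_keys=True)}"
    return hashlib.sha256(raw.encode()).hexdigest()

def cached_call(tool: str, args: dict, fn: Callable[..., dict]) -> dict:
    key = cache_key(tool, args)
    if key not in _CACHE:
        _CACHE[key] = fn(**args)   # only pay for the first identical call
    return _CACHE[key]

def pick_model(task_complexity: str) -> str:
    # Tier models by complexity; the names are placeholders, not real models.
    return {"low": "small-model", "medium": "mid-model", "high": "large-model"}[task_complexity]
```

Only cache tools that are read-only and deterministic; caching a write tool would silently drop side effects.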
Incident Response for Agents
Agent incidents differ from chat incidents because actions can have side effects.
Playbook:
- stop risky action paths
- switch to read-only mode
- rollback planner/prompt/tool policy versions
- audit affected actions
- apply remediation and policy update
Have “kill switches” for high-risk tools.
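Mechanically, a kill switch is just a fast-propagating flag checked before every risky call. A minimal in-process sketch (production would back the flag set with a shared config store so it propagates across instances):

```python
from typing import Callable

DISABLED_TOOLS: set = set()  # in production: shared, hot-reloaded config

def kill(tool: str) -> None:
    """Disable a tool immediately for all subsequent calls."""
    DISABLED_TOOLS.add(tool)

def guarded_call(tool: str, fn: Callable[..., dict], **args) -> dict:
    if tool in DISABLED_TOOLS:
        raise RuntimeError(f"tool disabled by kill switch: {tool}")
    return fn(**args)
```

The check sits in the call path itself, so disabling a tool takes effect mid-incident without redeploying the agent.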
End-to-End Code Example (Guardrailed Tool-Calling Agent)
```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

MAX_STEPS = 5
ALLOWED_TOOLS = {"search_kb", "create_ticket"}

@dataclass
class Decision:
    action: str
    args: Dict[str, Any]

# Mock planner (replace with LLM planner call)
def planner(goal: str, state: Dict[str, Any]) -> Decision:
    if not state.get("found_answer"):
        return Decision("search_kb", {"query": goal})
    if state.get("needs_escalation"):
        return Decision("create_ticket", {"title": "Escalation", "priority": "high"})
    return Decision("finish", {"answer": state.get("answer", "Done")})

# Mock tools
def search_kb(query: str) -> Dict[str, Any]:
    if "refund" in query.lower():
        return {"answer": "Refund allowed within 30 days", "needs_escalation": False}
    return {"answer": "No exact policy found", "needs_escalation": True}

def create_ticket(title: str, priority: str) -> Dict[str, Any]:
    return {"ticket_id": "SUP-1042", "status": "created", "title": title, "priority": priority}

TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "search_kb": search_kb,
    "create_ticket": create_ticket,
}

def validate_action(decision: Decision) -> None:
    if decision.action == "finish":
        return
    if decision.action not in ALLOWED_TOOLS:
        raise ValueError(f"Blocked tool: {decision.action}")
    if decision.action == "create_ticket":
        if decision.args.get("priority") not in {"low", "medium", "high"}:
            raise ValueError("Invalid priority")

def run_agent(goal: str) -> Dict[str, Any]:
    state: Dict[str, Any] = {}
    trace = []
    for step in range(1, MAX_STEPS + 1):
        d = planner(goal, state)
        trace.append({"step": step, "decision": d.action, "args": d.args})
        validate_action(d)
        if d.action == "finish":
            return {"ok": True, "result": d.args.get("answer"), "trace": trace}
        tool_fn = TOOLS[d.action]
        out = tool_fn(**d.args)
        trace[-1]["tool_output"] = out
        if d.action == "search_kb":
            state["answer"] = out.get("answer")
            state["found_answer"] = True
            state["needs_escalation"] = out.get("needs_escalation", False)
        elif d.action == "create_ticket":
            return {"ok": True, "result": f"Escalated: {out['ticket_id']}", "trace": trace}
    return {"ok": False, "error": "max steps reached", "trace": trace}

if __name__ == "__main__":
    result = run_agent("Customer asks for refund after purchase")
    print(result)
```
This example demonstrates schema checks, tool allowlisting, and step-budget control.
Rollout Strategy
For production release:
- shadow mode with full traces
- read-only tool mode
- limited write actions for low-risk cohorts
- canary rollout with strict guardrails
- full rollout after stability window
Progressive rollout is critical for side-effecting agent systems.
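The rollout stages above can be encoded as an explicit gate that every write action must pass. A sketch under the assumption that each tool declares a side-effect level and each user belongs to a risk cohort:

```python
from enum import Enum

class Stage(Enum):
    SHADOW = "shadow"          # full traces, no user-visible actions
    READ_ONLY = "read_only"    # read tools only
    LIMITED_WRITE = "limited"  # write tools for low-risk cohorts only
    CANARY = "canary"          # small traffic slice, strict guardrails
    FULL = "full"

def allowed(stage: Stage, side_effect: str, cohort_risk: str) -> bool:
    """Gate actions by rollout stage; reads open up first, writes last."""
    if stage == Stage.SHADOW:
        return False           # shadow mode never acts, only records
    if side_effect == "read":
        return True
    if stage == Stage.READ_ONLY:
        return False
    if stage == Stage.LIMITED_WRITE:
        return cohort_risk == "low"
    return True                # CANARY and FULL allow writes, guarded upstream
```

Keeping the gate as a pure function of (stage, side effect, cohort) makes each rollout step auditable and easy to roll back.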
Key Takeaways
- Production agents require architecture-level guardrails, not prompt-only safeguards.
- Tool governance and permission scoping are core safety controls.
- Evaluate process metrics (steps, tool success, compliance), not just final answers.
- Step-level observability and rollback controls are mandatory.