Agents are services.
Treat every agent like a microservice: define a contract, measure it, version it, and keep it replaceable.
Why “agents as microservices” works
Single responsibility
Less prompt bloat, less ambiguity.
Replaceability
Swap one agent without retraining or rewriting the rest of the system.
Testability
Unit-test agent outputs with fixtures + eval rubrics.
Reliability
Isolate failures; retry or fallback per agent.
Scalability
Run micro-agents in parallel; small, focused agents are often cheaper than “one big brain”.
Governance
Scope permissions per agent (e.g., the finance agent can’t send email).
Observability
Metrics by agent: latency, cost, success rate.
Versioning
Agents are APIs; release v2 safely with canaries.
Composability
Build higher-level workflows from primitives.
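The "retry or fallback per agent" idea under Reliability can be sketched in a few lines of Python. This is a minimal sketch, assuming each agent is a plain callable that takes and returns text; `call_with_fallback`, `flaky_extractor`, and `rule_based_extractor` are hypothetical names used only for illustration, not part of any framework.

```python
from typing import Callable

# Assumption: an agent is any callable that maps input text to output text.
Agent = Callable[[str], str]


def call_with_fallback(primary: Agent, fallback: Agent, payload: str, retries: int = 2) -> str:
    """Try the primary agent up to `retries` times, then hand off to the fallback."""
    for _ in range(retries):
        try:
            return primary(payload)
        except Exception:
            continue  # the failure stays isolated to this one agent
    return fallback(payload)


def flaky_extractor(text: str) -> str:
    # Hypothetical agent that always fails, standing in for a timeout.
    raise TimeoutError("model timed out")


def rule_based_extractor(text: str) -> str:
    # Hypothetical cheap fallback that returns a safe default.
    return "total=unknown"


result = call_with_fallback(flaky_extractor, rule_based_extractor, "Invoice #1 ...")
print(result)
```

Because the wrapper only knows the agent's call signature, either agent can be swapped out without touching the caller, which is the replaceability point made above.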
The key: crisp contracts
Every micro-agent should have: a name and purpose, typed inputs, typed outputs, allowed tools and permissions, guardrails, an evaluation rubric, and known failure modes with fallback behavior.
{
  "name": "ExtractorAgent",
  "purpose": "Extract invoice totals into a strict schema.",
  "inputs": {
    "document_text": "string",
    "currency": "string"
  },
  "outputs": {
    "invoice_total": "number",
    "confidence": "number"
  },
  "allowed_tools": ["ocr.read", "json.validate"],
  "guardrails": ["no external calls", "no PII leakage"],
  "eval_rubric": ["schema valid", "total matches source"],
  "failure_modes": ["ambiguous totals", "missing currency"],
  "fallback": "Escalate to ValidatorAgent"
}
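One way to make the "schema valid" rubric item testable is to check an agent's output against the `outputs` section of the contract above. The following is a minimal sketch, not a full JSON Schema validator; the `TYPE_MAP` mapping from contract type names to Python types and the `validate_output` helper are assumptions introduced here for illustration.

```python
# The "outputs" section of the ExtractorAgent contract shown above.
CONTRACT_OUTPUTS = {"invoice_total": "number", "confidence": "number"}

# Assumed mapping from contract type names to Python types.
TYPE_MAP = {"number": (int, float), "string": (str,)}


def validate_output(output: dict, schema: dict) -> list[str]:
    """Return a list of contract violations; an empty list means the output conforms."""
    errors = []
    for field, type_name in schema.items():
        if field not in output:
            errors.append(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(output[field], bool) or not isinstance(output[field], TYPE_MAP[type_name]):
            errors.append(f"{field}: expected {type_name}")
    for field in output:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors


# Fixture-style checks: one conforming output, one that breaks the contract.
good = validate_output({"invoice_total": 1249.5, "confidence": 0.93}, CONTRACT_OUTPUTS)
bad = validate_output({"invoice_total": "1249.50"}, CONTRACT_OUTPUTS)
print(good)
print(bad)
```

A check like this can run as a unit test against recorded fixtures, so a contract violation surfaces before a new agent version reaches production.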
Operational pillars
Contracts
Input/output schema + tool boundaries. Everything is typed, auditable, and versioned.
Observability
Traces, evals, cost, latency, and regression alerts per agent.
Governance
Permissions, PII boundaries, and audit logs for every tool invocation.
Versioning
Ship v1/v2 agents like APIs with canary rollouts.
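A canary rollout for an agent can be as simple as routing a fixed share of requests to the new version. The sketch below assumes each request carries a stable string id and that versions are labeled "v1" and "v2"; `pick_version` and `canary_percent` are hypothetical names, and a real deployment would likely do this in a gateway or feature-flag system instead.

```python
import hashlib


def pick_version(request_id: str, canary_percent: int) -> str:
    """Route a stable slice of traffic to v2 based on a hash of the request id."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in the range 0..99
    return "v2" if bucket < canary_percent else "v1"


# The same request id always lands on the same version, so retries stay consistent.
versions = [pick_version(f"req-{i}", canary_percent=10) for i in range(1000)]
print(versions.count("v2"), "of 1000 requests routed to v2")
```

Hashing the request id instead of picking at random keeps routing deterministic, which makes per-version metrics and regression alerts easier to compare.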