Wiki: AI agents

An AI agent, in practice, is an LLM embedded in a loop: it receives a goal, selects tools, observes results, and iterates until done or stuck. That loop can be simple (a single model running sequentially) or elaborate, with specialized sub-agents handing work to one another. The central tension across most current writing on the topic is not whether agents work, but how to keep them reliable as the scope of their autonomy grows.
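
The loop described above can be sketched in a few lines. This is a minimal illustration, not any particular framework's API: the tool table, the `scripted_policy` function standing in for the LLM call, and the step budget are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class Step:
    tool: str         # name of the tool the policy chose
    argument: str     # input passed to the tool
    observation: str  # what the tool returned

@dataclass
class AgentRun:
    goal: str
    steps: list = field(default_factory=list)
    status: str = "running"  # "running", "done", or "stuck"

# Hypothetical tools: plain functions from string to string.
TOOLS: dict = {
    "search": lambda query: f"3 results for '{query}'",
    "summarize": lambda text: f"summary of: {text}",
}

def scripted_policy(run: AgentRun) -> Optional[tuple]:
    """Stand-in for the LLM call: returns (tool, argument), or None when the goal looks met."""
    if not run.steps:
        return ("search", run.goal)
    if len(run.steps) == 1:
        return ("summarize", run.steps[-1].observation)
    return None

def run_agent(goal: str, policy: Callable, max_steps: int = 5) -> AgentRun:
    run = AgentRun(goal)
    for _ in range(max_steps):
        action = policy(run)           # select a tool
        if action is None:             # policy judges the goal complete
            run.status = "done"
            return run
        tool, argument = action
        if tool not in TOOLS:          # unknown tool: stop rather than guess
            run.status = "stuck"
            return run
        observation = TOOLS[tool](argument)                 # act
        run.steps.append(Step(tool, argument, observation)) # observe
    run.status = "stuck"               # step budget exhausted
    return run

result = run_agent("agent memory benchmarks", scripted_policy)
print(result.status, [s.tool for s in result.steps])  # done ['search', 'summarize']
```

The explicit step budget is what turns "stuck" into an observable outcome rather than an endless loop.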

The case for multi-agent architectures is that parallelism and specialization can handle tasks no single context window can hold. Anthropic’s internal harness work describes a GAN-inspired planner/generator/evaluator triad that runs multi-hour autonomous coding sessions, explicitly designed to overcome the self-evaluation bias a single agent exhibits (Harness Design for Long-Running Application Development). Poolday’s Creator-1 routes video editing across 100+ generative models through a similar orchestration layer, producing fully editable outputs rather than static renders (Poolday). The case against is quantitative: Stanford and Google/MIT research cited by Ben Dickson shows coordination overhead can amplify errors up to 17x and cut tool-handling efficiency by a factor of 2 to 6, making single-agent systems the right default for most tasks (How to Choose Between Single- and Multi-Agent Solutions).
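
As a rough sketch of the planner/generator/evaluator split, the example below uses three stub functions in place of model calls. It is not the harness described in the cited work: the function names, the "[with tests]" acceptance rule, and the retry limit are hypothetical and exist only to show an evaluator that judges drafts it did not write.

```python
from typing import Optional

def planner(goal: str) -> list:
    """Hypothetical planner: splits a goal into ordered subtasks."""
    return [f"{goal} / part {n}" for n in (1, 2)]

def generator(task: str, feedback: Optional[str]) -> str:
    """Hypothetical generator: produces a draft, revising it when given feedback."""
    draft = f"draft for {task}"
    if feedback is not None:
        draft += " [with tests]"
    return draft

def evaluator(task: str, draft: str) -> Optional[str]:
    """Separate critic: returns None to accept, or a feedback string to reject."""
    if "[with tests]" not in draft:
        return "missing tests"
    return None

def run_triad(goal: str, max_rounds: int = 3) -> dict:
    accepted = {}
    for task in planner(goal):
        feedback = None
        for _ in range(max_rounds):
            draft = generator(task, feedback)
            feedback = evaluator(task, draft)  # the generator never grades itself
            if feedback is None:
                accepted[task] = draft
                break
        else:
            # No draft passed within the round budget: fail loudly.
            raise RuntimeError(f"no accepted draft for {task!r} after {max_rounds} rounds")
    return accepted

for task, draft in run_triad("build login page").items():
    print(task, "->", draft)
```

Each extra role adds at least one more call per round, which is the coordination overhead the cited research weighs against the benefit.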

Verification is the structural problem that multi-agent architectures expose. Christopher Meiklejohn’s survey of verification patterns argues that the key variable is modality shift: checking work in a different representation than it was produced in, because agents are systematically overconfident in their own output (Getting Up to Speed on Multi-Agent Systems, Part 6: Verification Patterns). Brian Suh makes a related point about control flow: reliability comes from deterministic state transitions and validation checkpoints encoded in software, not from more elaborate prompting (Agents Need Control Flow, Not More Prompts). The 12-factor-agents project extends this into a concrete principle, arguing that execution state and business state should be unified into a single context-window-derived thread, making serialization, debugging, and recovery tractable (humanlayer/12-factor-agents).
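
One way to picture deterministic transitions, validation checkpoints, and a single serializable thread is the sketch below. It is an illustration under assumptions, not code from either cited source: the state names, the event shapes, and the rule that valid output is a JSON object with a `patch` key are all invented for the example.

```python
import json

# Allowed transitions are fixed in code, not negotiated in a prompt.
TRANSITIONS = {
    "draft": {"validated", "failed"},
    "validated": {"deployed"},
    "failed": {"draft"},
    "deployed": set(),
}

def current_state(thread: list) -> str:
    """Derive the execution state by replaying the thread; nothing is stored elsewhere."""
    state = "draft"
    for event in thread:
        if event["type"] == "transition":
            state = event["to"]
    return state

def transition(thread: list, to: str) -> list:
    state = current_state(thread)
    if to not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state} -> {to}")
    return thread + [{"type": "transition", "from": state, "to": to}]

def checkpoint(thread: list, model_output: str) -> list:
    """Validation checkpoint: a deterministic check, not the model, picks the next state."""
    thread = thread + [{"type": "model_output", "text": model_output}]
    try:
        parsed = json.loads(model_output)
        ok = isinstance(parsed, dict) and "patch" in parsed
    except json.JSONDecodeError:
        ok = False
    return transition(thread, "validated" if ok else "failed")

thread = checkpoint([], '{"patch": "fix margin"}')
saved = json.dumps(thread)        # the whole run serializes as one list of events
restored = json.loads(saved)      # recovery is just loading the thread again
print(current_state(restored))    # validated
```

Because state is derived from the thread, pausing, resuming, and debugging a run all reduce to reading the same list of events.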

Memory is a second structural problem. The naive model, storing conversation history, fails over long sessions because agents accumulate stale or contradictory assertions with no mechanism to revise them. One framing proposes treating agent memory as a belief-maintenance problem rather than a storage problem, requiring provenance, confidence scores, scope, and supersession records (Agent memory is a belief-maintenance problem, not a storage problem). The vectorize-io/hindsight project approaches the same problem through biomimetic memory structures that separate world facts, experiences, and mental models (vectorize-io/hindsight). A live comparison of 74 agent memory systems across architecture and benchmark dimensions illustrates how fractured the solution space still is (AI Memory Systems — Feature Comparison).
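
The belief-maintenance framing can be made concrete with a small record type carrying the four fields named above. This is a sketch under assumptions: the `Belief` and `BeliefStore` names, the key/scope matching rule, and the example values are hypothetical and not taken from the cited article or from any of the projects mentioned.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Belief:
    id: int
    key: str                 # what the belief is about, e.g. "deploy.target"
    value: str
    source: str              # provenance: where the assertion came from
    confidence: float        # 0.0 to 1.0
    scope: str               # where the belief applies, e.g. "project:alpha"
    superseded_by: Optional[int] = None  # id of the belief that replaced this one

class BeliefStore:
    def __init__(self) -> None:
        self._beliefs: list = []

    def assert_belief(self, key: str, value: str, source: str,
                      confidence: float, scope: str) -> Belief:
        new = Belief(len(self._beliefs), key, value, source, confidence, scope)
        # Mark the live belief with the same key and scope as superseded;
        # it is kept, not deleted, so the revision history stays auditable.
        for old in self._beliefs:
            if old.key == key and old.scope == scope and old.superseded_by is None:
                old.superseded_by = new.id
        self._beliefs.append(new)
        return new

    def current(self, key: str, scope: str) -> Optional[Belief]:
        for belief in self._beliefs:
            if belief.key == key and belief.scope == scope and belief.superseded_by is None:
                return belief
        return None

    def history(self, key: str, scope: str) -> list:
        return [b for b in self._beliefs if b.key == key and b.scope == scope]

store = BeliefStore()
store.assert_belief("deploy.target", "staging", "chat turn 4", 0.6, "project:alpha")
store.assert_belief("deploy.target", "production", "release ticket", 0.9, "project:alpha")
print(store.current("deploy.target", "project:alpha").value)  # production
```

A plain conversation log would contain both assertions with nothing marking the first as stale; the supersession link is what lets a reader of memory tell them apart.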

At the infrastructure level, enterprise deployments require a governance layer (unified identity, policy enforcement, tool routing, and observability) that Speakeasy calls the AI control plane (AI Control Plane: Architecture and Vendors). Credential management is a specific pinch point: Latchkey addresses it by injecting API credentials locally so agents can authenticate against external services without ever receiving raw tokens (Latchkey: Credential Layer for Local AI Agents).
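
The general idea of local credential injection can be sketched as a thin layer between the agent and the network. The code below does not reflect Latchkey's actual interface: the in-memory secret table, the `AgentRequest` type, the host name, and the token are placeholders invented for illustration, and no request is actually sent.

```python
from dataclasses import dataclass, field

# Hypothetical local secret store; a real one would be an OS keychain or vault.
_SECRETS = {"api.example.com": "sk-local-demo-token"}

@dataclass
class AgentRequest:
    method: str
    host: str
    path: str
    headers: dict = field(default_factory=dict)

def inject_credentials(request: AgentRequest) -> dict:
    """Build the outbound request, adding the credential outside the agent's view."""
    if request.host not in _SECRETS:
        raise PermissionError(f"no credential configured for {request.host}")
    if any(name.lower() == "authorization" for name in request.headers):
        raise ValueError("agent-supplied Authorization headers are rejected")
    headers = dict(request.headers)  # copy, so the agent's object is never mutated
    headers["Authorization"] = f"Bearer {_SECRETS[request.host]}"
    return {
        "method": request.method,
        "url": f"https://{request.host}{request.path}",
        "headers": headers,
    }

def redact(response_text: str) -> str:
    """Scrub any echoed secret before the response goes back to the agent."""
    for secret in _SECRETS.values():
        response_text = response_text.replace(secret, "[REDACTED]")
    return response_text

agent_request = AgentRequest("GET", "api.example.com", "/v1/items")
outbound = inject_credentials(agent_request)
print(sorted(outbound["headers"]))   # ['Authorization']
print(agent_request.headers)         # {}
```

The agent only ever constructs and sees `agent_request`; the token exists only in the outbound copy, and `redact` covers the case where a service echoes it back.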

Capability is advancing faster than governance. Simon Willison documents Claude Fable 5 autonomously inventing elaborate browser automation techniques to debug a two-line CSS fix, and notes that the same resourcefulness makes unsandboxed agents genuinely dangerous (Claude Fable is relentlessly proactive). Frontier model task-completion horizons are doubling roughly every year; GPT-5.5 now handles approximately three-minute human tasks at 50% reliability without chain-of-thought reasoning, with safety implications for monitoring approaches that depend on CoT visibility (Estimating No-CoT Task-Completion Time Horizons of Frontier AI Models). The practical consequence, as Ethan Mollick observes from hands-on work with Claude 5 Fable, is that the human role has shifted from doing to commissioning: directing multi-hour autonomous workflows rather than executing steps (What it feels like to work with Mythos).