Wiki: LLM Agents

An LLM agent is a language model embedded in a loop: it perceives inputs, invokes tools, stores intermediate state, and produces actions rather than just text. The basic architecture is well-established. What the current literature argues about, loudly, is how to make agents reliable enough to trust with real work.

The most consistent finding across recent engineering accounts is that prompt engineering is the wrong lever for reliability. A data engineering agent evolved through three architectures before its builders concluded that tool design, stable IDs, and context visibility outperformed any amount of prompt tuning (Don’t Prompt Your Agent for Reliability). The same thesis appears in a more direct form elsewhere: complex tasks need deterministic control flow encoded in software, with explicit state transitions and validation checkpoints, not increasingly elaborate prompt chains (Agents Need Control Flow, Not More Prompts). The Anthropic engineering team applied this concretely by building a two-agent harness, consisting of an initializer and an incremental worker, that persists state across many context windows so that Claude can make consistent progress on long tasks without losing its place (Effective Harnesses for Long-Running Agents). Their Managed Agents service pushes further by separating harness, session log, and sandbox into stable interfaces, enabling multi-brain, multi-sandbox architectures and cutting latency significantly (Scaling Managed Agents).
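
To make "deterministic control flow with explicit state transitions and validation checkpoints" concrete, here is a minimal Python sketch. It is not taken from any of the cited articles: the names Step, TaskState, and run_task are hypothetical, and call_model and validate are stand-ins for whatever model client and checker a real system would use. The point is only that the loop, the retry budget, and the pass/fail decision live in ordinary code rather than in the prompt.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable


class Step(Enum):
    """Explicit states of the (hypothetical) task workflow."""
    PLAN = auto()
    EXECUTE = auto()
    VALIDATE = auto()
    DONE = auto()
    FAILED = auto()


@dataclass
class TaskState:
    step: Step = Step.PLAN
    attempts: int = 0
    plan: str = ""
    output: str = ""
    history: list[Step] = field(default_factory=list)  # visited states, for debugging


def run_task(
    call_model: Callable[[str], str],   # assumption: any function mapping prompt -> text
    validate: Callable[[str], bool],    # assumption: a deterministic checker, e.g. a test run
    goal: str,
    max_attempts: int = 3,
) -> TaskState:
    """Drive the model through PLAN -> EXECUTE -> VALIDATE with a bounded retry loop."""
    state = TaskState()
    while state.step not in (Step.DONE, Step.FAILED):
        state.history.append(state.step)
        if state.step is Step.PLAN:
            state.plan = call_model(f"Plan: {goal}")
            state.step = Step.EXECUTE
        elif state.step is Step.EXECUTE:
            state.attempts += 1
            state.output = call_model(f"Execute: {state.plan}")
            state.step = Step.VALIDATE
        elif state.step is Step.VALIDATE:
            # Validation checkpoint: code, not the model, decides whether work is done.
            if validate(state.output):
                state.step = Step.DONE
            elif state.attempts < max_attempts:
                state.step = Step.EXECUTE
            else:
                state.step = Step.FAILED
    return state
```

Because the model never decides on its own that the task is finished, a premature "done" from the model simply fails the checkpoint and triggers another attempt or a clean failure.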

Memory is a second major axis of agent engineering. Systems span a wide spectrum (AI Memory Systems Feature Comparison), but a recurring argument is that the standard storage metaphor is wrong. Agent memory fails not because of retrieval but because systems store assertions without provenance, confidence, or revision history; the right model is belief maintenance (Agent memory is a belief-maintenance problem). The zerostack coding agent takes a different tradeoff, using plain Markdown files with keyword search rather than vector stores, trading recall sophistication for minimal RAM and no infrastructure dependencies (Designing Memory for zerostack). Recursive Language Models offer a third approach, keeping data in a REPL environment and letting the model selectively pull it into context, sidestepping context rot (The Potential of RLMs).
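
As an illustration of the belief-maintenance framing, the following Python sketch stores each memory as a belief with provenance, confidence, and a revision history instead of a bare assertion. It is an assumption-laden sketch, not the design from the cited article: Revision, BeliefStore, and the "latest revision wins" policy are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Revision:
    value: str            # what the agent currently holds to be true
    source: str           # provenance: where the claim came from
    confidence: float     # 0.0 to 1.0, assigned by the caller
    recorded_at: datetime


class BeliefStore:
    """Hypothetical in-memory store that never overwrites: each update appends a revision."""

    def __init__(self) -> None:
        self._beliefs: dict[str, list[Revision]] = {}

    def assert_belief(self, key: str, value: str, *, source: str, confidence: float) -> None:
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")
        revision = Revision(value, source, confidence, datetime.now(timezone.utc))
        self._beliefs.setdefault(key, []).append(revision)

    def current(self, key: str) -> Optional[Revision]:
        # Assumed policy: the most recent revision is the current belief.
        revisions = self._beliefs.get(key)
        return revisions[-1] if revisions else None

    def history(self, key: str) -> list[Revision]:
        # Full revision history, oldest first, so earlier beliefs stay auditable.
        return list(self._beliefs.get(key, []))
```

A caller might record a low-confidence claim read from a README and later supersede it with a higher-confidence observation from a tool call; both remain in history(), so the agent can explain why it changed its mind.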

Observability is the third pillar. Traces alone do not improve agentic systems; attaching feedback signals (user ratings, indirect behavioral signals, LLM-as-judge scores, and deterministic rules) to those traces is what turns observability into a learning loop (Agent Observability Needs Feedback to Power Learning).
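
One way to picture feedback attached to traces is sketched below in Python. The Trace and Feedback types, the apply_rules helper, and the mean-score review threshold are hypothetical and chosen only for illustration; real observability tools define their own schemas and scoring.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Feedback:
    kind: str      # e.g. "user_rating", "behavioral", "llm_judge", "rule"
    score: float   # normalized to 0.0 (bad) .. 1.0 (good) in this sketch
    note: str = ""


@dataclass
class Trace:
    trace_id: str
    input: str
    output: str
    feedback: list[Feedback] = field(default_factory=list)


def apply_rules(trace: Trace, rules: dict[str, Callable[[Trace], bool]]) -> None:
    """Run deterministic checks and attach each result to the trace as feedback."""
    for name, rule in rules.items():
        passed = rule(trace)
        trace.feedback.append(Feedback("rule", 1.0 if passed else 0.0, name))


def needs_review(trace: Trace, threshold: float = 0.5) -> bool:
    """Flag traces with no feedback, or with a mean score below the threshold."""
    if not trace.feedback:
        return True
    mean = sum(f.score for f in trace.feedback) / len(trace.feedback)
    return mean < threshold
```

Traces flagged by needs_review could then feed an evaluation set or a prompt and tool revision cycle, which is the learning loop the paragraph above describes.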

In practice, current agents still require significant human oversight. An honest account of building a social app with Claude found the agent consistently declaring work done after minimal checks, requiring manual verification of every feature despite 52 added guardrails (Babysitting the Agent). One proposed response is a “Slow Mode” that keeps the human involved at every step, trading throughput for genuine understanding of the code produced (Slow Mode). A contrasting data point comes from an account of Claude Fable running multi-hour agentic workflows autonomously and delivering complex software, with the observation that the human role has shifted from doing to commissioning (What it feels like to work with Mythos). Simon Willison documents the same model autonomously inventing elaborate browser automation techniques to fix a two-line CSS issue, and notes that resourcefulness at this level makes unsandboxed agents a genuine safety concern (Claude Fable is relentlessly proactive).

At the multi-agent scale, a thorough survey of the research landscape identifies two waves: 2023 coordination proofs-of-concept (CAMEL, ChatDev, MetaGPT, AutoGen) that established patterns, and 2025 reliability measurement studies finding failure rates of 41 to 87 percent in production (Getting Up to Speed on Multi-Agent Systems, Part 4). Shared failure modes include missing concurrency control, no escalation paths, and inter-agent reasoning failures that are structurally harder to fix than prompt-level issues (Getting Up to Speed on Multi-Agent Systems, Part 3). Verification across a modality shift, that is, checking work in a different representation than the one it was produced in, is identified as the most reliable output-quality mechanism available (Getting Up to Speed on Multi-Agent Systems, Part 6). The field is, in effect, rediscovering distributed systems problems without the vocabulary to name them (Getting Up to Speed on Multi-Agent Systems, Part 8).