Getting Up to Speed on Multi-Agent Systems, Part 4: Wave 2 (Why It Breaks)
Surveys three empirical papers (MAST's 14-failure-mode taxonomy across 1,600 traces, MAS-FIRE's fault-injection framework, and Silo-Bench) to show that multi-agent LLM systems fail 41–87% of the time and that information synthesis, not coordination, is the core bottleneck.
May 03, 2026 · tech · Christopher Meiklejohn
Topics
- multi-agent-systems
- ai-agents
- reliability
- agent-coordination
- distributed-systems
Cited by
- Agent coordination
How multiple LLM-based agents divide work, share state, and resolve disagreements, and why coordination structure that mismatches task structure is a primary source of multi-agent system failure.
- AI agents
AI agents are LLM-powered systems that plan, act, and iterate autonomously; active research and engineering practice reveal deep tensions between coordination complexity, reliability, tool design, and the human oversight they still require.
- Distributed systems
Distributed systems theory supplies the vocabulary and failure models that recurring engineering problems demand, from durable execution frameworks to multi-agent LLM coordination to merge queue consistency bugs.
- Multi-agent systems
LLM-based multi-agent systems coordinate multiple AI agents on decomposed tasks, but empirical work shows failure rates of 41–87%, with information synthesis rather than coordination being the core bottleneck.
Related (by topic)
- Your agent loves MCP as much as you love GUIs
- The Orchestrator Isn't Your Moat
- databricks-solutions/ai-dev-kit
- Scaling Managed Agents: Decoupling the brain from the hands
- Don't Prompt Your Agent for Reliability — Engineer It
- Agentic Coding is a Trap
- What CI Actually Looks Like at a 100-Person Team
- From Flaky to Flawless: Angular API Response Management with Zod