Fail Fast, Fix Faster: Why Faster AI Models Beat Smarter Ones
Intuition says the smartest model should win. The model that reasons deeper, thinks longer, produces better results—that's the one you'd pick for an agentic coding loop.
But intuition is wrong here. In practice, a model that's 10x faster but only marginally competent often beats a frontier model that's far more capable per attempt. The mathematics explain why.
This is the counterintuitive insight that AJ Fisher explores in his work on loop velocity in AI-assisted engineering. The conventional wisdom assumes that reasoning quality dominates. Spend more tokens on reasoning, get better results. Let models think harder about problems, ship better code. Simple cause and effect.
But agentic loops aren't simple cause and effect. They're iterative. The model makes an attempt. It fails or produces something incomplete. The loop tries again. It refines. It improves. Each iteration compounds.
Now compare two scenarios. In the first, a frontier model takes 30 seconds to reason through a problem and produces an excellent first attempt; it might need two iterations to perfect, for a total of 60 seconds. In the second, a faster diffusion-based model like Mercury 2 produces an attempt in about 3 seconds but only gets it 80% right. Even with time to evaluate each attempt, the loop still turns over roughly twelve times a minute. After just two minutes, those iterations have compounded into convergence.
The difference is architectural. Autoregressive models generate tokens sequentially—one token at a time, one after another. That's fundamentally serial. It's optimized for quality, but quality comes at a cost to speed. Diffusion models refine outputs in parallel, removing that serial bottleneck entirely. They're optimized for iteration speed.
In a loop where each attempt improves a solution by even 20%, loop velocity becomes the dominant factor. The fast model that can attempt dozens of times per minute compounds improvement faster than the slow model that produces perfect output every few minutes.
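That compounding claim fits on a napkin. The sketch below models each attempt as closing a fixed fraction of the remaining gap to a correct solution: 80% per attempt for the fast model (matching the "80% right" scenario above) and an assumed 95% per attempt for the frontier model. These specific figures, and the two-minute window, are illustrative assumptions, not numbers from the talk.

```python
# Napkin math for loop velocity vs per-attempt quality.
# Assumption: each attempt closes a fixed fraction of the remaining gap.

def remaining_gap(step_time_s, gap_closed_per_step, total_time_s):
    """Fraction of the problem still unsolved after total_time_s."""
    attempts = int(total_time_s // step_time_s)
    return (1 - gap_closed_per_step) ** attempts

# Frontier model: 30 s per attempt, each attempt closes 95% of the gap.
slow = remaining_gap(step_time_s=30, gap_closed_per_step=0.95, total_time_s=120)

# Fast model: 3 s per attempt, each attempt closes only 80% of the gap.
fast = remaining_gap(step_time_s=3, gap_closed_per_step=0.80, total_time_s=120)

print(f"frontier model,  4 attempts: gap = {slow:.2e}")
print(f"fast model,     40 attempts: gap = {fast:.2e}")
```

Under these assumptions the frontier model wins on any single attempt, but the fast model's forty attempts compound past it well inside the two-minute window. The crossover depends entirely on the ratio of iteration rates to per-attempt improvement, which is the calculation the talk works through.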
Fisher demonstrates this with live code examples and napkin mathematics. It's not handwaving about optimization; it's a concrete calculation of how iteration rate and improvement rate interact to determine convergence speed.
There's a systems-level implication here about how organizations should think about model selection for agentic engineering tasks. The instinct is to reach for the most capable model available. But if the task is agentic—if the system is designed to iterate—then loop velocity might matter more than raw capability. You might genuinely get better results, faster, with a model that's "worse" in isolation but "better" in context.
This connects to broader thinking about how engineering practice changes when you're building with AI agents. The bottleneck isn't individual model quality. It's iteration speed, feedback loops, how quickly you can try ideas and learn from results. That's a different optimization problem than asking for the best possible output from a single model.
AJ Fisher is a technologist and writer working at the intersection of AI, web, media, and digital innovation. A regular speaker at Web Directions conferences, he brings a pragmatic, builder-first perspective to how emerging technologies reshape software engineering practice. He writes at ajfisher.me on topics from agentic coding workflows and local LLM setups to the strategic implications of AI adoption in the enterprise. His perspective is grounded in actually building systems, not theoretical prediction.
His talk at AI Engineer Melbourne 2026 challenges assumptions about what "better" means in the context of agentic systems—and why you might want fast and iterable over slow and brilliant.
See AJ Fisher at AI Engineer Melbourne 2026, June 3-4. Tickets at https://aiengineer.webdirections.org
