How long is your loop?
A few months ago I sat down to work through some thoughts about the different ways developers are working with AI. The list seemed straightforward enough at the time. There was the spicy autocomplete you got in your IDE. There was the turn-by-turn chat with Claude or ChatGPT. There was the Git-integrated agent — Codex, Claude Code — that worked on a branch and handed you a pull request. And at the far end there was specification-driven development, where you wrote a spec and the system built what you described.
That sketch is still mostly right. But sitting with it again now, I can see what I was circling without quite naming. The thing that actually distinguishes these approaches is not the level of abstraction, or the size of the chunk of code being generated, or how far up the agentic ladder you’ve climbed. It’s how long the loop is between you and the agent. And the length of the loop is, in the end, a question about how much autonomy you are willing to grant.
The spectrum, viewed as loops
Spicy autocomplete is the tightest loop possible. The agent suggests the next few characters, the next line, the next function signature. You accept or you don’t, in the same breath as the keystroke. You read every character it produces because the act of reading is indistinguishable from the act of writing. There is no autonomy here in any meaningful sense. The agent is a faster typist than you, and a slightly better one, and that’s about it.
Turn-by-turn chat — Claude or ChatGPT, with or without artefacts — is a longer loop. You describe what you want, the agent produces a chunk of code, you read it, you run it, you come back with the bug or the next request. The loop might be measured in minutes rather than seconds. The agent is producing substantial work between your interventions. But the human is still in the room, reviewing every output, making the next call.
The Git-integrated agent — Claude Code, Codex, Cursor running an agentic task — is longer again. You set a task, walk away, come back to a pull request, review the diff. The loop might be measured in tens of minutes or hours. You are no longer reading every line as it is written. You are reading the result and deciding whether to accept it.
Specification-driven development extends the loop further still. You write the spec, you define the patterns, you set the platform conventions, and the system builds against them. The loop closes when the system delivers something against the spec. You might be hours or days away from the moment of generation.
And then, beyond that, sit the multi-agent orchestration approaches that have arrived in the last six months or so. Steve Yegge’s Gas Town, where a “Mayor” agent coordinates twenty or thirty Claude Code instances against a backlog. Geoff Huntley’s Ralph technique, where you run the same prompt in a loop until it converges on something that works. The Wasteland, Yegge’s federated network of Gas Towns. Here the loop is measured in hours or days at a time, and you are not reviewing the diffs at all in any meaningful sense. You are managing a fleet. You are, as one Gas Town reviewer put it, keeping your Tamagotchi alive.
What loop length actually controls
Once you see the spectrum this way, a few things fall into place that the abstraction-level framing obscures.
The length of the loop is the amount of autonomy you give the agent. They are the same variable, looked at from different angles. A tight loop means low autonomy: the agent does a small thing, you check it, the agent does the next small thing. A long loop means high autonomy: the agent does many things in sequence, perhaps in parallel, perhaps coordinating with other agents, before you see the result.
This isn’t a value judgement. There are domains and stages of work where high autonomy is exactly what you want, and others where it is the wrong choice. The question is not which loop length is best. The question is which loop length matches the problem in front of you.
When the problem is well-defined and the verification is cheap — running tests, checking outputs against a known-good answer, applying a static analyser — long loops work well. The agent can iterate against the verification, fail repeatedly, and converge. This is why Ralph works. This is why Gas Town can run twenty agents against a bug backlog. The verification is doing the work that human review would otherwise do.
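The mechanism is worth seeing concretely. Here is a minimal sketch of a Ralph-style loop in Python. The real technique is typically a Bash while-true loop re-feeding a prompt file to a CLI agent; the `run_agent` and `verify` callables below are hypothetical stand-ins for that agent invocation and for your test suite.

```python
def ralph_loop(run_agent, verify, max_iters=50):
    """Re-run the same prompt until cheap, mechanical verification passes.

    run_agent: fires one agent pass (in the real technique, something like
               invoking a CLI agent with the contents of a prompt file).
    verify:    the cheap check doing the work human review would otherwise
               do: a test suite, a static analyser, a known-good output.
    """
    for attempt in range(1, max_iters + 1):
        run_agent()
        if verify():
            return attempt  # converged: how many passes it took
    raise RuntimeError(f"no convergence within {max_iters} passes")


# Toy stand-in: an "agent" that fixes one failing test per pass.
backlog = {"failing": 3}

def fake_agent():
    backlog["failing"] = max(0, backlog["failing"] - 1)

def tests_pass():
    return backlog["failing"] == 0

print(ralph_loop(fake_agent, tests_pass))  # converges after 3 passes
```

The point the sketch makes is that nothing about the loop is clever. All the leverage lives in `verify`: when that check is cheap and trustworthy, the loop can run unattended; when it isn't, the loop has nothing to converge against.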
When the problem is poorly defined, when the verification is expensive, when “is this right?” requires human judgement that the agent cannot replicate — long loops fail badly. The agent runs for hours, produces something plausible, and you spend longer untangling what it did wrong than you would have spent doing the work yourself. The DoltHub reviewer who tried Gas Town on his own product spent sixty minutes, generated four pull requests, closed all four of them, and burned a hundred dollars in tokens for the privilege. The loop was too long for the work.
The token cost is worth dwelling on, because it shapes the calculus in a way the discourse mostly elides. Long-loop agents consume tokens at rates that surprise people the first time they see the bill. A Gas Town session that runs at ten times the cost of a normal Claude Code session is not unusual. Multi-agent orchestration multiplies this further — every agent in the fleet is burning tokens, often re-reading the same context, often doing speculative work that gets thrown away. The cheapest loops are the shortest ones. The most expensive loops are the longest ones. When the long loop produces good work, the cost is justified. When it doesn’t, you have paid for the privilege of generating something you’ll have to throw away. This isn’t a reason not to use long loops. It is a reason to use them deliberately, and to know what you’re paying for.
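To make the multiplication concrete, here is a back-of-envelope sketch. Every number in it is an illustrative assumption, not a measured figure; the structure of the arithmetic, not the figures, is the point.

```python
def session_cost(agents, millions_of_tokens_each, price_per_million):
    """Rough cost of one session: each agent in the fleet burns tokens
    independently, often re-reading the same context."""
    return agents * millions_of_tokens_each * price_per_million

# Hypothetical figures: one short-loop chat session vs a twenty-agent fleet.
chat_session = session_cost(agents=1, millions_of_tokens_each=0.2,
                            price_per_million=10)
fleet_session = session_cost(agents=20, millions_of_tokens_each=2.0,
                             price_per_million=10)

print(chat_session, fleet_session)  # 2.0 vs 400.0: two orders of magnitude
```

The asymmetry the sketch captures is the one above: when the expensive session delivers mergeable work, the spend was leverage; when its output gets closed unmerged, it was pure loss.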
When you are exploring a problem space — the product engineering case, the early prototype, the “I’m not sure if this is the right shape yet” moment — the right loop length is short. You want the human in the room at every decision point because the decisions are what the work is. This is why turn-by-turn chat with artefacts remains, for me, the right tool when I’m trying to figure out whether an idea is even worth pursuing.
When you know what you want and the question is how to get it — the building-against-a-known-pattern case — longer loops start to make sense. The decisions have already been made. The agent’s job is to execute against them. Spec-driven development is, at heart, a bet that you can encode enough of your decisions into the spec that the agent’s autonomous execution will produce the right thing.
Where I actually sit
This is probably worth disclosing, since I’m asking you to think about your own loop length. To be honest, I rarely look at a line of generated code anymore. But that doesn’t mean my loops are uniformly long. The loops in my work nest inside each other, and the lengths shift as the work progresses.
Take a recent example. I wanted a system to manage attendees for our conferences. I have used a lot of CRM and email tools over the years, and I’ve built my own systems that stitch them together, so I have a clear mental model of what I want such a thing to do. I know the platform and technologies I want it built on — Cloudflare, in our case — and we have an existing document that captures those choices, because we reuse the same patterns across most of the systems we build. So the starting move is a long loop. I broadly specify what the system needs to do, point at the document that describes our platform and conventions, and let the agent take a first shot at the whole thing.
The first shot is never the finished thing. It will get some of what I want right and some of it wrong, and a fair amount will not be there at all. So the next phase is shorter loops on a per-feature basis. Refine the bits that nearly work. Specify the bits that didn’t get built. And then, once the core system exists, additional features start to suggest themselves — things that only become visible once you can actually use the thing. Those get their own loops, often shorter still, sometimes down to the level of refining a single behaviour in a single feature.
So I’m running long loops to scaffold and short loops to refine. The loops nest. The work moves through them as it moves through the project. I’m not Ralph-looping and I’m not running Gas Town, but I’m also not in the chat reviewing every line as it lands. The shape of the loop changes with the shape of the work.
I’ve learned — through a fair amount of pain — that human language is imprecise, that my specifications are rarely as complete as I think they are, and that the cost of an agent confidently building the wrong thing for two hours is much higher than the cost of catching the wrong turn early. That said, the agents are surprisingly good at taking my imprecise specifications and getting very close to what I had in mind. More often than not, the first shot is recognisably the thing I was trying to describe, even when I described it badly. The bet on long loops is, in the end, a bet that you communicated precisely enough — and “enough” turns out to be a lower bar than I would have guessed a year ago. I do not yet trust myself, or the agents, enough to make that bet on the whole of a project. But I trust it enough to make the first move long, and to tighten from there.
The mistake of the linear story
The dominant story about AI-assisted development right now is a linear one: autocomplete is for laggards, chat is for the middle of the pack, agents-on-pull-requests is the current frontier, and orchestrated fleets are where the future lives. If you are still using your IDE the old way you are, in Steve Yegge’s phrase, nine to twelve months behind the curve.
There is something to this story, and it’s worth granting before pushing back. A lot of developers who are using only short-loop tools are using them because they haven’t yet moved up the ladder. They haven’t tried Claude Code. They haven’t built a working spec they trust enough to hand to an agent. They are in autocomplete because that’s where they got comfortable, not because the work calls for it. For these developers the story is broadly accurate. There is more value available to them at longer loops, and they should go and find it.
What the story gets wrong is the teleology — the assumption that everyone is, or should be, moving toward Gas Town, and that the only reason to be at a shorter loop is that you haven’t got there yet.
Short loops are sometimes a maturity issue. They are also, often, exactly the right tool for the work in front of you. If you’re adding a small nuance to an existing feature, autocomplete may be the right answer. If you’re trying to figure out whether a feature should exist at all, turn-by-turn chat may be the right answer. The short loop isn’t a mark of being behind. It’s a mark of having matched the loop to the work.
The two cases — short-because-immature and short-because-appropriate — look similar from the outside. They are not the same thing. Distinguishing them requires knowing what work you’re doing and why you chose the loop length you chose. The discourse mostly does not distinguish them, and so it ends up telling working developers they are behind the curve when in fact they are simply doing different work to the people writing the essays.
There is a Stage 7 developer somewhere managing twenty agents. There is also a developer fixing a regression in a four-thousand-line legacy file who needs autocomplete and nothing more, and who is not behind the curve in any meaningful sense. They are doing different work. They need different loops.
The hot takes about the death of the IDE, the rise of orchestrators, the obsolescence of code review — these are claims about loop length, and they are mostly true for some kinds of work and misleading for others. The error is not in the observation that long loops have arrived and are powerful for certain problems. The error is in concluding that everyone, on every problem, should be at the longest possible loop.
What this means for the next year
A few things follow from taking loop length seriously as the operative variable.
The first is that the question to ask of any new tool, any new methodology, any new claim about the future of development is: what loop length does this assume, and is that loop length appropriate for the work I’m doing? Most of the loud claims about where development is heading are claims about loop length in disguise. Once you can see them that way, they become easier to assess on their merits, rather than on the volume of the person making them.
The second is that the skill of choosing your loop length deliberately — knowing when to tighten it and when to extend it, knowing the cost of getting it wrong in either direction — is becoming a real part of the craft. It has not yet been named, so far as I've seen. But it is the thing that distinguishes practitioners who get value from these tools from those who flail with them.
The third is that the second-order questions I was poking at a few months ago — about programming languages, about file structure, about the IDE itself — are really questions about which loop lengths the tools we use today were designed to support. Our languages, our editors, our review processes were all built when the loop was the keystroke and the function. They will need to change as more of our work happens at longer loops. But not all of it will. The keystroke loop is not going anywhere. It is, for some kinds of work, still the right loop. The question is what we build alongside it.
I’ll come back to those second-order questions in a follow-up. For now, the move I’d suggest is a small one. The next time you reach for a tool, notice the loop length it implies. Notice whether that’s the loop length the work calls for. If it is, carry on. If it isn’t, change tools — not because you’re behind the curve, but because the loop you’re in isn’t the one you wanted.
References
People and ideas referenced
- Steve Yegge — long-time engineer (ex-Amazon, ex-Google, ex-Sourcegraph) and writer behind the “death of the IDE” thesis and Gas Town. The “nine to twelve months behind the curve” line comes from his conversation with Gene Kim, with whom he co-authored Vibe Coding. Gas Town announcement: “Welcome to Gas Town”. Successor project: “Welcome to Gas City”.
- Nathan Sobo — CEO and co-founder of Zed, co-creator of Atom. Provides the most considered counter-position to the IDE-is-dead claim: “The Death of the IDE?” debate at Zed.
- Geoff Huntley — creator of the Ralph Wiggum technique. Original explainer: “Ralph Wiggum as a software engineer”. Practical playbook: how-to-ralph-wiggum. Geoff is keynoting at AI Engineer with “Everything Is a Factory.”
- Shawn Wang (swyx) — founder of AI Engineer, editor of Latent Space, now at Cognition. His pushback on the IDE-is-dead framing — that people building atop Claude Code immediately rebuild a UI, file explorer, and so on around it — is in “Cognition: The Devin is in the Details”.
- Bret Taylor — referenced in earlier thinking on this for his observation that programming languages are optimised for human reading and writing, not machine generation.
- The DoltHub Gas Town reviewer — Tim Sehn’s hour-long, four-PR, hundred-dollar Gas Town test drive: “A Day in Gas Town”.
- The “Tamagotchi” framing for managing fleets of agents comes from Enterprise Vibe Code’s Gas Town review.
Concepts and tools
- Gas Town — Yegge’s multi-agent orchestrator coordinating twenty-plus Claude Code instances. Repo on GitHub. Architecture uses named agent roles (Mayor, Polecats, Refinery, Witness, Deacon, Dogs, Crew) and persistent state via Beads.
- Ralph / Ralph Wiggum technique — Huntley’s deterministic loop pattern: a Bash `while true` loop that repeatedly feeds an agent a prompt file until the work is done. There is now an official Anthropic Claude Code plugin implementing it as a Stop hook. A curated list of Ralph resources is at awesome-ralph.
- Wasteland — Yegge’s federated network linking multiple Gas Towns: “Welcome to the Wasteland”.
- Beads — Yegge’s coding agent memory system, which underpins Gas Town’s persistent work tracking.
- Yegge’s “Evolution of the Programmer” stages 1–8 — autocomplete-only through to building-your-own-orchestrator. From the Welcome to Gas Town post.
- Spec-driven development — referenced as the methodology bet that you can encode enough of your decisions into a spec for autonomous execution. Tessl is the most cited example. Spec-kit and similar tools have formalised the workflow.
Related Web Directions / AI Engineer program references
The framing in this piece will be sharpened by several talks at AI Engineer in Melbourne (June 3–4, 2026):
- Geoff Huntley, Everything Is a Factory — keynote tracing the journey from Ralph through to industrialised software factories.
- Nick Beaugeard, Spec-driven AI development — A Real World Perspective — what it actually takes to ship spec-driven systems in production.
- Annie Vella, Craft in the Time of Agents — the human cost of long-loop work, and the “middle loop” that’s exhausting senior engineers.
- Ben Taylor, Engineering without reading code — Stile Education’s journey from 2 to 100+ interactives, with engineering moved out of the build loop.
- Daniel Rodgers-Pryor, Fully Automated Luxury Gay Space Engineering — the most-agentic-end position, “you are the bottleneck.”
- Jason Cornwall (SEEK), Why AI coding tools might not make the slightest difference — the Theory of Constraints counter-argument.
- Krishna Kanth Mundada, The AI Tax and “legal” ways to minimise it — engages with the METR study and the felt-faster/measured-slower gap.
- Navin Keswani, Beat Burnout, Find Flourishing: The AI Edition — unpacks Yegge’s “Dracula Effect” and what it means for teams.
- Anannya Roy Chowdhury (AWS), How Many Agents Are Too Many? The Hidden Cost of Multi-Agent Systems — directly addresses the cost-of-long-loops point.
Full speaker line-up: webdirections.org/ai-engineer/#speakers