Year round learning for product, design and engineering professionals

The Structure of Engineering Revolutions

TL;DR: The resistance to AI-assisted software development among experienced software engineers isn’t random or capricious–it follows the pattern Thomas Kuhn identified in scientific revolutions more than sixty years ago. What we’re witnessing isn’t a tooling debate. It’s a paradigm shift, complete with anomaly denial, incommensurable worldviews, and paradigm defence mechanisms that have played out very similarly in every intellectual revolution Kuhn observed. Understanding this pattern won’t make the transition painless, but it might make it possible.

A photograph of the cover of The Structure of Scientific Revolutions by Thomas S. Kuhn. The book has a plain yellow cover with a simple black border and centered text. The title appears in large serif type, with the author’s name below, and a small Harvard University Press emblem at the bottom.

Note: The quotes throughout this essay are drawn from public posts, blog articles, and social media by experienced, well-respected software engineers. I’ve chosen not to name them, because this essay isn’t a debate with individuals. These are people I respect, whose expertise and good faith I don’t question. They happen to be representative of common lines of reasoning, and it’s the reasoning I’m interested in examining — not the individuals.

In 1962, Thomas Kuhn published The Structure of Scientific Revolutions, one of the most cited academic works of the twentieth century and the book that gave us the term “paradigm shift”. Kuhn’s central insight was deceptively simple: science doesn’t progress through the steady accumulation of knowledge. It progresses through paradigm shifts–periods where the fundamental assumptions of a field get replaced, not incrementally but wholesale.

These shifts are resisted not by fools, but by the most accomplished practitioners of the existing paradigm–precisely because their expertise is founded on the assumptions being displaced.

Max Planck, who knew something about revolutionary ideas in physics (he was foundational to quantum theory), put the dark corollary plainly: “A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it.” Or, more bluntly still: “science advances one funeral at a time”.

Kuhn and Planck come to mind as I watch the debate over AI-assisted software development. Because what’s playing out right now–in blog posts and social media threads and conference hallways, but above all at the keyboards of software engineers–isn’t a disagreement about tools. It’s a paradigm shift. And it’s following Kuhn’s observations with remarkable fidelity.

The Existing Paradigm

Let’s be clear about what’s being displaced, because it deserves respect before being critiqued.

The established paradigm of software engineering has been extraordinarily successful. Humans write code. They do it in integrated development environments, in files organised into projects, in languages optimised for human cognition and human collaboration. Quality means human-readable, human-maintainable, human-elegant code. An entire professional culture has been built on this foundation–career paths, hiring practices, status hierarchies, conference circuits, book deals, open source reputations. Not to mention methodologies, patterns, practices, technologies, and tool chains. And some incredibly valuable businesses.

This paradigm gave us the modern computing landscape–the internet, desktop and mobile computing, cloud infrastructure, and essentially every piece of software that runs the world. It has been refined over decades through genuine hard-won wisdom: version control, test-driven development, continuous integration, agile methodologies, code review practices. Programming paradigms like object orientation and functional programming. These aren’t arbitrary conventions. They’re the accumulated, empirical solutions to real problems, discovered through real experience.

The people who built their careers within this paradigm aren’t wrong to value this. They built remarkable things.

Anomalies

Kuhn observed that every paradigm eventually encounters anomalies–results that don’t fit the existing framework. At first, these are easily absorbed. Minor curiosities. Edge cases. Parlour tricks. Like the subtle precession of Mercury’s orbit, which Newtonian physics couldn’t explain–even though orbital anomalies, like that of Uranus, had been observed and successfully explained within the paradigm before.

When GitHub Copilot emerged, it was easy to categorise: “spicy autocomplete”. A productivity boost within the existing paradigm. Your IDE got smarter. You got a bit more productive. Nothing fundamental changed.

When developers started having conversations with ChatGPT and pasting code into their projects, it was clunky enough to dismiss. Copy, paste, debug, repeat. Clearly inferior to writing code yourself. An interesting toy, not a serious tool. We as developers were still very close to the code we wrote. We touched it, even if briefly, by copying and pasting it from our chat to our IDE.

But anomalies kept accumulating. The code frontier models produced got better. The tools got more integrated. Artifacts let you see and run complete applications without leaving the conversation. Claude Code and Codex began integrating directly into established Git workflows. Developers started shipping production software built primarily by AI systems. Companies discovered that adoption was happening from the bottom up–engineers expensing their own subscriptions, sometimes hiding their AI usage from employers who hadn’t caught up yet.

And then came the moment that, for software engineering, resembles what the double slit experiment was for classical physics.

When you fire single photons at a double slit, they produce an interference pattern–as if a photon were both a wave and a particle. Within the classical paradigm, this is not merely surprising. It is impossible. Light cannot be both a particle and a wave. And yet there’s the pattern on the screen. The apparatus isn’t broken. Classical theory is no longer sufficient.

For software engineering an equivalent moment arrived in the form of Geoff Huntley’s Ralph Wiggum technique: a loop that takes the output from a coding agent and feeds back the original prompt along with that output to the agent over and over again. It sits far outside the existing paradigm of software engineering. Eppur si muove.
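Huntley’s actual loop is reportedly little more than a few lines of shell piping a prompt file into a coding agent CLI. As a rough sketch of the structure only: the `ralph_loop`, `run_agent`, and `tests_pass` names below are mine, not Huntley’s, and the stubs exist purely to make the sketch runnable.

```python
# A minimal, hypothetical sketch of a Ralph Wiggum-style loop:
# the SAME prompt is fed to the agent on every iteration, and the
# only stopping condition is the test suite going green.
def ralph_loop(prompt, run_agent, tests_pass, max_iterations=100):
    for i in range(1, max_iterations + 1):
        run_agent(prompt)     # agent edits the repo in place, no review
        if tests_pass():      # brute-force check replaces human reasoning
            return i          # iterations it took to go green
    raise RuntimeError("test suite never passed")

# Stub agent and test suite so the sketch runs: "passes" after 3 edits.
state = {"edits": 0}
n = ralph_loop("close the next ticket",
               run_agent=lambda p: state.__setitem__("edits", state["edits"] + 1),
               tests_pass=lambda: state["edits"] >= 3)
print(n)  # → 3
```

The point isn’t the code, which is trivial by design. The point is that everything the existing paradigm treats as essential sits outside this loop.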

Consider what it tramples on. Human code review–treated as essentially sacred for decades–is simply absent. Deterministic human reasoning about correctness is replaced by brute-force iteration against a test suite. The developer as craftsperson, someone who reasons carefully about architecture and writes clean, expressive code, is replaced by a bash loop. Planning methodologies refined across the entire waterfall-to-agile spectrum are replaced by an agent that grabs the next ticket and goes. And the economics are deliberately provocative: roughly $10 an hour of compute versus around $150 an hour for a developer.

Within the existing paradigm of software engineering, this cannot produce working software. Every principle, every best practice, every hard-won lesson says so. Code review exists for a reason. Human reasoning about correctness exists for a reason. Planning exists for a reason.

And yet there’s the interference pattern on the screen. Tickets get closed. Tests pass. Software ships.

The reactions from many established software engineers are straight from Kuhn. People describe “rage smashing keyboards at the pure regressive behavior AI falls into.” Others warn of “patches, workarounds, hiding errors, abstracting abstractions for abstractions sake.” The old paradigm’s defenders don’t say “interesting, how does that work?” They say, in effect, there must be an error in the apparatus. Just as classical physicists confronted with the double slit didn’t investigate the anomaly–they denied it.

Huntley himself has described feeling physically sick about what he’d discovered. “but it made me want to vomit ’cause I could actually see where we’re going. I could see where we were going. Like I was building software in my sleep”, he told the LinearB podcast. That visceral response captures what it feels like to live through a paradigm shift from the inside. You can be intellectually convinced that the results are real and still be unsettled at a gut level, because every instinct formed within the old paradigm is screaming that this is wrong.

It may be that the Ralph Wiggum technique, like many early paradigm-breaking experiments, turns out to be a crude early form of something more refined. The double slit experiment didn’t immediately give us quantum computing. But it was the anomaly that the classical paradigm could not absorb. It demanded a new framework.

Dario Amodei predicted about a year ago that by the end of 2025, ninety percent of code would be written by AI systems. Even halfway through last year this looked like a potentially very foolish prediction. It turned out to be right.

At some point, the anomalies become too numerous and too significant to absorb within the existing paradigm. Kuhn called this moment crisis.

Crisis

You can tell when a paradigm is in crisis in part by the emotional temperature of its defenders. Disagreement within “normal science” (Kuhn’s term) is calm, technical, focused on specifics. Paradigm defence is urgent, sweeping, and often heated.

Look at the discourse around AI-assisted software development right now and you’ll find very experienced, accomplished engineers making claims that go well beyond measured technical critique. The assertion that “code is free” is compared to Ancient Aliens conspiracy theories. Advocates are dismissed as “LinkedIn AI bros” who have “ditched the concept of money.” A well-known, thoughtful trainer of development teams declares that AI “solves none of those problems. None. I swear.” Somebody uses an AI coding tool to attempt to build a C compiler–one of the most challenging problems in software engineering, a task that took teams of experts years–and when the tool doesn’t produce a world-class compiler in a matter of weeks, this is seized upon as evidence that agentic coding is fundamentally broken.

These aren’t calm assessments of limitations. They’re the responses of people defending a paradigm under threat. And Kuhn documented exactly these patterns in every major scientific revolution.

The Defence Mechanisms

Kuhn identified specific ways that practitioners of an existing paradigm respond to anomalies. Each one maps onto what we’re seeing in software engineering right now.

Explaining away anomalies. When evidence contradicts the paradigm, reinterpret it so that it fits. “LLMs hallucinate” becomes a permanent characterisation of the technology rather than a rapidly improving metric. “The code doesn’t match my style” becomes evidence that AI can’t write good code, rather than evidence that you haven’t worked with it in ways that help ensure that outcome.

Six months ago, these tools genuinely couldn’t do certain things. Today they can. But the evaluation of what they can and can’t do was locked in six months ago and hasn’t been updated–because within the old paradigm, six months shouldn’t matter. Technologies don’t change that quickly. Except these ones do.

I regularly talk to experienced engineers who tell me they tried AI coding tools and were unimpressed. When I ask when, the answer is typically something like “six months ago”. Sometimes longer. In a field where capabilities are improving at a monthly cadence (or more frequently), a six-month-old evaluation is essentially meaningless. But updating these priors (to use the currently popular Bayesian terminology) would mean reopening a settled question, and settled questions are what paradigms are made of.

Or I might ask, “How do you work with these technologies?” And they’ll describe the older chat-based workflow–generating code in a conversation, pasting it into an IDE, running it, pasting errors back–as opposed to using tools like Claude Code or Codex, with their significantly lower friction and faster feedback loops.

These are conversations I have had in recent weeks, and accounts I read not infrequently online.

Incommensurability. This is Kuhn’s deepest and most unsettling idea: people on different sides of a paradigm shift aren’t just disagreeing about facts. They’re operating with different definitions of what counts as evidence, what counts as quality, what counts as the thing they’re even talking about.

When a well-known engineer with decades of experience writes that “almost never” does AI-generated code look like “code I would be proud to have in my code base”, or that it’s “not my style and is naive or too imperative,” “not code that tells a story of what it is I was trying to solve”–those are the old paradigm’s quality criteria talking. Code as craft artefact. Quality as partly aesthetic.

This same engineer describes being “deeply, [in] principle opposed to automated agents”–an “immutable philosophical objection”–because they represent an “inversion of control” where “instead of me using the tool, the tool uses me.” That’s not a technical assessment. It’s a paradigm commitment.

Within the emerging software engineering paradigm, code is not an artefact of craft. It’s a means to an end–a functional artefact whose quality is measured by whether it works, whether it’s verifiable, whether it solves the problem. The process by which it was generated is irrelevant. Its elegance is not pertinent. This isn’t a lower standard. It’s a different standard. And the two are genuinely hard to compare, because they’re not measuring the same thing.

This is why debates about AI and software engineering so often feel like people talking past each other. Because we are. We’re arguing from within different paradigms, using the same words to mean different things.

Paradigm-preserving reframing. When an anomaly threatens the core of the paradigm, redefine the core so the anomaly becomes irrelevant. 

This is the engine behind perhaps the most common objection to agentic software systems: “writing code was never the bottleneck.”

It’s a sophisticated argument. The bottleneck isn’t writing code; rather, it’s understanding requirements, designing systems, testing, integration, deployment. AI just speeds up the part that wasn’t slow.

There’s a kernel of truth here–typing speed was never the bottleneck. But the argument conflates typing code with the entire cost of implementation. More importantly, it holds the entire development methodology fixed and changes only one variable. It assumes you pour faster code generation into an unchanged process, creating a pile-up downstream. That’s like arguing email would slow down business because the mailroom can’t sort that many letters.

The deeper error is failing to see that cheap implementation changes the methodology entirely. The reason we built elaborate discovery, research, and prototyping processes wasn’t because understanding problems is inherently slow. It was because implementing a candidate solution was so expensive and time consuming you couldn’t afford to be wrong.

When implementation cost drops by an order of magnitude, you don’t just build the same things faster–you explore the landscape of possibility. You take more shots on goal. You don’t “run 10 times faster in the wrong direction”. You run ten times faster in ten directions and see which lead somewhere. That’s not recklessness. It’s a fundamentally different methodology for product development. But it’s invisible from within the old paradigm, because the old paradigm assumes implementation cost is fixed, and high.

Special pleading. Applying standards to the new paradigm that were never applied within the old one. 

“Developers should regularly work without AI tools to keep their skills sharp”, as one well-known software engineer recently put it–a standard never applied to IDEs, compilers, garbage collectors, package managers, or any other tool that reduced engineers’ cognitive burden. “LLMs hallucinate”–a concern conveniently overlooked for Stack Overflow answers, which also frequently contain bugs, and for actual human developers, who, as it turns out, produce incorrect code with such regularity that we built entire methodologies and toolsets around catching their errors!

There’s a related version of this special pleading worth identifying: we consistently compare humans at their best to AI systems at their worst. The expert developer, writing careful, well-architected code on a good day, versus a perhaps out-of-date model given a terse prompt and no context. The more honest comparison–the average developer, under deadline pressure, on unfamiliar code, versus a frontier model given reasonable context and guidance–looks very different. But that comparison is threatening to the paradigm, so it rarely gets made.

Gatekeeping. When outsiders start confirming the anomalies, redefine who counts as a legitimate observer. 

“The AI commentary class are talking confidently about software engineering productivity despite having little experience building and maintaining software” (again, recently, from a well-known developer). 

This conveniently erases the reality that from the outset advocates for AI-based software generation have been working developers, and it excludes from the conversation precisely the people–product managers, designers, business leaders–who experience the consequences of software engineering productivity every day and whose perspectives are valuable because they’re not bound by the old paradigm’s assumptions.

But is it really the “AI commentary class” with “little experience building and maintaining software” “talking confidently about software engineering productivity”?

This is perhaps the weakest of all arguments in this debate (as ad hominem arguments tend to be). It’s hard to describe Simon Willison (co-creator of Django, co-founder of Lanyrd), Armin Ronacher (creator of Flask), Andrej Karpathy, Bret Taylor (early Google Maps, ex-CTO of Facebook…) or the numerous other deeply experienced software engineers who attest to the transformation of software engineering as “having little experience building and maintaining software”.

One could make the opposing case: that it is those critical of agentic coding systems who have little experience building software with the technology.

A related (and better) counterargument is that studies like METR’s 2025 Experienced Open-Source Developer Productivity study show that while developers’ perception of their productivity increases when they use these tools, their productivity, when objectively measured, in fact decreases.

But one of the study’s authors, Joel Becker, has been publicly rethinking these findings. In a recent interview on the Latent Space podcast, Becker acknowledged that METR has been attempting to redo the study but found the original design increasingly impossible to replicate–and that the reasons are themselves telling.

Developers now refuse to be randomised into an “AI disallowed” condition, and the sequential single-task workflow the study assumed has given way to concurrent agentic coding across multiple issues. Becker still stands by the finding that developers overestimate their AI-assisted speedup, but is careful to distinguish this from a claim that AI provides no productivity gain.

He describes even the most senior and sceptical developers committing entirely to agentic coding, and notes he “feels it in my own case.”

There is increasing empirical evidence for the case that these tools are leading to an extraordinary increase in developer productivity.

In a recent interview of Anthropic CEO Dario Amodei by Dwarkesh Patel, this particular exchange stands out.

Patel:

I’m sure you saw last year, there was a major study where they had experienced developers try to close pull requests in repositories that they were familiar with. Those developers reported an uplift. They reported that they felt more productive with the use of these models. But in fact, if you look at their output and how much was actually merged back in, there was a 20% downlift. They were less productive as a result of using these models.

So I’m trying to square the qualitative feeling that people feel with these models versus, 1) in a macro level, where is this renaissance of software? And then 2) when people do these independent evaluations, why are we not seeing the productivity benefits we would expect?

Amodei:

Within Anthropic, this is just really unambiguous. We’re under an incredible amount of commercial pressure and make it even harder for ourselves because we have all this safety stuff we do that I think we do more than other companies.

The pressure to survive economically while also keeping our values is just incredible. We’re trying to keep this 10x revenue curve going. There is zero time for bullshit. There is zero time for feeling like we’re productive when we’re not. These tools make us a lot more productive.

Now, you might argue “he would say that, wouldn’t he?” But it’s a very plausible argument, backed up by evidence like Anthropic’s development of their knowledge work harness Cowork in just 10 days, almost entirely using their own technologies.

Eppur Si Muove

When Galileo was forced to recant his position that the Earth moves around the Sun, he is said–likely apocryphally–to have muttered eppur si muove: “and yet it moves”. Whatever the theoretical objections, whatever the weight of authority and tradition, the empirical reality was unchanged.

Galileo is, of course, the paradigmatic example of Kuhn’s paradigm shifts. And the phrase keeps coming to mind as I watch this debate unfold.

These technologies work. They’re being used in production, at scale, by serious engineers at serious companies. The adoption is grassroots–developers dragging their organisations along, not the other way around. The capabilities are improving on a curve that has been consistent and dramatic. This is an empirical reality, and it remains stubbornly unchanged regardless of how many social media posts argue otherwise.

Three years ago, these criticisms of AI code generation were entirely valid. Those of us who were early adopters–and I was among them, along with folks like Simon Willison–knew this and said so openly. We weren’t boosters peddling hype. We were practitioners reporting honestly on what worked and what didn’t.

But the hypothesis we shared was that these technologies would improve dramatically. As they have.

The Sunk Cost Trap

Kuhn observed that defenders of the old paradigm rarely convert. Einstein, for all his genius and impact on modern physics, maintained of quantum mechanics that “God does not play dice with the universe”.

Planck, as we saw, put it bleakly: “science advances one funeral at a time”. But Kuhn was describing scientific revolutions that played out over decades. This transformation is playing out over years, perhaps months. Which means there’s a window of opportunity for people to make the transition within their careers, if they can get past the defence mechanisms–as many folks are.

The challenge is that paradigm defence is self-reinforcing. Once you’ve publicly and repeatedly argued that AI coding tools don’t work, that they have fundamental and inescapable flaws, that the people using them are naive or ignorant–every subsequent statement is an additional investment. Reversing this position feels like writing off everything invested to date. The sunk cost fallacy kicks in. The longer you wait, the harder it becomes.

But it can be done, and it has been. Armin Ronacher–creator of Flask and significant pieces of Python infrastructure–has been open about how his views on AI tools evolved through sustained engagement. His stance wasn’t always positive. The difference was he kept engaging, kept learning, and found the results ultimately undeniable. He’s not unlike many other very experienced developers, with deep expertise and established public reputations who managed to update their priors as the evidence changed.

These are existence proofs that the funeral isn’t the only way forward.

The Emotional Core

There’s something beneath all of this that deserves to be treated with genuine understanding rather than dismissal.

For many experienced software engineers, coding isn’t just a profession. It’s a vocation, a craft, an identity. When someone with thirty-five years of experience writes that they would have to “detach all emotion and passion from coding to accept AI agents writing code for them”–that’s not a technical argument. That’s grief. It’s the anticipated loss of something that has defined them for their entire adult life. This is something Annie Vella observed in her popular essay from March 2025, The Software Engineering Identity Crisis.

The irony cuts deep: for years, we’ve said that software engineering transcends mere coding. Requirements, design, testing, operations – these were all supposedly part of our craft. Yet the industry pushed us in the opposite direction. We handed these responsibilities to specialists – Product Owners, Architects, Quality Engineers, Platform Engineers – while we doubled down on our coding expertise. We became masters of our code, proud wielders of a modern magic.

And now, just as we’ve perfected this craft, AI is threatening to take it away from us.

Every paradigm defence mechanism in this essay is, at some level, a way of managing that grief. The dismissiveness, the adversarial benchmarks, the philosophical manifestos, the adherence to outdated workflows–these are ways of holding the world still long enough to not feel the ground moving beneath our feet.

Kuhn understood this. He wrote with genuine sympathy about the practitioners of the old paradigm. They weren’t villains or fools. They were people whose entire intellectual formation had taken place within a framework that was now being displaced. Their resistance wasn’t irrational–it was deeply human.

I think the same is true here. The engineers pushing back against AI-assisted development aren’t foolish. Many of them are among the most accomplished practitioners in our field. Their resistance comes from a place of genuine expertise and genuine care for their craft and hard-earned knowledge.

But I also believe that the response of constructing ever more elaborate intellectual frameworks in opposition to the reality of AI code generation technologies–why the tools can’t work, shouldn’t be used, or don’t matter–is the path most likely to lead to the outcome these engineers fear most. The paradigm is shifting. The evidence is in. The question is not whether the shift will happen, but whether the people who built remarkable things within the old paradigm can find their place in the new one.

Kuhn would say probably not. Planck would say wait for the funerals. But the compressed timeline of this particular revolution, and the example of the likes of Armin Ronacher, mean there’s still a choice available. The developers who are thriving aren’t the ones who felt no discomfort. They’re the ones who felt it and kept going. Who were willing to be beginners again in their own domain. Who updated their priors when the evidence changed. Who let go of the idea that the way they’d always worked was the only way that counted.

The paradigm is shifting. Eppur si muove. The question is whether we shift with it.
