AI After an Apocalypse
There's an unexamined assumption that has quietly become load-bearing infrastructure for entire categories of software: the internet is always on. This assumption used to matter mostly to mobile apps and remote workers. Now it matters to AI. An unreliable connection doesn't just interrupt your browsing. It interrupts your development workflow. You can't write code, can't talk to LLMs, can't iterate. It's a different kind of outage.
The apocalypse Simon Knox is talking about isn't civilizational. It's infrastructural. It's the cloud outages that are becoming more frequent, the flaky connections that kill your local LLM servers, the moments when your deployment environment goes dark. These events are happening, and our systems are built on the assumption they won't.
The Brittle Infrastructure Problem
For years, the pattern was simple: your application runs in the cloud, the cloud fails occasionally, your application goes down. That was acceptable because it was rare and everyone understood the tradeoff. But the introduction of AI into the development loop changes the equation. If your development environment depends on cloud APIs, and those APIs go down, you don't just lose deploy capability. You lose the ability to work at all.
This is especially acute for LLM-dependent workflows. You write code by talking to Claude or GPT-4. You train models in the cloud. You run inference on remote endpoints. Stack these dependencies on top of the existing infrastructure fragility and the failure surface explodes. Every layer can fail in ways that ripple upward.
The naive solution is to throw more reliability money at the problem: better SLAs, more redundancy, failover regions. But that spending scales with complexity, and no SLA gets you to 100%. There's another approach: build your systems to degrade gracefully when connections fail, and to function meaningfully on less powerful models when the frontier ones aren't available.
Fault-Tolerant AI Stacks
What does a fault-tolerant AI system look like? First, it means having working patterns for when your LLM API is unavailable. Can you fall back to a lighter local model? Can you queue work for later? Can the system do something useful even if it can't do the optimal thing?
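As a concrete sketch, here's what a fallback chain might look like in Python: try the frontier API, fall back to a local model, and queue the work if neither is reachable. The endpoint URLs, model server, and response shapes are placeholders I've assumed, not any specific vendor's API.

```python
# Minimal fallback chain: frontier API -> local model -> deferred queue.
# All endpoints and response shapes below are illustrative assumptions.
import queue
import requests

deferred = queue.Queue()  # work we couldn't serve now, replayed later

def ask_frontier(prompt: str) -> str:
    # Stand-in for a hosted LLM call; raises on network/HTTP failure.
    resp = requests.post(
        "https://api.example.com/v1/chat", json={"prompt": prompt}, timeout=10
    )
    resp.raise_for_status()
    return resp.json()["text"]

def ask_local(prompt: str) -> str:
    # Stand-in for a local model server listening on localhost.
    resp = requests.post(
        "http://localhost:8080/generate", json={"prompt": prompt}, timeout=30
    )
    resp.raise_for_status()
    return resp.json()["text"]

def ask(prompt: str) -> str | None:
    for backend in (ask_frontier, ask_local):
        try:
            return backend(prompt)
        except requests.RequestException:
            continue  # this tier is unreachable; try the next one
    deferred.put(prompt)  # nothing available: queue instead of failing hard
    return None
```

The shape of the chain matters more than the specifics: each tier is allowed to fail, and the last resort is deferral, not an exception.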
Second, it means being intentional about where you use expensive models and where you use smaller ones. There's a deep assumption in the industry that bigger is always better, that you should use GPT-4 for everything. That's not just wrong for cost reasons. It's wrong for resilience reasons. A diverse stack that includes smaller, local models gives you fallback paths when the internet fails or the expensive API becomes unreachable.
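One way to make those fallback paths explicit is a per-task tier list, ordered from preferred to last resort. The task names and model names below are illustrative assumptions, not recommendations.

```python
# Ordered fallback tiers per task, preferred model first.
# Task and model names are placeholders for illustration.
MODEL_TIERS = {
    "code_review":    ["frontier-large", "local-13b", "local-3b"],
    "summarization":  ["local-13b", "local-3b"],  # never needs the big model
    "classification": ["local-3b"],               # cheap and fully offline
}

def models_for(task: str) -> list[str]:
    # Unknown tasks get the small local model: a safe offline default.
    return MODEL_TIERS.get(task, ["local-3b"])
```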
Third, it means rethinking the cost model. LLM APIs can get expensive very quickly, especially if you're not careful about which models you use for which tasks. A system that has to call GPT-4 for everything is not just vulnerable to outages. It's vulnerable to bill shock. It's vulnerable to rate limiting. A system that understands what each model is good at, and reserves the expensive ones for where they actually add value, is cheaper, faster, and more resilient.
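To make that concrete, here's a minimal sketch of cheapest-adequate model selection: pick the least expensive model whose capability tier meets the task's requirement. The tiers and per-token prices are invented for illustration, not real pricing.

```python
# Cheapest-adequate dispatch under assumed (illustrative) pricing.
MODELS = [
    # (name, capability tier, assumed $ per 1K tokens)
    ("local-3b",       1, 0.00),
    ("local-13b",      2, 0.00),
    ("frontier-large", 3, 0.03),
]

def pick_model(required_tier: int) -> str:
    candidates = [m for m in MODELS if m[1] >= required_tier]
    return min(candidates, key=lambda m: m[2])[0]

assert pick_model(1) == "local-3b"        # simple task stays local and free
assert pick_model(3) == "frontier-large"  # only hard tasks pay frontier rates
```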
Building What Actually Works
This isn't theoretical. Simon has encountered these problems directly, building and maintaining systems that depend on both cloud services and local compute. The tension between wanting the power of frontier models and needing the robustness of systems that can survive without them is very real.
The practical framework that emerges from lived experience looks like this: identify your critical paths and your enhancement paths. Critical paths need to work when the internet is down and the expensive models are unavailable. Use smaller models, local models, or algorithmic solutions there. Enhancement paths can use frontier models when available, falling back gracefully when they're not. Make every outage a chance to learn what you actually need versus what's nice to have.
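In code, the split can be as simple as wrapping each enhancement path with a fallback that is guaranteed to work offline. This is a minimal sketch with hypothetical function names; the ConnectionError stands in for an outage.

```python
# Enhancement paths get a decorator that guarantees an offline fallback,
# so the critical path never blocks on the network. Names are illustrative.
import functools

def with_fallback(fallback):
    """Run the decorated (enhancement) function; on any failure, use fallback."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception:
                return fallback(*args, **kwargs)
        return wrapper
    return decorate

def keyword_summary(text: str) -> str:
    # Critical-path fallback: crude but works offline, no model required.
    return " ".join(text.split()[:30])

@with_fallback(keyword_summary)
def llm_summary(text: str) -> str:
    raise ConnectionError("frontier API unreachable")  # simulate an outage

print(llm_summary("a very long document that still gets a rough summary"))
```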
This mindset extends to cost as well. If you're making decisions about which model to use based on "the best one available," you'll spend a lot of money optimizing the wrong thing. If you're making decisions based on "the right tool for this specific job," you'll find opportunities to use lighter models, to batch requests, to cache outputs, to do less computation.
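Caching is the lowest-hanging fruit there. Here's a sketch of a prompt-keyed disk cache, assuming repeat prompts are common enough to be worth storing and that a cached answer is acceptable for the task; the hashing scheme and cache location are my choices, not a standard.

```python
# Minimal prompt cache: identical prompts never pay for a second model call.
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path(".llm_cache")
CACHE_DIR.mkdir(exist_ok=True)

def cached_call(prompt: str, model_fn):
    key = hashlib.sha256(prompt.encode()).hexdigest()
    path = CACHE_DIR / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text())["response"]  # cache hit: zero cost
    response = model_fn(prompt)  # cache miss: pay once, remember forever
    path.write_text(json.dumps({"prompt": prompt, "response": response}))
    return response
```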
Simon Knox, a computer programmer and computing enthusiast based in Melbourne, believes in making things as simple and as silly as possible—but no simpler. He'll be sharing practical patterns for building AI systems that work when things break at AI Engineer Melbourne 2026, June 3-4.
