Orbital Lasers vs For Loops: Economically Matching Models to Tasks — Stephen Sennett at AI Engineer Melbourne 2026
There's a persistent mythology in AI adoption: bigger is better. If GPT-4 is powerful, use GPT-4. If a model can handle complex reasoning, use it for everything. The assumption is that you're being conservative by using the most capable model available. You're minimising risk, guaranteeing quality, playing it safe.
In reality, you're usually wasting money.
The economics of model selection are almost never discussed. Teams reach for the biggest, most capable model because they're familiar with it, because it has a recognisable brand, because they're uncertain about smaller alternatives. But the actual economics are simple: if a task can be solved by a smaller, cheaper model just as well as by a larger one, the smaller model is the correct choice.
This matters more than it immediately appears. The difference in cost between models can be an order of magnitude or more. GPT-4 is more expensive than GPT-3.5, which is more expensive than a fine-tuned smaller model, which might be orders of magnitude cheaper than a frontier model. If you're running millions of inferences, that difference compounds rapidly.
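The compounding is easy to see with back-of-the-envelope arithmetic. The sketch below compares monthly spend for two models at volume; the per-token prices, token counts, and call volumes are hypothetical placeholders, not real rates.

```python
# Illustrative cost comparison between a frontier model and a smaller one.
# All prices and volumes below are hypothetical, chosen only to show scale.

def monthly_cost(price_per_1k_tokens: float, tokens_per_call: int,
                 calls_per_month: int) -> float:
    """Total monthly spend for a given per-token price and usage volume."""
    return price_per_1k_tokens * (tokens_per_call / 1000) * calls_per_month

CALLS = 5_000_000   # five million inferences per month
TOKENS = 800        # average prompt + completion tokens per call

frontier = monthly_cost(0.03, TOKENS, CALLS)   # hypothetical $0.03 / 1K tokens
small = monthly_cost(0.002, TOKENS, CALLS)     # hypothetical $0.002 / 1K tokens

print(f"frontier: ${frontier:,.0f}/month")
print(f"small:    ${small:,.0f}/month")
print(f"ratio:    {frontier / small:.0f}x")
```

At these assumed prices the gap is roughly 15x per month on identical traffic, which is the kind of difference that survives any amount of engineering effort spent elsewhere.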
Stephen Sennett's framework for matching models to tasks starts with a heretical question: what's the minimum capability required to solve this problem well? Not "what's the maximum capability available?" but "what's the minimum that will work?"
The answer requires honest evaluation. Some tasks genuinely need the reasoning capabilities of a frontier model. Complex multi-step reasoning, novel problem-solving, understanding nuance and context — these are tasks where more capable models do things cheaper models can't. But many tasks don't. Classification, extraction, simple summarisation, templated generation — these tasks are often solved adequately by smaller models at a fraction of the cost.
The key word is "adequately." Not perfectly. Adequately. The question isn't "is GPT-4 better at this task than GPT-3.5?" (it probably is). The question is "is GPT-3.5 good enough for this specific use case, considering the error rate I can tolerate, the cost I need to hit, and the scope of deployment?"
This reframing changes the decision-making process. You need to know your tolerance for errors. If you're using the model for customer-facing applications, the tolerance might be very low. If you're using it for internal research or creative exploration, it might be high. You need to measure that tolerance in concrete terms: what percentage of outputs can be incorrect before the system becomes unusable?
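Once tolerance is expressed as a number, "good enough at the lowest cost" becomes a mechanical selection rule. Here is a minimal sketch of that rule; the model names, error rates, and per-call costs are hypothetical, and in practice would come from your own evaluations.

```python
# Sketch: pick the cheapest model whose measured error rate is tolerable.
# Candidate names, error rates, and costs are hypothetical placeholders.

CANDIDATES = [
    {"name": "small-finetuned", "error_rate": 0.04, "cost_per_call": 0.0002},
    {"name": "mid-tier",        "error_rate": 0.02, "cost_per_call": 0.002},
    {"name": "frontier",        "error_rate": 0.01, "cost_per_call": 0.02},
]

def cheapest_adequate(candidates: list[dict], max_error_rate: float) -> dict:
    """Return the cheapest candidate that meets the error tolerance."""
    adequate = [c for c in candidates if c["error_rate"] <= max_error_rate]
    if not adequate:
        raise ValueError("no candidate meets the error tolerance")
    return min(adequate, key=lambda c: c["cost_per_call"])

# An internal tool tolerating 5% errors gets the cheapest model;
# a customer-facing flow with a 1% tolerance gets the frontier model.
print(cheapest_adequate(CANDIDATES, 0.05)["name"])  # small-finetuned
print(cheapest_adequate(CANDIDATES, 0.01)["name"])  # frontier
```

The interesting part is not the code but the input: the rule only works if you have actually measured each candidate's error rate on your own task, which is the honest evaluation the framework demands.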
You also need to understand what "correct" means for your specific task. A general-purpose large model trained on broad data might be worse at your specific task than a smaller model fine-tuned on your data, even if the large model is nominally more capable. Domain-specific knowledge often trumps general capability.
The economics become even more interesting when you consider the direction of model development. Smaller models are improving faster than larger ones in many domains. The gap between a flagship model and a smaller alternative this year might be significant, but next year it might be negligible. Betting on frontier models locks you into ongoing high costs; investing in alternatives gives you options.
There's also a hidden cost to large models: latency and infrastructure. Frontier models like GPT-4 respond more slowly per call than smaller models, and smaller models can often be self-hosted on modest hardware rather than accessed through a metered API. For applications where latency matters, such as real-time customer interactions or high-throughput batch processing, the economic advantage of smaller models compounds.
This doesn't mean never using frontier models. It means making the decision consciously, based on actual requirements and actual economics, not hype or habit. It means running evaluations that tell you whether the extra cost is delivering actual value for your specific use case.
It also means building systems that can upgrade and downgrade models easily. The optimal choice today might not be optimal in six months when new models are available, prices change, or your use case evolves. If model selection is architecture-baked-in, you're stuck. If it's a parameter you can change, you have options.
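One lightweight way to keep model choice a parameter rather than an architectural commitment is a task-to-model routing table. The task names and model identifiers below are hypothetical placeholders; the point is the shape, not the specific entries.

```python
# Sketch: model choice as configuration, not architecture.
# Task names and model identifiers are hypothetical placeholders.

MODEL_FOR_TASK = {
    "classification": "small-finetuned-v2",
    "extraction": "small-finetuned-v2",
    "complex_reasoning": "frontier-model",
}

def model_for(task: str, default: str = "frontier-model") -> str:
    """Look up the configured model for a task, falling back to a default."""
    return MODEL_FOR_TASK.get(task, default)

print(model_for("extraction"))  # small-finetuned-v2
print(model_for("new_task"))    # frontier-model (fallback)
```

With routing isolated like this, upgrading or downgrading a task is a one-line configuration change, and the mapping can live in a config file or feature-flag system so it changes without a redeploy when new models or new prices arrive.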
The meta-lesson is important: AI adoption decisions should be made with economic rigour, not technological aspiration. The most advanced model isn't the best model for most tasks. The best model is the one that solves the problem well enough at a cost that makes economic sense.
Stephen Sennett, AWS Community Hero and Lead Consultant at V2 AI, is presenting this talk at AI Engineer Melbourne 2026 on June 3-4.
