Year round learning for product, design and engineering professionals

Not Everything Needs an LLM — Dave Hall at AI Engineer Melbourne 2026

There's a gravitational pull in AI right now: when you have a hammer called GPT-4, everything starts to look like a nail. A support ticket router? Obviously you need an LLM. A text classifier? LLM. A data extraction problem? LLM. It's understandable. Frontier models are impressive. They work across domains without fine-tuning. They're general-purpose.

But general-purpose is not the same as optimal. The cost of using a frontier model for every task in your system, whether measured in latency, in dollars, or in operational overhead, adds up quietly until one day you realize you're burning money and complexity for marginal gains.

The Zendesk Ticket Routing Problem

This is what Dave Hall encountered when tackling the problem of misrouted support tickets at his organization. The symptom was clear: tickets were going to the wrong teams. The naive solution was equally clear: throw a large language model at it. Use the LLM to read each ticket and route it intelligently. It would work. It would be reasonably accurate. It would also be expensive.

Instead, Hall took a step back and asked a different question: what is this system actually trying to do? It's trying to categorize tickets. It's trying to figure out which team should handle each one based on content and context. That's a routing problem, and routing problems had been studied for decades before large language models existed.

The insight was to separate concerns: use a fine-tuned BERT model for routing, trained on your organization's actual ticket history so it learns your specific categories and the patterns behind your historical misroutes. BERT is smaller, faster, cheaper, and after fine-tuning on your data, likely more accurate than a general-purpose model at this specific task.

But classification isn't all that's needed. Some tickets are actually complicated enough to benefit from an LLM's reasoning capability. So the system uses LLMs for what they're actually good at: priority classification and summarization. Read the ticket, understand its urgency, understand its essence. Then feed that summary to the classifier.
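The shape of that split can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: the summarizer fakes an LLM call, and the keyword lookup stands in for a fine-tuned classifier. It shows only the architecture, not any real implementation.

```python
from dataclasses import dataclass

@dataclass
class RoutedTicket:
    team: str
    priority: str
    summary: str

def summarize_and_prioritize(ticket_text: str) -> tuple[str, str]:
    # Stand-in for the LLM step: a real system would call a hosted
    # model to produce a summary and an urgency label.
    summary = ticket_text[:120]
    priority = "high" if "outage" in ticket_text.lower() else "normal"
    return summary, priority

def classify_team(summary: str) -> str:
    # Stand-in for the fine-tuned classifier (e.g. BERT): a real model
    # predicts the team from learned weights, not a keyword table.
    keywords = {"invoice": "billing", "login": "auth", "outage": "sre"}
    for word, team in keywords.items():
        if word in summary.lower():
            return team
    return "triage"

def route(ticket_text: str) -> RoutedTicket:
    # The expensive model does the reasoning (summary plus urgency);
    # the cheap classifier makes the high-volume routing decision.
    summary, priority = summarize_and_prioritize(ticket_text)
    return RoutedTicket(classify_team(summary), priority, summary)
```

Routing on the summary rather than the raw ticket keeps the classifier's input short and consistent, which is part of what makes the cheap model viable.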

Matching Problems to Tools

This architecture is elegant because it's honest about what each tool does. It's not using an LLM as a universal problem solver. It's using LLMs for reasoning tasks where their language understanding and generalization capabilities matter. It's using specialized models for pattern recognition on domain-specific data. It's using classical logic for what doesn't need learning at all.

The framework this suggests is simple but powerful: start with the problem statement. What are you actually trying to do? Once you understand that, the question becomes: what's the right tool for this particular problem?

Sometimes the right tool is an LLM. If you need to understand novel, unstructured text with high variability, and the domain is so broad that fine-tuning isn't practical, a frontier model might be your answer. But often the right tool is something else. It might be a fine-tuned smaller model. It might be a classical classifier. It might be a rule-based system with well-defined logic. It might be multiple tools in sequence, each doing one thing well.
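To make the "classical classifier" option concrete, here is a toy multinomial naive Bayes router in plain Python. This is a teaching sketch with invented training examples, not the approach from the talk; a production system would more likely use scikit-learn or a fine-tuned transformer.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesRouter:
    """Toy multinomial naive Bayes text classifier, trained on
    hypothetical historical (ticket, team) pairs."""

    def __init__(self):
        self.class_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, examples):
        for text, label in examples:
            self.class_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, count in self.class_counts.items():
            # log prior + log likelihoods with add-one smoothing
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in text.lower().split():
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Even this naive version captures the key property: once trained on your own labeled history, it runs in microseconds on a CPU with no API call at all.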

The cost implications are substantial. A fine-tuned BERT model on commodity hardware might cost a fraction of a cent per inference. A frontier model API call might cost cents. At scale—thousands or millions of inferences—that difference determines whether your system is viable. It also determines whether you can afford to experiment and iterate. If every change requires a new batch of expensive API calls, you innovate more slowly.
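A back-of-envelope calculation makes the gap tangible. The per-call prices below are placeholder assumptions chosen to match the rough orders of magnitude above (a fraction of a cent versus a few cents), not real provider pricing.

```python
# Placeholder assumptions, in dollars per inference:
SELF_HOSTED_COST = 0.0002   # fine-tuned small model on commodity hardware
FRONTIER_API_COST = 0.02    # frontier model API call

def monthly_cost(per_call: float, calls_per_month: int) -> float:
    return per_call * calls_per_month

calls = 1_000_000
small = monthly_cost(SELF_HOSTED_COST, calls)      # roughly $200/month
frontier = monthly_cost(FRONTIER_API_COST, calls)  # roughly $20,000/month
print(f"small: ${small:,.0f}  frontier: ${frontier:,.0f}  "
      f"ratio: {frontier / small:.0f}x")
```

Under these assumptions the difference is two orders of magnitude at a million tickets a month, which is exactly the scale at which routing systems live.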

But the implications go deeper than economics. Using the right tool for each part of your system makes the system more understandable. Each component has a clear purpose. You're not trying to convince an LLM to do something that's not in its wheelhouse. Debugging is easier. Performance is more predictable. The whole architecture is simpler to reason about.

Building the Right Stack

Dave Hall brings two decades of experience in automation, cloud, and DevOps to this kind of thinking. He built Gata Router as a practical instantiation of these ideas: a production system that's effective because it's been deliberate about which tools to use and why. It's open source, which means the thinking is visible and the architecture can be learned from.

The invitation here is to think differently about how you approach new problems. Don't start with "what AI tool should I use?" Start with "what am I actually trying to solve?" Sometimes the answer will be "an LLM." More often, it will be "a combination of tools, carefully chosen."

Dave Hall will be sharing this framework and the lessons learned from building Gata at AI Engineer Melbourne 2026, June 3-4.
