
The Problem with “Mathematically Proven” Claims About LLMs

How a recurring rhetorical move keeps proving the wrong thing

There is now a recognisable pattern in AI commentary. It runs roughly as follows. A paper appears on arXiv. It contains real mathematics — definitions, lemmas, a theorem, sometimes several. The theorem establishes that a particular formal object, under a particular set of assumptions, has a particular limitation. The paper is then circulated by a second author — a blogger, a LinkedIn poster, a journalist — under a headline of the form “Researchers mathematically prove AI cannot X.” The headline travels. The assumptions don’t.

I want to take three recent specimens, show that they share a structural pattern, and say something about why the pattern matters.

Specimen one: AI cannot self-improve

The proximate cause of this essay is a blog post titled “AI Cannot Self Improve and Math behind PROVES IT!”, summarising a recent arXiv preprint by Hector Zenil (King’s College London), “On the Limits of Self-Improving in Large Language Models: The Singularity Is Not Near Without Symbolic Model Synthesis”. The blog post’s framing is uncompromising. It opens by claiming that “a new arXiv paper formally proves that recursive self-improvement in LLMs is mathematically impossible — the mechanism everyone believed would lead to superintelligence is actually a one-way ticket to model collapse.” Later: “the very mechanism people proposed to transcend human limitations — training on AI-generated data to break free from the finite supply of human knowledge — is mathematically proven to destroy the model’s representation of reality. The escape route collapses into a trap.” And, more lyrically: “the universe doesn’t give you compound interest on noise.”

The actual paper is more careful than its summariser. Zenil models recursive self-training as a dynamical system on probability distributions, assumes a KL-divergence-based objective and a vanishing supply of fresh authentic data (formally, the proportion of exogenous signal $\alpha_t \to 0$), and proves that under those assumptions the system converges to a degraded fixed point. This is a formalisation of the model collapse phenomenon Shumailov et al. described empirically in their 2023 Nature paper.
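To make the mechanism concrete, here is a minimal numerical sketch (not Zenil's construction; closer in spirit to the Gaussian example in the model collapse literature): a distribution is repeatedly refit to its own samples, with a fraction $\alpha_t$ of genuinely fresh data mixed in at each generation. The Gaussian setting and the specific numbers are illustrative assumptions of mine, not the paper's.

import numpy as np

# Illustrative sketch only: refit a Gaussian to its own samples each generation,
# mixing in a fraction alpha_t of fresh draws from the true distribution.
rng = np.random.default_rng(0)
TRUE_MEAN, TRUE_STD = 0.0, 1.0   # the authentic data distribution
N = 100                          # samples refit per generation

def final_std(alpha, generations=1000):
    mean, std = TRUE_MEAN, TRUE_STD
    for t in range(generations):
        n_fresh = int(round(alpha(t) * N))
        fresh = rng.normal(TRUE_MEAN, TRUE_STD, n_fresh)   # exogenous signal
        synthetic = rng.normal(mean, std, N - n_fresh)     # the model's own output
        data = np.concatenate([fresh, synthetic])
        mean, std = data.mean(), data.std()                # refit on the mixture
    return std

print(final_std(lambda t: 1 / (t + 1) ** 2))  # alpha_t -> 0: the spread drifts toward 0
print(final_std(lambda t: 0.2))               # inf_t alpha_t > 0: the spread stays near 1

Run with vanishing fresh data, the fitted spread shrinks toward zero; hold the fresh fraction above a floor and it does not. That single switch is what the theorem's $\alpha_t \to 0$ condition is about.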

What the popularisation strips away is everything Zenil himself says about the scope of his result. Section 5 of the paper opens like this:

The results do not prove that all forms of recursive self-improvement collapse.

He goes on:

If $\inf_t \alpha_t > 0$, meaning the system receives persistent exogenous signal, then the contraction toward $P$ remains active. Systems operating under fixed axioms, externally defined objectives, or invariant verifiers (e.g. formally specified environments) do not satisfy the $\alpha_t \to 0$ condition.

And in the conclusion:

The impossibility result is conditional rather than universal. … Our results therefore do not rule out improvement in externally anchored systems; they rule out fully autonomous recursive density matching as a path to indefinite intelligence growth.

The proof says: if you train recursively on your own samples without sufficient fresh signal, under a KL objective, you collapse. The headline says: AI cannot self-improve. These are not the same statement. The gap between them is filled by an unexamined assumption: that “self-improvement” must mean naive autophagy. But that is not what self-improvement looks like in practice anywhere it is currently working. AlphaZero recursively self-improved through self-play because Go has a ground-truth winner. RLVR works because unit tests, proof checkers, and graders supply external signal. Distillation from stronger teacher models works. Verifier-filtered synthetic data works. The whole point of these regimes is that the loop is not closed — there is some external source of truth disciplining each iteration. The theorem about the closed loop is a theorem about a system nobody is building, and the paper itself says so.
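The difference between the regime the theorem covers and the regimes that work is easy to state as pseudocode. A hypothetical sketch, with generate, verify, and train_on as placeholder names rather than any real framework's API:

def closed_loop(model, steps):
    # The regime the theorem covers: no external signal ever enters the loop.
    for _ in range(steps):
        samples = model.generate()
        model.train_on(samples)                          # pure autophagy
    return model

def anchored_loop(model, verifier, steps):
    # The regime that is actually shipping: an external check (a unit test,
    # a proof checker, a game's win condition, a grader) filters what is
    # learned from, so the loop is never closed.
    for _ in range(steps):
        samples = model.generate()
        accepted = [s for s in samples if verifier(s)]   # exogenous ground truth
        model.train_on(accepted)
    return model

The verifier is the whole argument: it is the persistent exogenous signal that keeps $\inf_t \alpha_t > 0$.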

Specimen two: hallucination is inevitable

The pattern is older than this paper. In January 2024, Xu, Jain, and Kankanhalli published “Hallucination is Inevitable: An Innate Limitation of Large Language Models”. The argument is elegant. They define a “formal world” of computable functions. They define hallucination as follows:

Hallucination occurs whenever an LLM fails to exactly reproduce the output of a computable function.

They then invoke a diagonalisation argument from learning theory to show that no computably enumerable family of LLMs can learn every computable function, and conclude that any LLM must hallucinate on some inputs. The headline — hallucination is mathematically inevitable — was widely repeated.
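The diagonalisation is short enough to sketch. A toy rendering of the idea (my illustration, not the paper's notation): treat the candidate LLMs as a computably enumerable family of functions, and build a target function that disagrees with the i-th candidate on input i.

# Toy diagonalisation, illustrating the shape of the argument.
def candidate(i):
    # Stand-in for the i-th model in some computable enumeration of LLMs.
    return lambda x: (x * (i + 1)) % 7

def diagonal_target(x):
    # A perfectly computable function that, by construction,
    # disagrees with candidate x on input x.
    return candidate(x)(x) + 1

# Every candidate is therefore wrong somewhere: candidate(i) fails on input i,
# and under the paper's definition that failure counts as a hallucination.
assert all(diagonal_target(i) != candidate(i)(i) for i in range(1000))

The argument is airtight, and also extremely permissive: any fixed enumeration of finite machines loses to its own diagonal.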

What gets buried is what “hallucination” had to be defined as for the proof to go through. Under the paper’s definition, every finite system “hallucinates,” because no finite system can compute every computable function. By that standard, your pocket calculator hallucinates the Ackermann function and you hallucinate the prime factorisation of any fifteen-digit number. The proof says less than the headline implies: any general problem-solver will be wrong about something, somewhere.

And once again, the paper itself is more careful than its reception. Xu et al. explicitly note the get-out:

Knowledge-Enhanced LLMs … receive extra information about the ground truth function $f$ other than via training samples. Therefore, Theorem 3 is inapplicable herein.

The paper’s section on practical implications begins with “All LLMs trained only with input-output pairs will hallucinate when used as general problem solvers” (emphasis mine). The qualifier vanishes from the popularisations. The whole modern stack — retrieval, tool use, code execution, formal verifiers, knowledge bases — is, by the paper’s own admission, outside the theorem’s scope.

A 2025 follow-up by Suzuki et al. makes the point neatly in its subtitle: “Hallucinations are inevitable but can be made statistically negligible. The ‘innate’ inevitability of hallucinations cannot explain practical LLM issues.” The mathematical inevitability and the practical incidence are different problems. The former tells us almost nothing about the latter.

Specimen three: the math ceiling

The same shape, again, in 2025 and into 2026. Varin Sikka and Vishal Sikka’s paper “Hallucination Stations: On Some Basic Limitations of Transformer-Based Language Models” was widely reported as proving that LLM agents have a fundamental “math ceiling.” The core theorem is straightforward:

Given a prompt of length $N$, which includes a computational task within it of complexity $O(N^3)$ or higher, where $d < N$, an LLM, or an LLM-based agent, will unavoidably hallucinate in its response.

The proof is one paragraph: cite the Hartmanis-Stearns time hierarchy theorem; observe that an LLM’s per-token computation is $O(N^2 \cdot d)$; conclude that tasks requiring asymptotically more time cannot be carried out correctly. It’s true. It is also, by construction, a result about the LLM’s core forward pass. The Sikkas are explicit about this in their discussion:

While our work is about the limitations of individual LLMs, multiple LLMs working together can obviously achieve higher abilities. … various approaches are being developed, from composite systems to augmenting or constraining LLMs with rigorous approaches.

In other words: an unaided transformer of fixed dimensions, evaluated on tasks whose complexity exceeds its forward-pass complexity, will fail. Yes. And this tells us almost nothing about what an LLM-with-tools can do. Agents do not run inside the assumptions of the proof. They use scratchpads. They call solvers. They invoke MathJS and Lean and Wolfram. They write Python and run it. The theorem says transformers-without-tools cannot do TSP-in-a-fixed-context. The actual systems being deployed are transformers-with-tools, and the relevant empirical question — how good can the composite get? — is not addressed by the theorem at all.
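The arithmetic behind the theorem is worth doing once, because it also shows why tool use changes the question. A back-of-the-envelope comparison with illustrative numbers (the embedding dimension and the prompt-length scaling below are assumptions of mine, not the paper's):

import math

D = 4096                         # illustrative embedding dimension
def forward_pass_ops(N):
    return N ** 2 * D            # the O(N^2 * d) per-token budget cited above

def brute_force_tsp_ops(n_cities):
    return math.factorial(n_cities - 1)   # tours to enumerate, roughly

for n in (10, 15, 20):
    N = 50 * n                   # assume the prompt grows linearly with the instance
    print(n, forward_pass_ops(N), brute_force_tsp_ops(n))
# The forward-pass budget grows polynomially; the task grows factorially.
# Past a modest instance size the pass cannot contain the computation.
# An agent that writes a solver and runs it has moved the work outside
# the forward pass, which is precisely where the theorem stops applying.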

The paper’s authors are clear-eyed about this. Tudor Achim, quoted in the WebProNews coverage, takes the productive view: “I think hallucinations are intrinsic to LLMs and also necessary for going beyond human intelligence.” His company’s bet is on what they call “mathematical superintelligence” — verified niches where formal checking provides external signal. That’s exactly the right response to the proof. It is not the response the headlines pick up.

The shape of the move

Three papers, three different negative claims, the same structural pattern. It works like this:

1. Take a maximalist version of the claim being attacked. RSI must mean closed-loop autophagy. Hallucination must mean failure to compute any computable function. Reasoning must mean executing tasks unaided in a fixed forward-pass budget. In each case, the strongest, most cartoonish reading is selected, because the strongest reading is what the math will handle. Zenil makes this almost explicit: he models “the autonomy regime” specifically because that’s the regime in which the theorem applies.

2. Prove a theorem about that reading. The theorems are usually fine. The mathematical content is real. KL flows do collapse under vanishing exogenous signal. Computably enumerable families cannot exhaust the computable functions. Fixed-precision transformers cannot solve arbitrarily large computational problems in $O(N^2 \cdot d)$. None of this is in dispute.

3. Drop the assumptions in the popularisation. The conditional becomes unconditional. “Under these assumptions” becomes “in principle.” “For this class of systems” becomes “for AI.” The author’s own qualifications — the results do not prove that all forms of recursive self-improvement collapse; Theorem 3 is inapplicable to systems with external knowledge; multiple LLMs working together can obviously achieve higher abilities — disappear in transit. The reader, encountering the headline, has no easy way to recover the lost qualifications. Qualifications are exactly what doesn’t survive a headline.

4. Garnish with vibes. The universe doesn’t give you compound interest on noise. The escape route collapses into a trap. It’s like trying to bootstrap yourself off the ground by pulling your own shoelaces. The aesthetics borrow the gravity of mathematics — the QED, the elegance, the inevitability — and graft them onto claims the mathematics did not establish. The form does the work the content cannot.

The result is a kind of vibes-laundering machine wearing a lab coat. A narrow, conditional technical result is converted, by stages, into a metaphysical conclusion. And because the source paper is real, and the math is real, the conclusion borrows credibility it has not earned.

Why it matters

It would be churlish to object to this if the technical results themselves were not interesting. They are. Model collapse is a real phenomenon and worth understanding. Computability bounds are worth knowing. The complexity ceiling on unaided transformers tells us something genuine about where to put effort. The papers, on the whole, are fine. It’s the inferential layer above the papers — the popularisation, the headline, the LinkedIn post — where the damage happens.

What gets lost is the actual operating principle of where AI progress has come from in the last three years, which is precisely not closed-loop magic. It is the patient construction of external discipline: graders, checkers, tools, environments with ground truth, humans in the loop, formal verifiers. Where verification is cheap, recursive improvement is not speculative — it is shipping. Where verification is hard, hallucination is not theoretically inevitable in any meaningful sense — it is empirically common, and the work is to find better verifiers. The Bitter Lesson, applied to applied AI, says roughly this: stop trying to engineer the limits in; build the loop and let the loop teach you. These “mathematically proven” results, read carelessly, tell people the loop cannot work. Read carefully — and at least two of the three I’ve discussed are explicit about this — they tell us what shape the loop has to have.

There is also a class element to the rhetorical move that is worth naming. Mathematically proven is a phrase with enormous social power. It signals that the question is settled, that disagreement is not just wrong but innumerate, that the priesthood has spoken. To unpack the assumptions requires either mathematical literacy or a willingness to be told you don’t understand the math. The asymmetry favours the headline. This is why the pattern keeps repeating — it pays in attention, and the cost of correction falls on someone else.

The honest version of each of these papers is, in fact, the version their authors wrote. Zenil’s conclusion: recursive self-improvement framed as progressively self-contained generative retraining cannot yield unbounded growth under standard distributional learning dynamics. Xu et al.’s caveat: all LLMs trained only with input-output pairs will hallucinate when used as general problem solvers. The Sikkas’: multiple LLMs working together can obviously achieve higher abilities. Each of these is a careful, conditional, useful result. None of them is “AI cannot X.”

A modest proposal

I am not arguing with the math. The math is fine. I am arguing with a habit of inference: the move from theorem about idealised object X to fact about real-world object Y, when Y is not X and the popularisation pretends it is.

The next time you see a piece claiming that mathematics has proven some negative about LLMs, the question to ask is not whether the proof is correct. It almost certainly is. The questions are: what exactly was modelled? What assumptions did the proof require? Do the systems we actually run satisfy those assumptions? And — usually the most damning — do the paper’s authors themselves disclaim the strong reading? In every example I have looked at, the answer to the third question is no, and the answer to the fourth is yes. The gap between the modelled object and the deployed system is where all the interesting work is happening.

Eppur si muove. The systems keep getting better. The theorems keep arriving to explain why they cannot. Both can be true. They are usually about different things.
