Perspective

The Illusion of AGI, or What Language Models Can Do Without Thought

Eryk Salvaggio / Feb 9, 2026

Manchester, United Kingdom: Alan Turing memorial and statue.

A recent commentary in Nature declares that artificial general intelligence (AGI) has arrived in machines. By applying the same standard used to identify general intelligence in humans, rather than demanding “perfection” or even “superintelligence,” the authors suggest that large language models (LLMs) show sufficient breadth and depth across many domains, and therefore qualify as AGI. It is the latest claim that AGI is already here, if we squint.

The claim derives from a theory of mind, functionalism, which has influenced thinking about AI since the days of Alan Turing. It assumes that if a machine behaves consistently as if it were intelligent, it is intelligent. Things have changed since Turing’s time, and the AI industry has turned philosophical inquiry into a design brief. Instead of looking for signals of thought, it uses language to forge the evidence.

The authors of the Nature article would likely agree that human-like thought does not exist within a language model. An LLM is a dense set of numerical rules optimized to shift words to their most plausible neighbors. The problem with claiming these systems have human-level general intelligence is not that they are designed; Turing’s machine was built to trick us, too. But an LLM is a system designed by engineers for the explicit task of deceiving functionalists. It is circular reasoning to use the same test both to build a system and to evaluate it.

Given the ambiguity surrounding definitions of thought, this is not a useful distinction to draw. We have a system capable of producing language in remarkable and deceptive ways. It is not simple stubbornness to insist it is not “intelligent,” much less a form of “general” intelligence. My goal is to find useful categories that make sense of this inverted relationship between intelligence and its indicator lights. A faith in humanlike thinking machines frames significant psychological, social, political and financial investments in this technology. This illusion of thought demands an intervention. We should focus instead on clear descriptions of what machines do.

AGI and the 'functionalist' trap

Defining intelligence is difficult. I might describe it as an inner state that makes discernment possible: an impulse to produce and resolve the doubt necessary to weigh evidence, experience, and environmental cues in response to a changing world. You may describe it differently.

The authors of the Nature article emphasize that we cannot know the inner experience of being an LLM, just as we cannot know the inner experience of another human. In the functionalist view, gauging intelligence means merely collecting evidence that suggests intelligence, rather than ascribing self-knowledge to a system. I understand this, but do not find it useful. The authors write:

“[G]eneral intelligence is about having sufficient breadth and depth of cognitive abilities, with ‘sufficient’ anchored by paradigm cases. Breadth means abilities across multiple domains — mathematics, language, science, practical reasoning, creative tasks — in contrast to ‘narrow’ intelligences, such as a calculator or a chess-playing program. Depth means strong performance within those domains, not merely superficial engagement.”

Assessing “sufficient breadth and depth of cognitive abilities” requires observation. How do we evaluate whether an LLM transcends “superficial engagement”? With LLMs, we have no concrete way of knowing. The authors rely on paradigm cases: a consensus that LLMs can do certain things. I am a critic of AI; LLMs certainly do things. But “intelligence” risks ascribing a humanlike process to results achieved by other means.

The authors invoke the Turing test, which would suggest that if a machine produces communication that one human mistakes for another human’s, then the machine could be classified as intelligent. It would be more precise to say that when the machine creates this signal it behaves as if it were intelligent: it displays behavior associated with thought. We are fooled not just by LLMs but by far simpler image diffusion models every day. An LLM can pass the bar exam, so we might say it knows the law. We should say that it behaves as if it knows the law. But to suggest it knows the law is to infer something more. Current systems do not behave as if they know the law, particularly in ways relevant to practice, such as valuing truth.

To that end, I don’t know what the Nature article suggests AGI is, beyond a category choice. I worry about the inferences and projections enabled by words such as “knowing” and “intelligence,” which unhelpfully cloud our insight into the system. We can call a system AI, or AGI, or a word processor. A name is only useful if it helps us see the system clearly.

The 'as-if' problem

If we project understanding onto language, writing can be mistaken for evidence of a thoughtful collaborator, a source of guidance, or a sign of a task suitable for automation. The written word is a surface through which depth is transmitted, and so we may interpret it as the product of a mind that chooses words with awareness and discernment. A procedural mechanism designed to produce language superficially indistinguishable from human writing need not have human thoughts to produce text. Nor do we have to dismiss the capabilities of an LLM to dismiss the idea that it thinks: to deny that a machine is intelligent is not to claim it is not useful.

Stripping the illusion of thought from the system helps us decide whether language ordered by syntactic shuffling is suitable for a task, instead of assuming a “general” aptitude. Sometimes, functionality is a useful metric. Code runs or doesn’t run. A machine that behaves as if it can code is hard to distinguish from a machine that codes. If the code is functional, the machine that writes it is functional for coding tasks. The “as if” leaves room for error, but human coders make errors too.

Many language tasks are not defined by functionality, and it is inappropriate to solve them with complex language shuffling. In these tasks, the role of language is to convey something, not merely to produce more language. Using an LLM to write code is different from using one to write a study of police brutality. How “functional” must an LLM be to behave as if it has had the experience of a baton in its back?

We might say that the model behaves as if it can console a lonely person, because it can produce language. The user may feel consoled by the presence of their own projection onto the source of these words, feeling connected to it while becoming increasingly isolated. A therapist might suggest that the system does not function: the machine can provide language but cannot provide presence. The user may say it does. Deferring to intelligence, or “general” intelligence, offers no help in determining whether a machine “functions” across nuanced contexts.

The illusion of intent in language models

The Nature authors suggest that critics invoke language to prove that AI is constrained by it. They write that, by contrast, “language is humanity’s most powerful tool for compressing and capturing knowledge about reality.” Indeed, that is the problem with language: it compresses without knowledge and is decompressed by a reader who acts as if it expresses knowing. It’s not that language is limited, but that it is unlimited and untethered to anything but the prompt. That’s not exclusive to machines: I could brief you on a philosophy paper or some concept in astrophysics without mastery of either. Unless the listener is versed in the subject, the spectacle might resemble erudition.

If not intelligence, then what is happening within an LLM? The model abstracts language and reassembles it into legibility: the user’s prompt is restructured as a response. The original text is expanded through automated chain-of-thought, processed through a neural network, broken up and shuffled to a point of recompression, where this scramble of language is restructured to appear conversant and knowledgeable. Language models are optimized and fine-tuned with this goal in mind. Through this repeated process of abstraction and concretization, language shifts to resemble a thoughtful expansion and reframing of the user’s query. It’s true, and remarkable, that shuffling words into fuzzier arrangements can inspire new interpretations of our own ideas. Recursion is a powerful force.
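To make that mechanism concrete, here is a deliberately trivial sketch in Python of the selection step: choosing a next word in proportion to its estimated plausibility. The context string, candidate words, and probabilities below are invented for illustration; a real model derives such distributions from billions of learned parameters rather than a hand-written table, but the output is still a plausible continuation, not a judgment.

import random

# Hypothetical next-token probabilities for a single context (invented numbers).
next_token_probs = {
    "the cat sat on the": {"mat": 0.62, "sofa": 0.21, "roof": 0.12, "law": 0.05},
}

def next_token(context: str) -> str:
    """Sample the next token in proportion to its estimated plausibility."""
    candidates = next_token_probs[context]
    tokens = list(candidates)
    weights = list(candidates.values())
    return random.choices(tokens, weights=weights, k=1)[0]

# Most runs continue with "mat"; occasionally the less plausible neighbor appears.
print("the cat sat on the", next_token("the cat sat on the"))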

Through this mechanism, the production of new knowledge is indirect, bound not by the vast combinatorial power of the vector space but by the prompt window — and the sense we make of the output. It may seem that endless new ideas could arise from billions of interconnected vectors, but these are ultimately constrained to shuffling sentences through a final bottleneck of output.

The human is the first and last node in the network. Human understanding transforms text into intuitions. When a machine produces language through this fusion of mathematics and prose, it relies on the interpretation of the reader, who decodes the text and infers intuitions from it. Unfortunately, the machine can also be engineered to steer that interpretation. Machine text is orchestrated to look like human text, resembling language rather than using it.

Many critics suggest that LLMs are useless. I disagree. It is debatable whether they are worth the costs of their use, economically, environmentally, and socially, but users of an LLM do something both powerful and concerning: they engage with an alienated abstraction of their own thoughts. The powers attributed to an LLM do not come from within it. It does not understand a problem and cannot offer advice. The power it possesses is embedded, already, within the power of language. Automating this restructuring of language is a powerful and often useful illusion, with risks. The illusion can diminish our ability to distinguish truth from plausible fiction and unsettle the boundaries between our words, thoughts, evidence, and experience. An LLM operates by disorienting the mind’s ability to express and understand ideas. Paired with innate sycophancy and hallucinations, yet framed through ill-suited metaphors, the LLM is a powerful tool for imagination capture.

AI and the crisis of accountability

A more helpful question than “Is it intelligent?” might be: what does it mean to know? What does it mean to believe that others know? How does this impact trust in a society, and how might AI colonize those mechanisms to seize that trust? The public often assumes that the designation of AGI implies the subjective experience of knowledge: a self-aware state of knowing that it knows, expressed in the human-compatible format of language. This puts misplaced trust in these systems. If we ask a question demanding discernment, how do we imagine that discernment arises? Already, many people cannot discern for themselves that AI cannot discern. This is the risk we have to take seriously. It is an engineered spectacle, a question that has nothing to do with intelligence.

Of course, we cannot prove that others know or think or feel. We learn to trust anyway, by making minds familiar. We evaluate the thinking process behind decisions, confer with others, and work together to arrive at conclusions through multiple shared subjectivities. This kind of knowing is distinct from the useful but independent knowledge of storage, as we find in a book or on a hard drive, and from the knowledge of complex transformational rules, like that of an LLM. We need not agree on whether such a system is possible. We should agree that no system does this now. When we act as if it can, we build a world upon simulation, and disconnect our decision-making from the anchor of collective discernment.

Authors

Eryk Salvaggio
Eryk Salvaggio is a Gates Scholar researching AI and the humanities at the University of Cambridge and an Affiliated Researcher in the Machine Visual Culture Research Group at the Max Planck Institute, Rome. He was a 2025 Tech Policy Press fellow, and he writes regularly at mail.cyberneticforests.co...
