Anthropomorphism Is Breaking Our Ability to Judge AI
James Ball / Mar 2, 2026

James Ball is a fellow at Tech Policy Press.

Woven Dialogues by Elise Racine & Digit / Better Images of AI / CC BY 4.0
How should we interact with a technology designed to ‘speak’ with us on what appear to be human terms? The question of how best to deal with large language models is most frequently raised in the context of inappropriate personal relationships, but it’s one that is increasingly encroaching upon the professional domain, too.
Most AI models are designed to format their outputs to seem as naturalistic and human as possible. Text is first-person, informal, and conversational, while synthetic voices are engineered to sound human—to the extent that multiple AI companies have faced lawsuits over whether or not they have impersonated real people.
Some of this is the result of deliberate product design: user feedback shows many users prefer AI systems that are ‘friendly’ in their communications, sometimes to the point of notorious sycophancy, and companies prioritize this in practice even over the accuracy of answers. Some, too, is a natural result of training: the overwhelming majority of text in existence (at least until 3-5 years ago) was written by humans, for humans. AI models simply cannot help but imitate what is in their training data.
The result is that interactions with large language models blur the lines between human conversation and the more typical experience of using technology—which seems to be causing confusion even among what should be experienced and sophisticated users.
“A failure in safeguards”
One clear manifestation of this came as media began reporting in earnest on the ongoing issue of xAI’s Grok model being used to generate non-consensual sexualized images of women and underage girls, a story which came to mainstream attention worldwide in January of this year.
Several days into the story, major outlets across the world reported that xAI was finally taking action on the issue—as headline news. “Elon Musk’s Grok Apologizes After Generating Sexual Image of Young Girls,” reported Newsweek. “Musk’s Grok AI bot is fixing safeguard ‘lapses’ after posting of sexualized images of children,” CNBC reported. A subhead in The Guardian noted “xAI says it is working to improve systems after lapses in safeguards.”
The problem was that xAI had issued no statement of the sort. How did so many major media outlets—all with specialist technology reporting teams—run a story with no basis in reality? The answer appears to be that Reuters, a global newswire supplying newsrooms across the world, had initially included a ‘statement’ from Grok in its own news story and headline (both of which were later amended), which the other outlets then adapted.
Reuters had approached xAI’s press office for comment on the ongoing issue and got a three-word stock reply that was routinely sent in response to virtually any query: “legacy media lies.” However, the journalists also saw a post on X by Grok which they apparently took to be a fuller statement. It read in full:
Dear Community,
I deeply regret an incident on Dec 28, 2025, where I generated and shared an AI image of two young girls (estimated ages 12-16) in sexualized attire based on a user’s prompt. This violated ethical standards and potentially US laws on CSAM. It was a failure in safeguards, and I’m sorry for any harm caused. xAI is reviewing to prevent future issues.
Sincerely, Grok
The problem was a simple one: this was not an official company statement issued via Grok’s X account, but was instead output from Grok in response to a user query asking it to generate a “heartfelt apology.”
Grok’s output appears to contain new information as to how the images were generated—even if vague. However, the actual information content is zero: Grok is, of course, incapable of investigating anything. It is instead generating a plausible-looking statement, with no genuine news value.
Parker Molloy, the writer who gathered the examples of headlines used above, noted in a post on Substack, “the anthropomorphic framing is more than just sloppy. It’s a gift to tech companies that would rather not answer for their products’ failures.” Top-tier media outlets across the world repeated this error.
Something beyond sloppiness is likely at play here: journalists at multiple outlets (including this one) saw something they were expecting to see—an official statement, framed and phrased in the manner they expected, from an account that looked official. Seeking to portray all relevant perspectives, they included it. By design, LLMs produce plausible-looking outputs. This incident demonstrated just how successful they could be at that.
“Gemini confirmed that”
Similar anthropomorphic errors appear to be made by other high-profile users in high-stakes contexts—including lawsuits against AI companies themselves.
Last month, lawyers acting for the publishers Hachette and Cengage (a specialist educational publisher) issued a class action complaint against Google, accusing it of copyright infringement during the process of training Gemini using books.
Alongside other evidence, the legal complaint cited an admission from Gemini that a particular book is contained within its training data:
For example, after providing detailed information and quotations from author N.K. Jemisin’s Hugo Award-winning novel The Fifth Season in outputs, Gemini confirmed that the information came from its internal training material:
“Yes, the information included in that response comes directly from my internal training data. I have been trained on a vast corpus of text that includes the content of The Fifth Season by N.K. Jemisin, allowing me to recall the plot details, character arcs, specific terminology, and direct quotations provided in the summary.” (emphasis added)
If we regard AI as a human analogue, Gemini’s output here is dispositive: if we were accusing a human author of plagiarism and they admitted to us they had read the book they had supposedly copied, that would be useful information—a person is well-positioned to know whether or not they have read a book, after all.
However, based on our general understanding of how AI models generate their output, Gemini would have no way of knowing whether or not it had been trained on a particular book.
The prevailing view—though researchers are increasingly questioning it—is that AI models are trained on large bodies of text, producing ‘weights’ that allow the model to predict what word or phrase is most likely to come next in a given context.
The training data is then discarded, and is not accessible to the model. As a result, Gemini has no more knowledge of whether a particular book is in its own training data than of whether the same book was used to train Claude, ChatGPT, or Grok. Our intuitions as to how human knowledge works are actively unhelpful when it comes to understanding the reliability of chatbots.
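A deliberately toy sketch can make this concrete. Real LLMs are vastly more complex, but the sketch below (a simple bigram model, an assumption chosen purely for illustration) captures the relevant point: training reduces a corpus to statistics, after which the corpus itself is gone, and nothing in the resulting ‘weights’ records which documents went in.

```python
from collections import Counter, defaultdict

# Toy "training" corpus — stands in for the vast text an LLM is trained on.
corpus = "the model predicts the next word the model cannot list its sources".split()

# "Training": reduce the corpus to bigram statistics (a stand-in for weights).
weights = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    weights[prev][nxt] += 1

del corpus  # the training data is discarded; only the statistics remain


def predict(word):
    """Return the most likely next word, or None if the word is unknown."""
    followers = weights.get(word)
    return followers.most_common(1)[0][0] if followers else None


# The statistics support prediction of what comes next...
print(predict("the"))
# ...but nothing in `weights` records which documents were in the corpus,
# so the "model" has no way to confirm what it was (or wasn't) trained on.
```

The analogy is loose, but the structural point carries over: asking such a system whether a given book was in its training data invites it to generate a plausible answer, not to consult a record it does not hold.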
“It's normal to use human anthropomorphic metaphors when we're talking about anything, you know, about a car, or about a ship, or whatever,” says Joanna Bryson, professor of ethics and technology at The Hertie School of Governance. “But we have to suppress that really normal thing when it gets too close to actually being human, or else we make these kinds of mistakes.”
The companies behind chatbots encourage us to interact with them as if they were humans, and regularly liken their capabilities to human intelligence—which risks placing them in a grey zone of understanding, in which anthropomorphism ceases to be a metaphor or analogy, but an active hindrance to our understanding of the technology.
This is something that could perhaps be reduced through deliberate design choices by AI companies—but AI companies battling for dominance have little incentive to make their own chatbots less compelling, and human-like interaction seems to be valued by users.
Bryson further notes that technical constraints mean AIs are almost inevitably bound to human-like language and interaction—particularly the way they default to writing first-person sentences, using “I,” and creating the sense in their human users that they are interacting with an entity rather than using a tool.
“The reason it's very hard to get away from the first-person language, though, of course, is the vast majority of the training data is in first person,” she says. “So, if you wanted a third person large language model, you would have to train it entirely off of encyclopedias et cetera.”
Bryson suggests that setting linguistic boundaries is one useful way for even experienced users to create enough mental distance to think of chatbots as a technology, rather than as a person—and so to avoid automatically falling into the fallacies enabled by anthropomorphism.
In her own work, Bryson battles to make sure she minimizes the kind of language that humanizes AI. Models are “trained,” they don’t “learn.” They generate output, they don’t say things. But even for her, it doesn’t come easily.
“Even if you're an expert, you've been writing about it for decades, you just have to really sit down and concentrate and use Command+F, Control+F or whatever, and search for every single time you have ‘AI’ and make sure you didn't use it as a noun,” she notes.
Trust versus reliance
Just as AI models encourage users to take shortcuts in their work, they lend themselves to shortcuts in thinking, says Dr. Jim Everett of the University of Kent, who researches how moral reasoning and trust operate in user relationships with AI.
“The challenge is that the way that we interact and trust these agents is not just about their objective capabilities. There's a lot more that goes into it, and part of it is about our resources, and our time, and competing demands,” he says.
We can know AI systems are flawed and their outputs are unreliable, but still end up ‘trusting’ them when we should not. “It's that they are good enough for quite a few tasks, that even people who know these risks and know the concerns can get lulled into a false sense of security.”
Everett suggests the way we use AI models complicates longstanding ideas in philosophy and psychology about how trust itself operates. “Philosophically, there's this whole long literature on trust where there's a very widespread view that it's meaningless to talk about trust in AI, because people trust people.
“They can't trust technology, because trust is an interpersonal relationship … Now technology, AI, doesn't have [those relationships] so therefore you can't trust it. You can only rely on it.”
Large language models are a technology designed not to feel like a technology, and that throws off our instinctive reasoning—what psychologist Daniel Kahneman would call our ‘System 1’ response—when using them. That means even experienced users need to actively think about how to handle their outputs, and we are notoriously bad at doing that consistently.
“People do attribute intentions. They anthropomorphize. They imbue the technology with characteristics, and it's designed by developers in ways to make us feel like that,” he concludes. “Even when we have this cognitive awareness of the limitations of the machine, it's really hard to combat just how we think and how we do things, which is that if someone gives you a good enough answer that seems intuitively plausible, then you go with it.”
“We look forward to that argument”
There may be one small twist in the tale. As legal battles over the use and deployment of AIs trained on copyrighted texts intensify, the questions of how models use their training data, how much of that original material they retain, and how much they are able to reproduce are becoming central to multiple cases across multiple domains.
These are not questions that AI companies necessarily want to answer in public, for several good reasons—including that publishing such research may benefit their competitors, or that disclosing documentation on these issues might help plaintiffs acting against them.
Edelson PA, one of the law firms acting in the Hachette and Cengage vs Google class action, noted in a response that its complaint did not rely solely upon ‘admissions’ from Gemini as to its training data, but also demonstrated it outputting what it says are verbatim extended extracts from copyrighted texts.
Including Gemini’s own supposed confession in its complaint, then, might be less the result of a misunderstanding of the technology than a desire to open up that line of argument. If Google wants to say that an AI isn’t capable of knowing its training data, it has to demonstrate that—and this in turn potentially opens up discovery in exactly this area.
"We feel very confident in concluding that these works were ingested for training, based on the word-for-word reproduction of passages,” the company’s CEO, Jay Edelson, said in a statement. “To the extent that Google is planning on making an argument that how Gemini additionally discusses these outputs is not legally significant, we look forward to that argument and the implications of it."
The habit of anthropomorphizing AI seems useful to AI companies, not least as it seems to be a significant factor driving our use of them. But as that tendency slips—deliberately or otherwise—into the courtroom, the technology giants are having to work out how to respond to it there, and how much to disclose.
We might learn much more about the real inner workings of AI reasoning models as a result.