Perspective

Why Musk is Culpable in Grok's Undressing Disaster

Eryk Salvaggio / Jan 7, 2026

Eryk Salvaggio is a fellow at Tech Policy Press.

Elon Musk listens to President Donald Trump deliver remarks at the US-Saudi Investment Forum, Wednesday, November 19, 2025, at the John F. Kennedy Center for the Performing Arts in Washington, D.C. (Official White House photo by Daniel Torok)

Elon Musk’s Grok AI is flooding the social media platform X with user-requested images of non-consensually undressed women, including children. The resulting images—which are being generated at a rate of “about 6,700 every hour,” according to one analysis—can be viewed by visiting Grok’s X profile and scanning its replies and media tab. The response from xAI, which operates Grok and X, was initially to simply insist that “legacy media lies.” Since then, X’s Safety team stated that “[a]nyone using or prompting Grok to make illegal content will suffer the same consequences as if they upload illegal content.”

The corporate response implies that the system is a neutral tool, and that all responsibility for its outputs and the publication of problematic content falls squarely on its users, who can presumably be kicked off the platform or even face legal consequences. But unlike Adobe Photoshop or other generative AI systems for image and video generation, which generally do not automatically publish the images they generate to the internet, Grok is integrated into a social media platform that publishes its outputs directly.

This scenario raises a variety of questions related to potential liability, including whether xAI enjoys the intermediary liability protections afforded by Section 230 of the Communications Decency Act, and what responsibility it may face under other laws that contend with nonconsensual intimate imagery (NCII) and child sexual abuse material (CSAM). These are unsettled issues in the law, with some arguing that Section 230 protections should be extended to generative AI systems, even as the framers of that law disagree. The applicability of laws around CSAM and NCII remains “pretty murky,” as legal scholar Mary Anne Franks told The Verge.

As advocates press regulators to respond and policy experts consider the implications, it is important to look closely at how generative AI systems like Grok, combined in this instance with the social media platform X, actually work. Such close inspection reveals that Grok’s functionality on X is not a black box, but the result of specific design decisions made by executives and engineers at xAI that shape its outputs. Because the platform provides both the generative tool and the means of publication, the company—which reports to Musk, its founder and CEO—is meaningfully responsible for the content it produces and publishes in response to a user prompt.

The basic ‘model’

To understand why these outputs are the result of a set of choices by xAI rather than the products of a neutral tool, we should first unpack the layers that make up the model. While it’s true that generative AI models respond to user prompts, the resulting content does not come solely from users. Intuitively, we reduce a model to just two sources of influence: the training data, which provides loose rules for the model’s transformation of user inputs, and the user prompt, which triggers those rules to generate specific media.

From this simplified picture of the relationship between the user, the prompt, and the AI architecture, one might argue that the user is primarily responsible for illegal or unsavory outputs. But the technical reality is more complex. As Public Knowledge's John Bergmayer argues, “Their output is new information—content that so transforms the input material that is in their training sets that viewing LLMs as merely publishers of third-party content seems, frankly, disingenuous.”

It’s also essential to remember that the model is a product designed for specific purposes and deployed in selected contexts. There are additional layers to these purposes and contexts that must be unpacked to truly gauge the extent to which companies that build and deploy models are culpable for their outputs.

Human decisions that steer generative AI

There are several elements of AI architecture where companies exert significant influence, if not outright control, over the output of a generative AI model. While companies have less control over responses to individual prompts, they have enormous leverage in constraining the sphere of possible outputs. Arguably, this is the majority of what the generative AI industry does. Here are just a few examples.

Data curation

First, at the model training stage, companies face a set of decisions about how closely, if at all, they will curate training data. For instance, OpenAI has modified its own training data in the past. By fine-tuning its models on conversational data in question-and-answer formats, OpenAI made ChatGPT more likely to respond as if answering a question rather than simply extrapolating text. The company has also rewritten image descriptions using its GPT models, rebuilding its image-generation training data based on gaps it identified in scraped user descriptions, and it removed pornographic content from DALL-E’s training data using image recognition systems. The results may be imperfect, but the key is that companies choose and structure the data their models use, which directly affects how they behave.
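
To make this concrete, here is a minimal sketch of what such a curation pass might look like. The classifier, caption rewriter, threshold, and field names are hypothetical stand-ins rather than any company’s actual pipeline; the point is that every filtering rule is a choice someone writes down.

```python
# Hypothetical sketch of a data-curation pass applied before training.
# The classifier, rewriter, threshold, and field names are illustrative
# and do not reflect any company's actual pipeline.

def curate_training_data(examples, nsfw_classifier, caption_rewriter, threshold=0.5):
    """Apply company-chosen rules to raw scraped image-caption pairs."""
    curated = []
    for example in examples:
        # Design choice 1: drop images a classifier scores as explicit.
        if nsfw_classifier(example["image"]) >= threshold:
            continue
        # Design choice 2: replace weak scraped captions with rewritten ones.
        example["caption"] = caption_rewriter(example["image"], example["caption"])
        curated.append(example)
    return curated
```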

System prompts

Prior interventions into Grok make clear that direct, blunt editorial control over models is possible at the system level. In 2025, Grok briefly responded to user queries about nearly any topic with warnings about white genocide. The company later blamed the issue on a rogue employee toying with the system prompt. The system prompt is a standing instruction attached to every user prompt sent to a model; it is often used to set limits and boundaries on what the model can do, such as tasks to refuse, and it can also set the tone of the model’s reply.

Grok is already instructed to behave in a specific way: its system prompt is public. One of these instructions (line 13) tells Grok that, when generating images, a prompt containing the words teenage or girl “does not necessarily imply underage.” Simply by providing this definition, xAI may make the model more likely to interpret warnings against child abuse material as inapplicable to some prompts that reference the terms “teenage” and “girl.”

The importance of system prompts such as this goes a bit further: instructing a model on these terms specifically may operate as a kind of “don’t think of an elephant” instruction, making the model more likely not only to disregard those terms when evaluating whether content is age-appropriate, but also to emphasize or explicitly disregard “teenage” and “girl” in some outputs alongside the term “underage.” System prompts are imperfect constraints that are easily broken, but they do provide a layer of control over what users are permitted to do.
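
As an illustration of the mechanics, the sketch below shows how a system prompt might be attached to every user request before it reaches a model. The message format mirrors common chat-completion APIs, and the prompt text and function names are invented; nothing here is xAI’s actual code.

```python
# Minimal sketch of how a system prompt rides along with every user request.
# The message format mirrors common chat-completion APIs; the prompt text and
# function names are illustrative, not xAI's actual implementation.

SYSTEM_PROMPT = (
    "You are an image-generation assistant. "
    "Refuse requests for disallowed content."  # operator-written rules live here
)

def build_request(user_prompt: str) -> list[dict]:
    # The operator's instructions are attached, invisibly to the user, to
    # every prompt submitted, so the company's text shapes every output.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]
```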

Shadow prompting

Another source of editorial influence is prompt transformation, or shadow prompting. Shadow prompting allows companies to invisibly restructure user requests, whether to optimize performance or to filter illegal content. How this intervention operates, what it does, and the extent of those transformations are design decisions taken by engineers. Most users of generative AI products have no idea what specific interventions take place when they pose a prompt. In such systems, the user’s prompt is not even directly responsible for producing the content: the system rewrites that prompt, unbeknownst to the user, and passes the new prompt on to the next stages, at which point the content is generated and published.
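
A rough sketch of how such a pipeline might be wired together appears below. The rewriting model, image model, publishing step, and rewrite instruction are hypothetical placeholders; what matters is that the prompt the user typed is not the prompt that produces the image.

```python
# Hypothetical sketch of prompt transformation ("shadow prompting"). The
# rewrite_model, image_model, and publish callables stand in for whatever
# components a company chains together; the rewrite instruction is invented.

def generate_and_publish(user_prompt, rewrite_model, image_model, publish):
    # The user's words are rewritten before they ever reach the image model.
    hidden_prompt = rewrite_model(
        "Rewrite this request as a detailed, policy-compliant image prompt: "
        + user_prompt
    )
    image = image_model(hidden_prompt)
    # The user never sees hidden_prompt, yet it, not their original text,
    # determined what was generated and posted.
    publish(image)
    return image
```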

Post-generation interventions

Many models restrict offensive or illegal content by applying image recognition systems to the output before displaying the image to the user or publishing it on a public social media feed. Grok already marks images as NSFW; users must click the image to view it. This is a content decision: the same image recognition tool could be used to hide the image or block it from being shared on the feed. Creating a system that detects such content (often poorly), but choosing to allow that content to remain publicly accessible nonetheless, is a human decision about the publication and distribution of material known to be unsafe. Choosing to flag content as NSFW rather than blocking its generation is an editorial decision, arguably making the platform a co-developer as well as the publisher of the content.
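
The sketch below illustrates how thin the line is between flagging and blocking once a detector exists. The classifier, threshold, and return values are hypothetical; the decision comes down to a single branch.

```python
# Sketch of a post-generation check. The classifier, threshold, and return
# values are hypothetical; the point is that "flag but publish" versus
# "block" is a one-line policy choice, not a technical limitation.

def handle_output(image, nsfw_classifier, flag_instead_of_block=True):
    if nsfw_classifier(image) > 0.8:          # the detector already exists
        if flag_instead_of_block:
            return {"action": "publish", "label": "NSFW"}  # flagged, still public
        return {"action": "block", "label": None}          # never published
    return {"action": "publish", "label": None}
```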

The purpose of an AI system is what it does

Generative AI is too often described as uncontrollable. I previously wrote on Tech Policy Press about the “Black Box Myth,” and how companies deploy it to their advantage. But AI architecture is, at its core, a system for shaping words into the outcomes somebody considers desirable. Companies are developing a variety of systems that steer the swarm of words and pixels toward more predictable outcomes, which they define in terms of their own interests.

These interests include profit, but models can also be steered to act as ideological tools of oppression, as when they are presented as a looming threat to women, who may no longer feel free to assert opinions that could inspire hostile users to humiliate them with such tools.

We are living in a highly misunderstood era of generative AI, in which most of the public’s (and policymakers’) understanding of how the technology works lags behind reality or is conflated with developments in other fields of AI that operate under dissimilar frameworks. It is equally troubling to oversimplify the complex architecture of these models, especially when doing so obscures the degree of human involvement and accountability in shaping the outcome.

Grok did not emerge organically. It is heavily shaped by the decisions of the people responsible for it. These decisions relate to how the model behaves, what is allowed, and what is restricted. As such, Musk and his team at xAI bear significant responsibility for what Grok produces and what X publishes in response to user prompts. To the extent that the model is permitted to undress users of Musk’s own platforms against their will, the system’s purpose seems clear. It could easily refuse to harass women. Instead, it is explicitly allowed to do so and has been deployed within a system in which it most certainly will.

Authors

Eryk Salvaggio
Eryk Salvaggio is a Gates Scholar researching AI and the humanities at the University of Cambridge and an Affiliated Researcher in the Machine Visual Culture Research Group at the Max Planck Institute, Rome. He was a 2025 Tech Policy Press fellow, and he writes regularly at mail.cyberneticforests.co...

Related

Podcast
The Policy Implications of Grok's 'Mass Digital Undressing Spree' (January 4, 2026)
Analysis
Tracking Regulator Responses to the Grok 'Undressing' Controversy (January 6, 2026)
Perspective
Chatbot Grok Doesn’t Glitch—It Reflects X (July 28, 2025)
