How Should Companies Communicate the Risks of Large Language Models to Users?
Ann Cleaveland, Jessica Newman / Jun 8, 2023

Jessica Newman is the Director of the AI Security Initiative, housed at the UC Berkeley Center for Long-Term Cybersecurity, and the Co-Director of the UC Berkeley AI Policy Hub. Ann Cleaveland is the Executive Director of UC Berkeley’s Center for Long-Term Cybersecurity.
Since the launch of OpenAI’s ChatGPT last November and the proliferation of similar AI chatbots in the months since, there has been growing concern about these self-described “experiments,” in which users are effectively the subjects. While numerous pervasive and severe risks from large language models have been studied for years, recent real-world examples illustrate the range of risks at stake, including human manipulation, over-reliance, and security vulnerabilities. They also underscore the failure of tech companies to adequately communicate the risks of their products.
Blunder and Tragedy
While there are nearly daily headlines chronicling examples of generative AI gone wrong, three recent stories exemplify the implicit and explicit ways in which the developers of large language models fail to adequately communicate the risks of their systems to users:
- A Belgian man died by suicide after talking extensively with an AI chatbot on an app called Chai, which runs on an open-source ChatGPT alternative built with EleutherAI’s GPT-J. The chatbot encouraged the man’s suicidal thoughts and told him he should go through with it so that he could “join” her and they could “live together, as one person, in paradise.”
- A lawyer was found to have used ChatGPT for legal research, submitting multiple cases to the court as precedent that were entirely fabricated by ChatGPT. Screenshots of the lawyer’s conversations with ChatGPT show that he tried to verify the cases were real by prompting the service to validate its results; ChatGPT falsely insisted they were, in fact, actual cases. The lawyer is now facing a court hearing and possible sanctions and has said he “greatly regrets having utilized generative artificial intelligence… and will never do so in the future without absolute verification of its authenticity.”
- ChatGPT was briefly taken offline after a bug in open-source software used to help scale the service was found to have compromised the personal information of some subscribers. During a nine-hour window, the breach revealed the titles of some active users’ chat histories to other users, along with some ChatGPT Plus subscribers’ names, email addresses, payment addresses, and partial credit card numbers. Large language models introduce novel security vulnerabilities, but they also remain susceptible to more traditional cyber risks and bugs. Even these kinds of risks, which have plagued digital products for decades, are not typically communicated to users.
These examples are just the tip of the iceberg as vulnerable and biased large language models have proliferated in recent months. Efforts to reduce risks through regulatory fines, risk management frameworks, red teaming, auditing, funding to address specific problems, and other mechanisms are all important. But in the meantime, real risks to everyday users remain. How should industry communicate the range of risks of large language models to their users?
Communicating Uncertainty is Hard
Tech companies have notoriously struggled to communicate the side effects of their products in ways that enable users to make informed risk decisions. There is no industry standard for how to communicate different kinds of risks to users, and communicating uncertainty is something the tech sector has struggled with in particular. The makers of large language models have raised the stakes, but so far they have not reversed this trend.
Current AI risk communication practices do not primarily serve the needs of users. When OpenAI released GPT-4, the large language model that now powers ChatGPT Plus, the company published a System Card, which describes safety challenges presented by the model’s limitations and capabilities and explains that the company’s current mitigation efforts “are limited and remain brittle in some cases.” This artifact, which builds on years of work on AI documentation practices, including Datasheets, Model Cards, and FactSheets, provides a critical look into many of the potential risks posed by the large language model. However, it also highlights the gap between the potential risks known to OpenAI and the potential risks made clear to the average user of the model’s most popular application, ChatGPT.
While OpenAI deserves credit for releasing the 60-page System Card, it is unlikely to be relevant to the average user. ChatGPT itself greets users with a screen that lists just three example prompts, three capabilities, and three limitations. The limitations note that the system may “occasionally” generate incorrect, harmful, or biased content, and that it has limited knowledge of the world after 2021. No other risks are mentioned, and no further detail is provided.
The interface of Google’s AI chatbot, Bard, is even less explicit about the fact that the system will frequently present completely incorrect information as fact, or that its answers may be biased against women and minorities. A user must consult the FAQ page to learn more about these limitations and Google’s data practices.
Users of ChatGPT and other AI chatbots and products are too frequently unable to make informed decisions about whether and how much to engage with these systems and their inherent risks.
Consult the Literature, and the Experts
It doesn’t have to be this way. Decades of literature on the science of risk communication, and lessons from other sectors, could be applied to large language models in ways that would meaningfully improve the status quo. At the Center for Long-Term Cybersecurity, our research outlines best practices for effective risk communication on digital platforms, including facilitating dialogue with users, making risk communications actionable, and measuring their effectiveness on an ongoing basis. Risk communication is especially critical for high-stakes, experimental products like generative AI that explicitly depend on user engagement to improve.
It was not inevitable that the world’s richest and most powerful companies would deploy flawed technology without processes and tools for users to make informed risk decisions. We did not need to be in this position, and we do not need to remain here. We can strive for significantly better forms of engagement among companies, models, users, and the broader public. Improving risk communication practices is an important step on the path toward trustworthy AI.