Artificial intelligence hallucinates. So we are told by news headlines, think pieces, and even the warning labels on AI websites themselves. It’s by no means a new phrase. As early as the 1980s, the term was used in the literature on natural language processing and image enhancement, and in 2015 no article on the acid phantasmagoria of Google’s DeepDream could do without it. Today, large language models such as ChatGPT and Bard are said to “hallucinate” when they make incorrect claims not directly based on material in their training sets.
The term has a certain appeal: It uses a familiar concept from human psychiatry as an analogy for the falsehoods and absurdities that spill forth from these computational machines. But the analogy is a misleading one. That’s because hallucination implies perception: It is a false sense impression that can lead to false beliefs about the world. In a state of altered consciousness, for example, a person might hear voices when no one is present, and come to believe that they are receiving messages from a higher power, an alien intelligence, or a nefarious government agency.
A large language model, however, does not experience sense impressions, nor does it have beliefs in the conventional sense. Language that suggests otherwise serves only to encourage the sort of misconceptions that have pervaded popular culture for generations: that instantiations of artificial intelligence work much like our brains do.
If not “hallucinate,” then what? If we wanted to stick with the parlance of psychiatric medicine, “confabulation” would be a more apt term. A confabulation occurs when a person unknowingly produces a false recollection, as a way of backfilling a gap in their memory. Used to describe the falsehoods of large language models, this term moves us closer to what is actually going wrong: It’s not that the model is suffering errors of perception; it’s attempting to paper over the gaps in a corpus of training data that can’t possibly span every scenario it might encounter.
But the terms “hallucination” and “confabulation” both share one big problem: As used in medicine, they each refer to states that arise as a consequence of some apparent malfunction in an organism’s sensory or cognitive machinery. (Importantly, perspectives on what hallucinations and confabulations are — and how they manifest — are profoundly shaped by cultural and social factors.)
The “hallucinations” of large language models are not pathologies or malfunctions; rather they are direct consequences of the design philosophy and design decisions that went into creating the models. ChatGPT is not behaving pathologically when it claims that the population of Mars is 2.5 billion people — it’s behaving exactly as it was designed to. By design, it makes up plausible responses to dialogue based on a set of training data, without having any real underlying knowledge of things it’s responding to. And by design, it guesses whenever that dataset runs out of advice.
A better term for this behavior comes from a concept that has nothing to do with medicine, engineering, or technology. When AI chatbots flood the world with false facts, confidently asserted, they’re not breaking down, glitching out, or hallucinating. No, they’re bullshitting.
Bullshitting? The philosopher Harry Frankfurt, who was among the first to seriously scrutinize the concept of bullshit, distinguishes between a liar, who knows the truth and tries to lead you in the opposite direction, and a bullshitter, who doesn’t know or care about the truth one way or the other. A recent book on the subject, which one of us co-authored, describes bullshit as involving language intended to appear persuasive without regard to its actual truth or logical consistency. These definitions of bullshit align well with what large language models are doing: The models neither know the factual validity of their output, nor are they constrained by the rules of logical reasoning in the output that they produce. And this is the case even as they make attempts toward transparency: For example, Bing now adds disclaimers that alert us to its potential for error, and even cites references for its answers. But like supercharged versions of the autocomplete function on your cell phone, large language models are making things up, endeavoring to generate plausible strings of text without understanding what they mean.
One can argue that “bullshitting” — which involves deliberate efforts to persuade with willful disregard of the truth — implies an agency, intentionality, and depth of thought that AIs do not actually possess. But maybe our understanding of intent can be expanded: For ChatGPT’s output to be bullshit, someone has to have intent, but that someone doesn’t have to be the AI itself. Algorithms bullshit when their creators design them to impress or persuade their users or audiences, without taking care to maximize the truth or logical consistency of their output. The bullshit is baked into the design of the technology itself.
“Bullshitting” may not have the same air of scientific formality as “hallucination” or “confabulation,” but it is closer to the truth of the matter. And the language we use for AI is consequential. It shapes the way we use and interact with the technology, and how we perceive its errors.
There is a gulf between what AI technologies do and what the average user understands them to do. This problem is not unique to AI; it plagues many modern technologies. We’ve learned to live with the comforts — and discomforts — of allowing black boxes to run our lives, from smartphones to video games, and now large language models.
But when it comes to AI, the gap between how the technology works and what we know about it carries higher stakes. How might ChatGPT influence practices like education and clinical medicine, long defined by meaningful human interactions between experts (teachers and clinicians) and the people they serve (students and patients)? When ChatGPT creates presumptive medical facts on the fly (bullshitting, perhaps, about which drug regimen is best for a patient in an examination room next door), the consequences could be corporeal, and dire.
The language we use to discuss the mistakes AI has made and will continue to make is critically important — not to spur symbolic protest against modern technology or fear-mongering over a pending “war against the machines,” but to sharpen the way we think about many sorts of complicated interactions between human users and AI algorithms that seem inevitable. If we wait too long, the flaws may become too big to fix, and our lives beset by problems that we don’t have the right words to describe.
Carl T. Bergstrom is a professor of biology at the University of Washington in Seattle and the coauthor of “Calling Bullshit: The Art of Skepticism in a Data-Driven World.”
C. Brandon Ogbunu is an assistant professor in the Department of Ecology and Evolutionary Biology at Yale University, and an MLK Visiting Assistant Professor in the Department of Chemistry at the Massachusetts Institute of Technology.