A scientist holds a multiwell plate with cell cultures infected by coronavirus in a Biosafety Level 3 laboratory (high-security laboratory) at the Helmholtz Centre for Infection Research HZI. Photographed in May 2020.

The Challenges of Calculating a Lab Leak Risk


On a sunny September morning in 2011, at a conference on the Mediterranean island of Malta, virologist Ron Fouchier made an announcement that shook the scientific world. His lab, he said, had taken H5N1 avian influenza virus — which kills around 60 percent of people with known cases, but which cannot spread easily from person to person — and altered it to transmit among mammals.

He had created, he told a reporter later that year, “probably one of the most dangerous viruses you can make.”

Fouchier and others have said such research can help scientists prepare for future pandemics. But several thousand miles away in Massachusetts, Lynn Klotz reacted with concern. A physical biochemist, Klotz was on the Harvard University faculty in the 1970s, during contentious disputes about recombinant DNA research. Since 2005, he has been a senior science fellow at the nonprofit Center for Arms Control and Non-Proliferation, where he has written about biological weapons.

Other scientists were also expressing alarm about the risks of Fouchier’s experiments, which were performed at the Erasmus Medical Center in the Netherlands, and of similar research at a University of Wisconsin-Madison lab. Klotz decided to search for an actual figure: What were the odds that a virus would escape from a high-containment lab, and then spark a global pandemic?

Klotz began by scouring the academic literature, searching for records of laboratories working with viruses that are worrisome for pandemics: SARS coronaviruses or especially risky strains of influenza. Then, based on a government risk assessment and past incidents of escape at high-containment laboratories, he estimated the probability that laboratory workers could accidentally become infected with a pathogen — rare incidents that, at least in theory, might seed outbreaks.

Overall, Klotz estimated in 2012, there was perhaps a 50 percent chance that a pathogen would escape from one of those laboratories in the next five and a half years. His less conservative estimates calculated an even greater likelihood of an escape. (Klotz now describes these results as out-of-date.) Whether that leak would lead to a pandemic or simply fizzle out, he and a collaborator wrote in an article that year, was uncertain, but the risk was “too high.”

Not everyone agrees the risk is so steep, and federal officials point to the extensive measures that high-containment laboratories take to mitigate risks. But, a decade later, such questions are in the news again, amid concerns that the Covid-19 pandemic could have emerged from a laboratory accident. Now 81 years old, Klotz continued to analyze lab accident risk from his home on the New England coast until last year. He has been joined by other scientists, who use the methods of formal risk analysis to estimate the odds of a lab escape.

The figures they have produced can range dramatically. And some experts say producing reliable figures about the risks of a lab-induced pandemic is impossible with existing data, and perhaps even counterproductive.

At the very least, some analysts say, the ranges highlight the large unknowns that remain about laboratory safety — and the challenges of using specific risk estimates to make sense of the complexities of human error and system failures.

In 2014, after a string of embarrassing safety lapses at government laboratories, the federal government placed a moratorium on funding studies that, like Fouchier’s work on H5N1, give a pathogen new, enhanced properties. (Such work is called gain-of-function research, although the exact definition of the term is contested.)

Shortly afterward, two prominent infectious disease experts, Marc Lipsitch and Tom Inglesby, called for a rigorous, quantitative analysis of the risks of such research, “so as to provide specific calculations and information to inform decisions.” (Both scientists assumed senior pandemic response roles in the federal government, at the Centers for Disease Control and Prevention and the White House, respectively, though Inglesby left his position recently; neither was available for comment.)

In their paper, Lipsitch and Inglesby also came up with their own risk assessment, following a recipe similar to the one Klotz has used. It goes something like this: First, the risk analyst draws on historical records of laboratory accidents to estimate the frequency of escapes. In one recent paper, for example, Klotz drew on a series of nearly 200 incident reports that he received via a Freedom of Information Act request from the National Institutes of Health in 2017. (Klotz’s paper, published on the website of the Center for Arms Control and Non-Proliferation, was not peer-reviewed.)

A summary of the reports, which he shared with Undark, is a catalog of mishaps: Researchers have pricked themselves with contaminated needles, dropped plates of monkeypox-infected cells, and spilled a vial of Rift Valley Fever virus. In one 2013 incident, a lightly anesthetized mouse, infected with a genetically altered strain of SARS virus, slipped from a researcher’s hands and ran underneath a lab freezer. (The scientists eventually caught it.) Laboratories maintain extensive safety precautions, and most of these incidents did not cause an infection. But the reports reveal several cases where workers left their laboratories and later tested positive for tuberculosis, tularemia, and other diseases.

Based on the number of those infections that occur in a particular time period, analysts can approximate the probability of such accidents. Next, analysts estimate the odds that an infection from a lab accident would actually start a pandemic. That can depend on many factors, such as the transmissibility of the pathogen and the location of the laboratory, which are fed into sophisticated epidemiological models. Finally, risk analysts try to estimate the number of deaths such a pandemic would cause, based on the mortality rates of various pathogens.
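The three-step recipe described above can be sketched as a simple multiplication. This is only an illustration of the structure of such an estimate, not any analyst's actual model: the function name is hypothetical, and every input value is a made-up placeholder rather than a figure from Klotz, Lipsitch, Inglesby, or any report.

```python
def expected_annual_deaths(infections_per_year,
                           p_pandemic_given_infection,
                           deaths_per_pandemic):
    """Combine the three steps of the recipe into one expected value.

    infections_per_year: estimated rate of lab-acquired infections (step 1)
    p_pandemic_given_infection: chance one infection seeds a pandemic (step 2)
    deaths_per_pandemic: projected deaths if a pandemic occurs (step 3)
    """
    pandemics_per_year = infections_per_year * p_pandemic_given_infection
    return pandemics_per_year * deaths_per_pandemic


# Placeholder inputs, chosen only to show the arithmetic (ASSUMPTIONS).
illustrative = expected_annual_deaths(
    infections_per_year=0.2,
    p_pandemic_given_infection=0.01,
    deaths_per_pandemic=1_000_000,
)
print(f"Expected deaths per year (illustrative inputs only): {illustrative:,.0f}")
```

Because the three factors are multiplied, uncertainty in any one of them carries straight through to the final figure, which is part of why different analysts can reach very different answers.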

At times, these assessments can grow elaborate. For instance, a 1,021-page risk/benefit analysis of gain-of-function research, commissioned by the U.S. government, estimates a probability for various kinds of mishaps, and then builds complex models to simulate the odds that those events line up to permit a catastrophe. One of their models, attempting to simulate the release of a bird-transmissible virus from a facility, factors in the volume of air inhaled each minute by a typical duck.

Risk analyses may be rigorous, but they can involve a lot of subjective decisions. Which accident data is relevant? When scientists disagree on the lethality of a specific virus, whose results are the most believable? Should the model factor in malicious actors stealing a pathogen — and, if so, how?

“What people often fail to appreciate is how many underlying assumptions there are in these risk analyses,” said Daniel Rozell, a researcher at Stony Brook University and the author of “Dangerous Science,” a 2020 book on science policy and risk analysis. “Very well-informed and reasonable people will often look at the data in totally different ways and come up with entirely different assessments.”

Indeed, risk analyses of pandemic pathogens can vary widely in their conclusions. In their 2014 paper, Lipsitch and Inglesby estimated that for each year of experimentation “on virulent, transmissible influenza virus,” a single laboratory had a 0.01 to 0.1 percent chance of causing a pandemic. That pandemic, they projected, would kill between 20 million and 1.6 billion people.
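To get a feel for what those two ranges imply together, one can multiply them into an expected death toll per laboratory per year. The sketch below is a back-of-the-envelope calculation, not a reproduction of the paper's own method; pairing the low probability with the low death toll, and the high with the high, is an assumption made here to bracket the result.

```python
# Figures quoted above from Lipsitch and Inglesby's 2014 paper.
p_pandemic_low = 0.01 / 100    # 0.01 percent per lab-year, as a probability
p_pandemic_high = 0.1 / 100    # 0.1 percent per lab-year, as a probability
deaths_low = 20_000_000        # 20 million
deaths_high = 1_600_000_000    # 1.6 billion

# ASSUMPTION: pair low with low and high with high to bracket the range.
expected_low = p_pandemic_low * deaths_low
expected_high = p_pandemic_high * deaths_high

print(f"Expected deaths per lab-year: roughly {expected_low:,.0f} "
      f"to {expected_high:,.0f}")
```

An expected value of this kind averages a very rare, very large catastrophe over many uneventful years, so it should not be read as a forecast for any single year.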

Fouchier, in a reply, said they had ignored crucial safety measures in place at many labs. There, he argued, the odds of seeding an outbreak in a given year were more like 0.0000000003 percent. Such an event, he continued, “would be expected to occur far less frequently than once every 33 billion years.”

“This probability could be assigned the term ‘negligible,’” he added, “given that the age of our planet is only 5 billion years.”

Some experts found that estimate implausible, based on the historical record. SARS viruses alone, for example, have escaped from laboratories at least five times in the past 20 years. (Fouchier declined to comment for this story.)

Perhaps the most authoritative estimates come from the federal government-commissioned risk/benefit analysis. Completed in 2016 by a company called Gryphon Scientific, the report estimated that the odds of a lab causing a pandemic were low, but not zero. An accidental infection acquired from a U.S. influenza or coronavirus laboratory could be expected once every 3 to 8.5 years, with a roughly 1 in 250 chance the incident would lead to a global pandemic.
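The two figures quoted above can be combined into a rough average interval between lab-caused pandemics. The sketch below is an illustration, not a result stated in the Gryphon report: it assumes infections arrive at a constant average rate and that each one independently has the same 1 in 250 chance of seeding a pandemic, and the function names are hypothetical.

```python
import math

# Figures quoted above: one lab-acquired infection every 3 to 8.5 years,
# each with a roughly 1 in 250 chance of leading to a global pandemic.
YEARS_PER_INFECTION = (3.0, 8.5)
P_PANDEMIC_GIVEN_INFECTION = 1 / 250


def years_per_pandemic(years_per_infection, p_pandemic=P_PANDEMIC_GIVEN_INFECTION):
    """Average years between pandemics if each infection independently
    seeds one with probability p_pandemic (ASSUMPTION)."""
    return years_per_infection / p_pandemic


def p_at_least_one(years, mean_interval):
    """Chance of at least one event within `years`, assuming events follow
    a Poisson process with the given mean interval (ASSUMPTION)."""
    return 1 - math.exp(-years / mean_interval)


for interval in YEARS_PER_INFECTION:
    mean_gap = years_per_pandemic(interval)
    print(f"Infection every {interval} years: about one pandemic per "
          f"{mean_gap:,.0f} years; chance within a decade "
          f"{p_at_least_one(10, mean_gap):.2%}")
```

Under these assumptions the average interval works out to roughly 750 to 2,125 years, which is low but, as the report put it, not zero.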

Most notably, though, the Gryphon report questions whether it’s even possible to produce an accurate estimate of absolute risk. Data on the frequency of human error in the lab is sparse. And, as two of the report’s authors would write in 2017, “the United States has no standardized or comprehensive system for tracking laboratory incidents or near misses in high-containment laboratories,” making it hard to gauge how often such incidents occur. Meanwhile, security breaches — such as someone intentionally letting a pathogen loose — pose unknown, hard-to-quantify risks.

“In the very short amount of paper we spend on talking about absolute risk — because we, you know, had to — we say why it’s a bad idea,” said Rocco Casagrande, an author of the report and the co-founder of Gryphon Scientific, a research and consulting firm that often works for government agencies. In subsequent public writing, Casagrande has argued for more rigorous research into the sources and consequences of laboratory accidents, in order to give policymakers a clearer sense of the possible risks, and to improve safety in facilities.

Klotz, whose estimates have drawn on data obtained through FOIA requests as well as other laboratory accident reports, told Undark he disagrees with Casagrande that there is too little data to produce a specific risk estimate. One recent analysis he conducted found that, in a given five-year period, labs like Fouchier’s have a 2.37 percent chance of seeding a global pandemic.

Such odds, Klotz says, are far too high. He opposes nearly all research on enhanced potential pandemic pathogens. “It doesn’t have to be a very high probability,” he said, “that you would start being afraid of it.”

Formal risk analysis has its roots in nuclear research — where, as with pandemic pathogens, a safety lapse at one facility can have global consequences. By the 1960s, officials at the U.S. Atomic Energy Commission were seeking to quantify the risk of nuclear power accidents. They developed techniques for estimating the odds of an accident and the expected number of lives lost.

Not everyone was convinced by this exercise. Critics of the emerging discipline sometimes described it as a kind of political strategy, aimed at using authoritative-sounding expert pronouncements to quell public debate. Some critics also questioned whether the figures were all that reliable. “Actual risk estimates are always very rough and imprecise,” wrote the philosopher Kristin Shrader-Frechette in a 1991 book on risk. And, she added, “some of the most important aspects of hazards, whether real or perceived, are not amenable to quantification.” That includes unknowns that people may not even think to factor into their analysis.

When, in 2014, the federal government began reevaluating funding for gain-of-function research, at least one adviser raised concerns. Baruch Fischhoff, a psychologist and risk analysis expert at Carnegie Mellon University, served on an advisory panel to the National Science Advisory Board for Biosecurity, which was tasked with providing recommendations on the evaluation process.

Fischhoff, a past president of the Society for Risk Analysis, said the tools can be useful — but, he stresses, they have limits.

“Nobody understands these systems in toto,” he said. Corporations and government regulators may feel pressure to find and use a specific number for the risk — and can often find well-meaning contractors able to fill that need. “I think the whole system has kind of spun out of control,” he added. “Things are impenetrable to members of the general public, largely impenetrable to other technical experts.”

Fischhoff had specific concerns about analyzing pathogen research. “I was really skeptical that you could do formal risk analysis, partly because we don’t have the numbers,” he told Undark. Fischhoff said the Gryphon Scientific team did a “conscientious job” on the report, even as he expressed some reservations about its implications. “It looked authoritative,” he said. “But there was no sense of just how much — you know, what you should do with those numbers, and like most risk analyses, it’s essentially unauditable.” The reason it was difficult to assess, he added, was that the analyses were so complex and the report so long.

Casagrande said he agreed with some of the concerns about calculating absolute risk, although he stressed that the report had helped clarify the risks of gain-of-function research relative to other pathogen research. (For example, he noted, the report finds that experiments altering the transmissibility of coronaviruses could be risky.)

But he said the team’s “desire to show all our work undermined us, in that the report was just too complex.” Today, he said he wonders whether a two-page report that highlighted specific risks and benefits would have been better. “I think a lot of people would have maybe hung on to that a bit more. But unfortunately, what we did is we wrote the Bible, right? And so you could basically just take any allegory you want out of it to make your case.”

A year after the report was published, the National Institutes of Health decided to restart funding of gain-of-function research, with new oversight procedures. To some observers, it seemed that little had changed.

“In the end, the moratorium was lifted, and the process that they ended with was not all that different from the starting point,” said Rozell, the “Dangerous Science” author. “So one might wonder if this is one of those examples of the fig leaf: We were going to do this anyway, and here’s our cover.”

Today, it’s difficult to know exactly how risk analyses are used to evaluate gain-of-function research. The current federal framework for evaluating such work, released in 2017, instructs U.S. Department of Health and Human Services officials to review a risk/benefit analysis of proposed research before recommending whether to fund it. (The report uses the term “research involving enhanced potential pandemic pathogens” rather than gain-of-function research.) But, despite recommendations from the National Science Advisory Board for Biosecurity and the Obama White House, those deliberations are conducted out of the public view. (An NIH spokesperson referred questions about the gain-of-function review process to HHS; HHS did not respond to requests for comment from Undark.)

In February 2022, federal officials announced a review of this process. The new recommendations are expected later this year.

Within the biosafety research community, there has recently been a push to develop better data on how — and how often — accidents occur in laboratories.

Casagrande’s team is currently conducting human reliability studies, with funding from Open Philanthropy, a Silicon Valley-linked foundation. “We’re basically using real clinical settings and simulator settings and watching people make mistakes of various kinds,” he said. “Some of them involve, you know, just: How clumsy are people? How often do we spill crap?” The team is also measuring how often people actually follow protocols, like washing their hands before leaving the lab.

At least in principle, that work may one day help formulate more evidence-based biosafety policies and give a better sense of how often human errors occur. But as Casagrande points out, even then there are elements of human behavior that are harder to pin down. “You can’t necessarily catch the really big oopsies, like, being told that you can’t work, you can’t travel, because you might have been exposed to a pathogen, and you just ignore it and do it anyway,” said Casagrande. “But you can catch the ‘Oh, I was supposed to wear a lab coat, and I didn’t.’”


Michael Schulson is a contributing editor for Undark. His work has also been published by Aeon, NPR, Pacific Standard, Scientific American, Slate, and Wired, among other publications.