CDC reporting of Omicron numbers

After initially estimating that almost three-quarters of U.S. COVID cases between Dec. 11 and Dec. 18 were attributable to the omicron variant, the CDC revised that number on Dec. 28 to just 23%. Since then, they have revised it once more, to 38%. Except they didn’t: none of those estimates was as precise as we just made it sound. For example, the current estimate is somewhere between 31.4% and 44.7%.

Polls published during the pandemic suggest that trust in the CDC is faltering among both health professionals and the general public. These and other polls don’t get into the reasons for mistrust, but experts generally agree that trustworthiness depends on the answers to five questions. Do I have reason to think this person or institution is competent? Reliable? Sincere? Benevolent? Principled?

Events over the past two years have caused many people to question each of these judgments. Massive revisions to estimates, such as those cited above, call the CDC’s reliability, specifically, into question. But this particular mishap was preventable with better reporting.

The core narrative, that the omicron variant spread quickly and is now dominant, has proven true. However, especially in light of the sheer volume of misinformation around COVID, getting the numbers wrong further erodes public trust in the one institution that both the public and the media should be able to depend on. This is where accuracy matters most.

There are two easy things the CDC, and every news outlet that picked up the initial 73% number, could and should have done differently. Both of them increase transparency about uncertainty, which is a key part of all emerging science. We’re part of a team of social scientists and journalists who work closely together to encourage better reporting about statistics. Specifically, we’re trying to figure out how to help people without statistical expertise draw more accurate conclusions. Here’s what we’ve learned, and how it could have improved this situation.

First, all visualizations and news reports about omicron’s prevalence should have included confidence intervals. Statistics is, essentially, the science of estimating something about a population by looking carefully at a randomly selected subset of that population (known as a “sample”). Even if the sample is perfectly representative, there will always be some uncertainty. That’s just how probability works, and a confidence interval is a way of representing this kind of uncertainty. It tells us the range where the true value is likely to fall. Although confidence intervals are often described in terms of a “margin of error” (the distance from the estimate to either end of the range), they don’t mean someone has made a mistake. They simply help us see the variability inherent in all estimates.
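To make the idea concrete, here is a minimal sketch of one common way to put a range around an estimated percentage, the Wilson score interval. It is not the CDC’s Nowcast method, which relies on modeling rather than a simple count, and the sample sizes below are hypothetical numbers chosen only for illustration. What it shows is that the same observed share of roughly 73% comes with a much wider range when it rests on a small sample.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% Wilson score interval for a proportion.

    successes: how many sampled cases had the trait of interest
    n:         total number of sampled cases
    z:         normal critical value (1.96 corresponds to roughly 95%)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)
    )
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical samples: the same observed share, three different sample sizes.
for successes, n in [(11, 15), (110, 150), (1100, 1500)]:
    low, high = wilson_interval(successes, n)
    print(f"n={n}: estimate {successes / n:.1%}, interval {low:.1%} to {high:.1%}")
```

Running the loop prints one line per sample size. The estimate stays the same while the interval narrows as the sample grows, which is why a single headline number without its range can be misleading.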

The CDC’s prediction page includes this information in a sidebar, but their visuals don’t show it. The 73% figure made headlines in part because it was so shockingly high, but the original estimate was between 34% and 99%. That’s a truly enormous range, and one that should have been included in the CDC’s own visualization and in every single news report. After all, revising from a range that includes 34% down to 23% seems much less dramatic than revising from 73% down to 23%. At least we’re seeing some improvement in the reporting: the NPR story on the revision notes that the interval remains large.

Second, all reporting should have focused on the limitations of the methods. The CDC’s variant information page notes in small text that “data [from the last few weeks] include Nowcast estimates, which are modeled projections that may differ from weighted estimates generated at later dates.” But the news coverage didn’t include such caveats, for the most part. And even the later weighted estimates rely on local public health reporting, which varies by jurisdiction, as well as statistical procedures for imputing missing data.

The role of news is to keep the public informed, and that means being as clear about what we don’t know as about what we do. In terms of public health, it should also mean working with the CDC to ensure that everyone is adhering to best practice in statistical communication. All of this uncertainty is part of how science works. Leaving it out isn’t just simplifying the story; it’s actively misleading.

Jena Barchas-Lichtenstein, Ph.D., is a linguistic anthropologist who leads the media research at Knology. They are co-PI of Meaningful Math, a four-year NSF-funded collaboration with PBS NewsHour to improve statistical reporting.

John Voiklis, Ph.D., leads research on behaviors, norms, and processes at Knology. He trained in social and cognitive psychology and has taught statistics for researchers and research methods for data scientists.