Facts are Pieces of a Puzzle, not the Puzzle Itself

Informative isn't just the facts, but the facts in their context

Almost every day, there is a new study, a new development related to the pandemic. We want to understand what the studies mean, and we want to be informed of the developments. But studies and data make sense only when we figure out where they fit: they are pieces of a puzzle that make sense when we find their correct place. This is not always easy, especially in the context of a pandemic that challenges us.

Take this new study about an outbreak in a Kentucky nursing home, including among vaccinated residents. Here are lots of words from the abstract:

The Kentucky Department for Public Health (KDPH) and a local health department investigated a COVID-19 outbreak in a SNF that occurred after all residents and health care personnel (HCP) had been offered vaccination. Among 83 residents and 116 HCP, 75 (90.4%) and 61 (52.6%), respectively, received 2 vaccine doses. Twenty-six residents and 20 HCP received positive test results for SARS-CoV-2, the virus that causes COVID-19, including 18 residents and four HCP who had received their second vaccine dose >14 days before the outbreak began. An R.1 lineage variant was detected with whole genome sequencing (WGS). Although the R.1 variant has multiple spike protein mutations, vaccinated residents and HCP were 87% less likely to have symptomatic COVID-19 compared with those who were unvaccinated. Vaccination of SNF populations, including HCP, is critical to reduce the risk for SARS-CoV-2 introduction, transmission, and severe outcomes in SNFs. An ongoing focus on infection prevention and control practices is also essential.

What are we looking at here? A nursing home outbreak, 46 infections, three deaths, a variant with concerning mutations.

Here’s one way to headline an article about the study:

The article describes the outbreak:

An unvaccinated health care worker set off a Covid-19 outbreak at a nursing home in Kentucky where the vast majority of residents had been vaccinated, leading to dozens of infections, including 22 cases among residents and employees who were already fully vaccinated, a new study reported Wednesday.

Most of those who were infected with the coronavirus despite being vaccinated did not develop symptoms or require hospitalization, but one vaccinated individual, who was a resident of the nursing home, died, according to the study released by the Centers for Disease Control and Prevention.

Altogether, 26 facility residents were infected, including 18 who had been vaccinated, and 20 health care personnel were infected, including four who had been vaccinated. Two unvaccinated residents also died.

The article isn’t inaccurate. It relays what indeed happened. The headline is descriptive. The article states up top that most of the infected did not develop symptoms or require hospitalization, while noting the one death. It highlights the importance of vaccinating nursing home staff (an unvaccinated staff member is how the virus came into the facility), and explains that this was a variant that shared a key mutation, E484K, with variants that were suspected of partial immune escape, like B.1.351 (South Africa) and P.1 (Brazil). I’m not picking on the article at all; this is usually how such studies are represented in responsible outlets: the descriptive facts, in order. This is our accepted practice.

The CDC study also notes an efficacy calculation: “Vaccine was 86.5% protective against symptomatic illness among residents and 87.1% protective among HCP.” I saw multiple attempts on social media to compare this number to the one efficacy number from the trials, usually around 95%:

Two numbers, right, both efficacy? Why not try to glean more information by comparing them?

So why am I writing all this?

Because just the facts, in order, isn’t always informative in the way we need it to be.

Let’s step back and start with this question: why did they do this investigation, including sequencing the virus? The answer seems obvious: they did this because it’s a case of vaccine breakthrough in a nursing home, and nursing homes have suffered the brunt of deaths.

But we can’t stop there: we need to think about what the why of the study means for our interpretation of the facts it’s presenting to us.

Nursing homes are uniquely vulnerable for three key reasons. They house the elderly, whose immune systems are weaker. They house people together, which allows large outbreaks to happen and also creates conditions for repeated exposure to an infected person; that is different from other potentially crowded places like grocery stores or restaurants, where one may get exposed, but just once. Plus, nursing home residents are often already in poorer health than their same-age peers. People usually move into nursing homes because they are already in need of medical care (hence the term “skilled nursing facilities”), and the average life expectancy after moving is about a year. Here are the statistics:

The average age of participants when they moved to a nursing home was about 83. The average length of stay before death was 13.7 months, while the median was five months. Fifty-three percent of nursing home residents in the study died within six months.

To put this into context, the average 83-year-old has a life expectancy of seven to eight more years. ,̶ ̶w̶h̶i̶c̶h̶ ̶m̶e̶a̶n̶s̶ ̶h̶a̶l̶f̶ ̶o̶f̶ ̶t̶h̶e̶m̶ ̶l̶i̶v̶e̶ ̶e̶v̶e̶n̶ ̶l̶o̶n̶g̶e̶r̶. (Author correction thanks to Sean in the comments: those were means, not medians, and life expectancy is skewed, so we cannot speak to half of them living longer. The point remains that this population is in poorer health than their same-age peers.)

There is another crucial piece of context here: the fact that it’s a cluster. Remember how, when discussing the potential vaccine breakthrough cases in Israel, I pointed out that if the 8 cases they found were in clusters—meaning something like two families of four people—that was very different from 8 independent cases. For many statistical analyses, we assume the examples we are sampling are independent from each other, that they all give us separate pieces of information. If the samples are linked—from the same family or the same nursing home—that’s not the case. In this particular example, this matters even more because everyone there was living in a nursing home experiencing an outbreak, so it’s a cluster. 

A cluster differs greatly from what we measured in the trials, where the participants did not live together or share exposure, especially because we know this pathogen is very overdispersed. It oscillates between being aggressively contagious (probably a combination of a person who emits a lot of aerosols and is at the most contagious stage of their infection plus an enclosed space, or repeated exposure in a congregate living facility like this one) and not transmitting onward at all. Various studies find that 80 to 90 percent of people never transmit onward: they are the end of the chain.

Hence, if your exposure takes place while you are a member of a potential cluster, your odds of being infected are much greater than with exposure that doesn’t occur as part of a cluster. For a pathogen like this, finding transmission events, not infected people, is key because transmission events are near each other. If you find one, you are likely to find more. But that also means that being in a cluster is a worst-case scenario compared with the independent measurements from the trials: one would expect higher attack rates. In fact, this is very useful information for mitigation: focusing on finding such clusters and “backward-tracing” them to find the source, and then looking at other people who might have been exposed within that cluster, rather than trying to trace every infected person’s onward contacts (most of which were going to be dead ends anyway), was key to Japan’s comparatively very successful strategy (something I wrote about while explaining overdispersion and its implications).

When you put this together, what is the information you get from the above study?

You get very, very good, reassuring news about the vaccines.

Here we have as terrible a situation as it gets: an elderly population, congregate living and a variant with concerning mutations. And still, the vaccines were spectacularly protective. Only 6.3% of the vaccinated population in the nursing home developed symptomatic illness compared with 32.3% of those unvaccinated. There was one death among the 127 vaccinated people compared with 2 among the 62 unvaccinated (note how much smaller the denominator is on the unvaccinated side).
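To make the denominator point concrete, here is a minimal sketch of the arithmetic in Python, using only the counts and percentages quoted above. The pooled "crude reduction" at the end lumps residents and staff together, so it is a back-of-the-envelope figure and need not match the per-group efficacy numbers the study reports.

```python
# Counts as quoted in the text: 127 vaccinated and 62 unvaccinated people,
# with 1 and 2 deaths respectively.
vaccinated_n, unvaccinated_n = 127, 62
vaccinated_deaths, unvaccinated_deaths = 1, 2

death_rate_vax = vaccinated_deaths / vaccinated_n
death_rate_unvax = unvaccinated_deaths / unvaccinated_n

# Symptomatic attack rates as quoted in the text (6.3% vs. 32.3%).
symptomatic_vax = 0.063
symptomatic_unvax = 0.323

# Crude relative reduction = 1 - (rate in vaccinated / rate in unvaccinated).
# Assumption: pooling residents and staff; the study reports them separately.
crude_reduction = 1 - symptomatic_vax / symptomatic_unvax

print(f"Death rate, vaccinated:   {death_rate_vax:.2%}")
print(f"Death rate, unvaccinated: {death_rate_unvax:.2%}")
print(f"Crude pooled reduction in symptomatic illness: {crude_reduction:.1%}")
```

Comparing rates rather than raw counts is the whole point: one death versus two sounds close, but the groups they come from are very different in size.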

That death of the vaccinated resident is, of course, tragic for that person and his loved ones, as are the deaths of the two unvaccinated residents. The report notes that this resident also had a re-infection:

One resident was infected 300 days earlier and had nine consecutive negative RT-PCR tests before reinfection, including two within 30 days of the outbreak. This resident was hospitalized and died.

We already know from the trials that there will be some vaccine breakthroughs, but we also expect even those breakthroughs to be milder compared with cases among the unvaccinated. However, as we age, our immune systems do not work as well. Vaccines produce weaker responses among the elderly (and you can see this in the trial data as well). This is an unfortunate reality, and having both a re-infection and vaccine breakthrough suggests that this person, sadly, did not have a robust immune response either to infection or vaccination. This is not always understood, but common colds can cause deadly outbreaks among nursing home residents exactly for this reason: things that most of us brush off without concern are not always the same for them. (This is also why this pathogen is a particular concern to the immunosuppressed.)

Finally, when we sample, we also calculate a confidence interval. Here’s the intuition: if you have a tiny sample and/or results all over the place, you are less sure of how representative it is compared with the broad population. So we present a range, not just a point estimate.
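As a rough illustration of that intuition, here is a small Python sketch that computes a Wilson score interval for a simple proportion at three sample sizes. The sample counts are hypothetical and chosen only to hold the observed rate at 25%. The study's intervals are for vaccine efficacy and are computed differently, so this only shows how the range narrows as the sample grows.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Hypothetical samples: the same 25% observed rate at three sample sizes.
for successes, n in [(2, 8), (20, 80), (200, 800)]:
    low, high = wilson_interval(successes, n)
    print(f"{successes}/{n}: {low:.1%} to {high:.1%} (width {high - low:.1%})")
```

The point estimate is identical in all three rows; only our certainty about it changes.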

Let’s check the confidence intervals here (VE is vaccine efficacy):

VE against symptomatic COVID-19 was 86.5% (95% CI = 65.6%–94.7%) among residents and 87.1% (95% CI = 46.4%–96.9%) among HCP. VE against hospitalization was 94.4% (95% CI = 73.9%–98.8%) among residents; no HCP were hospitalized. Three residents died, two of whom were unvaccinated (VE = 94.4%; 95% CI = 44.6%–99.4%).

The confidence interval (at 95%, meaning the range where we would expect the value to fall 95% of the time) for efficacy against symptomatic COVID among residents is 65.6% to 94.7%. For hospitalization it is 73.9%–98.8%.

For reference here’s the Pfizer trial data (mostly against “wildtype” or the original strain):

Note the efficacy against symptomatic disease among those 55 and up is 93.8%, and the confidence interval ranges from 80.9% to 98.8%. That is comparable, with a lot of overlap, to what we are seeing here, even though that’s still a much younger group than nursing home residents, whose average age is 83 (besides not being a cluster).
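For readers who want to see the overlap spelled out, here is a tiny Python sketch using the interval endpoints quoted above. Checking whether two intervals overlap is an informal sanity check, not a formal statistical test, and the function name `overlap` is just an illustrative helper.

```python
def overlap(a, b):
    """Return the shared range of two (low, high) intervals, or None."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None

# Endpoints as quoted in the text, in percent.
outbreak_residents = (65.6, 94.7)   # symptomatic illness, nursing home residents
trial_55_and_up = (80.9, 98.8)      # symptomatic illness, trial, ages 55 and up

shared = overlap(outbreak_residents, trial_55_and_up)
print("Shared range:", shared)

# Each point estimate also sits inside the other study's interval.
print(outbreak_residents[0] <= 93.8 <= outbreak_residents[1])
print(trial_55_and_up[0] <= 86.5 <= trial_55_and_up[1])
```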

This was a lot of words to say: this was very, very encouraging news about the vaccines. Here was a worst-case scenario: elderly, congregate living, a variant with mutations associated with partial immune escape. And still, the vaccine was protective at a level that stood up to the trials, which represented a much better scenario. If this vaccine was going to fail to be as protective as in the trials, especially against immune escape variants, this was the stress test that could give us an indicator.

So we go back to the original motivation for this post. Facts need the right context so that they can turn into information that’s more useful to us. If I were headlining this, I’d emphasize how, even in a worst-case scenario, this vaccine performed very, very well. Something more akin to this:

Going forward, it’s going to be important to examine such vaccine breakthroughs: failures and edge cases can be greatly illuminating. They should, of course, be reported on. People want to understand the details. But facts don’t just float around by themselves; they are pieces of a puzzle. The better we are at understanding that puzzle and putting the pieces in their correct location, the closer we can be to seeing the broader picture.