The Cuomo administration’s recently released summary of contact-tracing data was a tantalizing disappointment. Information that could have clarified the risks of different activities during the coronavirus pandemic was presented in such a limited way, with so little context or detail, that it shed little light – and may have added to public confusion.
As part of his Dec. 11 briefing, Gov. Cuomo shared a table listing the percentages of 46,000 COVID-19 cases that contact tracers had linked to various exposure sources in September, October and November. It was the first time the state had given even a summary of numbers gleaned from its contact-tracing program.
Topping the list was a category called “household/social gatherings,” which was said to account for 73.84 percent of the traced cases. “Healthcare delivery” was second at 7.81 percent, followed by “higher education student” at 2.02 percent, “restaurants & bars” at 1.43 percent and another 26 groupings with lower percentages.
Cuomo cited the high number for “household/social gatherings” as bolstering his policy against in-home parties of more than 10. “The troubling information in this is 74 percent of the new cases are coming from household gatherings, living room spread,” he said.
But the category in question seemed to encompass a range of possibilities, including transmission within a household (say, from a husband to a wife, or a child to a parent) as well as get-togethers involving outside guests. Lumping these two common scenarios together made it difficult to judge the risk of either exposure source on its own.
Cuomo might be correct that large in-home gatherings should be avoided, but his statistic was not as persuasive as he seemed to suggest.
This was one of many shortcomings in how the data were analyzed and communicated – most of which could only be resolved by sharing more data, context and methodology.
Here are some of the questions that need answers:
What cases could not be traced and why? The state was able to contact-trace only about one-fifth of the 220,000 infections recorded from September through November – meaning the largest category by far would be “unknown.” Presumably some people could not be located while others refused to cooperate. It’s likely that a third group was unable to say how they got sick. These numbers should be spelled out so that the limitations of the data are clear.
How are the categories defined? Some of the categories reference activities or settings, such as “healthcare delivery” and “gyms,” while others reference occupations, such as “education employee” and “professional services.” Education is broken into five categories while health care is combined into one. The groups appear to have blurry and overlapping boundaries, making the percentages in the table hard to interpret.
What happened to the missing 3.4 percent? The percentages for all listed categories add up to 96.6 percent. Officials should account for the other 3.4 percent – or explain why they cannot.
How were individual cases assigned to categories? Some infected people would fall into multiple groupings – such as a high school student who commutes by public transit and works part-time in a restaurant. Since the percentages for all groups total less than 100, each case was apparently allocated to a single category. The details of this allocation process were not explained.
How confidently could exposure sources be identified? Many if not most people who contract COVID-19 are uncertain how they came to be infected. They might also give misleading or incomplete answers to a contact tracer. This uncertainty should be gauged and disclosed – along with what efforts, if any, were made to verify the exposure sources through clinical testing.
Why aren’t nursing homes a separate category? Nursing home residents have suffered some of the highest mortality rates of any group, and infections and deaths in these facilities have persisted in spite of tight restrictions. The state should provide separate data on exposures traced to nursing homes – or explain why not.
How many cases were traced to weddings and funerals? Large weddings and funerals in defiance of state-imposed restrictions have been a focus of media coverage and public rebukes from the governor. It would be useful to know how many infections were traced to these events. As it is, it’s unclear whether they would fall under “household/social gatherings,” “religious activities” or some other grouping.
How do the categories compare in size and testing levels? The percentage of cases traced to a particular group would depend in part on the scale of the population. Since many more people work in retail than media production, you would expect retail to account for a higher percentage of traced cases even if its exposure risk were relatively low. Another relevant factor is the heavy testing being conducted among health-care workers and on certain college campuses. It seems that groups being tested regularly would tend to be overrepresented in the contact-tracing data – and any findings would need to be adjusted accordingly.
Are traced cases a representative sample? The 46,000 traced cases give state officials a lot of information to work with, but it’s important to know how the demographics of the data set compare to those of the state as a whole. If an ethnic group or geographic region is overrepresented or underrepresented, that might skew the numbers in misleading ways.
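To put the first of these questions in numbers, here is a rough back-of-the-envelope sketch in Python. It uses only the rounded figures cited above (46,000 traced cases, 220,000 recorded infections and the four largest percentages from the governor's table) and re-expresses each category as a share of all recorded infections rather than of traced cases alone. The variable names are illustrative, and the calculation makes no assumption about where the untraced cases came from; it simply leaves them as unknown.

```python
# Figures as reported above (rounded in the source)
total_infections = 220_000   # recorded September through November
traced_cases = 46_000        # cases covered by the governor's table

# Percent of *traced* cases, as listed in the table
traced_share = {
    "household/social gatherings": 73.84,
    "healthcare delivery": 7.81,
    "higher education student": 2.02,
    "restaurants & bars": 1.43,
}

# Fraction of recorded infections that were traced at all
traced_fraction = traced_cases / total_infections
unknown_pct = 100 * (1 - traced_fraction)

# Re-express each category as a percent of all recorded infections
share_of_all = {
    name: pct * traced_fraction for name, pct in traced_share.items()
}

print(f"traced: {100 * traced_fraction:.1f}%  unknown: {unknown_pct:.1f}%")
for name, pct in share_of_all.items():
    print(f"{name}: {pct:.1f}% of all recorded infections")
```

On these figures, roughly 21 percent of recorded infections were traced and about 79 percent were not. The "household/social gatherings" category, at 73.84 percent of traced cases, corresponds to only about 15 percent of all recorded infections. That does not mean the true share is 15 percent; it means the table cannot say what the true share is.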
All of these questions and more should have been answered – or at least addressed – in any official discussion of contact-tracing data. The fact that they weren’t undermines the credibility of the governor’s message, and raises doubts about the quality of the analysis behind his policy making.
The lack of context also sends confusing messages. Some restaurant and bar operators saw their relatively low share of traced cases, at 1.43 percent, as evidence that restrictions on their businesses could be safely lifted. A more careful presentation of the data might have noted that people typically spend less time in restaurants or bars than they do at home or at work, and that it might be harder to track down people exposed there than, say, in a school or workplace. (The state could also have pointed to studies from other sources that found especially high risk of spread in restaurants and bars.)
A straightforward way for the state to avoid confusion about contact tracing is to be more transparent.
The state should release a white paper explaining how the data were gathered and analyzed, filling in details that were left out of the governor’s presentation.
When exposures happen in public places, the state should promptly post the times, dates and places, as some county health departments already do.
The state should also share as much as possible of its raw data with the public, so that outside experts and average citizens can analyze it for themselves.
Contact-tracing data are messy and complicated to interpret, but they are also a potentially valuable source of intelligence about controlling this pandemic and preventing future ones. The more people combing the numbers for clues, the better.