What would a recurrent surge of infections look like?
As fifty states with varying intensity of public health approaches to decrease the impact of this highly contagious disease begin to loosen their restrictions, how will we be able to recognize the very real threat of a “second-peak” surge of infections?
Most real experts agree that aggressive testing, new-case finding, and tracking of contacts (backwards and forwards) will be important, indeed essential. Small local micro-outbreaks need to be identified quickly and dealt with aggressively. This is going to be a challenge for a number of reasons!
We have become accustomed to seeing a variety of graphs and tables in our public media used to show the status of the epidemic and the hoped-for success in dealing with it. Such macro-presentations will continue, including by me! The problem is that by the time the significance of a given graph becomes evident, the horse may already be out of the barn and running. Nonetheless, what we can’t count, we can’t control. One important metric thought to justify a loosening of restrictions is a sustained two-week decrease in the number of new cases in a given locality. What would this look like in the different possible data visualizations?
Various reports and models use different valid analytical approaches, but care is needed not to unintentionally misinterpret the results. Such graphic representations are used not just to see how things are today, but to predict where we will be in the future. For example, when we visualize positive cases, we can look at daily counts or cumulative counts. Because daily counts vary widely depending on the timeliness of reports and weekend interruptions, using weekly averages is common. When we try to predict where the graphs are going, do we start with the first case, or begin with the 100th to allow matters to settle down? To deal with the difficulty of comparing small numbers that rise exponentially to big ones at the same time, it is common to use logarithmic scales in graphs. What does a “plateau” of cases look like in such different visualizations? Based on current status, how can we tell if things are really getting better, or how much worse they might be?
I prepared the following data visualizations to educate myself about what a plateau in the counts of cases, deaths, or tests would look like. What would a 14-day decrease look like? I invite the reader or viewer to step through the seven different graphic representations of the number of current Covid-19 cases in Kentucky and two different futures. I hope the annotated figures are self-explanatory. In these hypothetical scenarios, I used the actual Kentucky case counts from the first reported case through May 4th. I then assumed that the number of new cases would plateau at 170 per day for the next 14 days, and that thereafter the number of new cases daily would decrease by 10 each day until there were no new ones. I was not 100 percent certain in advance what they would look like! I hope you find them useful too. The fully interactive versions can be accessed on the Institute’s Tableau Public website.
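For fellow data nerds, here is a minimal sketch in Python of how such a hypothetical series could be assembled. The function name `build_scenario` and its parameters are illustrative choices, not the tool actually used for the figures, and the `observed` values are placeholders, not the real Kentucky counts.

```python
def build_scenario(observed, plateau=170, plateau_days=14, daily_drop=10):
    """Append a hypothetical plateau and a linear decline to observed daily counts."""
    series = list(observed)
    # Plateau: the same number of new cases every day.
    series.extend([plateau] * plateau_days)
    # Decline: a fixed number fewer cases each day until none remain.
    value = plateau - daily_drop
    while value > 0:
        series.append(value)
        value -= daily_drop
    series.append(0)
    return series


# Placeholder values for illustration only (NOT the real Kentucky data).
observed = [1, 3, 8, 20, 45, 90, 150]
scenario = build_scenario(observed)
```

With these defaults, the decline from 170 to zero in steps of 10 takes 17 days, so the hypothetical future spans 31 days in all.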
Warning: The rest of this article gets quite technical. I am asking for advice from other data nerds about how to monitor the nation’s easing up on its social distancing. Even if you do not look at the 7 figures in detail, at least notice how the same data can be looked at in different ways and the nature of the numbers we have to work with. Trends will be magnified or minimized by the choice of axes or other data transformations!
Old fashioned bar chart.
Below is a basic bar graph of daily new cases of Covid-19. Easy to understand. In this and in the plots that follow, I used real Kentucky case data up through May 4. For the next 14 days I hypothesized a constant 170 daily new cases. After that, I assumed that there were 10 fewer cases each day until there were none. [Click on the figures to see full size.]
Linear vs. logarithmic scales.
The next plot of new daily cases illustrates the problem when the daily count, for a variety of reasons, varies greatly. Hard to tell what direction things are going! For want of a better term, I will refer to this kind of plot as having a linear Y-axis, as opposed to a semi-log plot. This kind of plot can be amenable to linear regression modeling approaches if the data distribution is not too wild.
When there is a large range of rapidly increasing numbers, as in the initial stages of an epidemic, it is useful to use a semi-log plot in which the vertical Y-axis depicts the logarithm of the number of cases. This scale contracts at the higher numbers. [Don’t feel bad if you don’t remember your high school advanced algebra. I didn’t either.] Governor Beshear has used semi-log plots in his daily presentations. What I wanted to see was what the exponential curve of such a plot would look like if the number of new cases plateaued at a constant number, or actually decreased. This figure tells me what to look for.
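To make the behavior of a logarithmic axis concrete, here is a small Python sketch that computes the day-to-day step on a log10 scale for three made-up series: a plateau at 170, a linear decline of 10 per day, and steady doubling. The variable names and the doubling series are illustrative assumptions, not data from the figures.

```python
import math

plateau = [170] * 5                          # constant daily count
decline = list(range(160, 0, -10))           # 160, 150, ..., 10 (zero cannot be shown on a log axis)
doubling = [10 * 2 ** k for k in range(6)]   # 10, 20, 40, ... (hypothetical early growth)


def log_steps(values):
    """Day-to-day change in log10(value), i.e. the slope seen on a semi-log plot."""
    logs = [math.log10(v) for v in values]
    return [b - a for a, b in zip(logs, logs[1:])]


plateau_steps = log_steps(plateau)     # every step is zero: a horizontal line
decline_steps = log_steps(decline)     # steps grow more negative: a curve bending downward
doubling_steps = log_steps(doubling)   # every step equals log10(2): a straight rising line
```

In other words, on a semi-log plot a plateau is flat, constant-percentage growth is a straight line, and a steady drop of 10 cases per day looks like a curve that steepens as the counts get small.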
Cumulative case data.
We see a lot of curves of the cumulative number of cases like this one below. In the early stages of exponential (like doubling) growth, such curves can look like they are going straight up. I am hard pressed to find an inflection point in this curve until there is almost no growth at all, even though the numbers have stabilized or are improving.
When the same cumulative case numbers as above are plotted in a semi-log manner, we see the resulting curve approaching a horizontal line parallel to the X-axis. By inspection, however, I am unable to say when the favorable change has occurred.
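One reason the turning point is hard to spot is that a cumulative curve is simply the running sum of the daily counts, so it keeps rising as long as there are any new cases at all. A short Python sketch, using an invented daily series rather than the Kentucky data, shows the relationship in both directions.

```python
from itertools import accumulate

# Invented daily counts: early growth, a 14-day plateau at 170, then a decline of 10 per day.
daily = [10, 20, 40, 80, 160] + [170] * 14 + list(range(160, -1, -10))

# Cumulative cases are the running total of the daily counts.
cumulative = list(accumulate(daily))

# The daily counts can be recovered as day-to-day differences of the cumulative curve.
recovered = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
```

During the plateau the cumulative total still climbs by 170 every day, which is why the curve only looks flat once new cases have nearly stopped. Looking at the differences (the daily counts) rather than the totals makes the change visible much sooner.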
One way to deal with widely scattered data is to plot a moving average. In both Kentucky and nationally, it is easy to demonstrate that weekend or other reporting interruptions, or the discovery of clusters of cases (such as in nursing homes), are understandable causes of variation. Plotting a 7-day moving average, in which the number of new cases on a given day is averaged with those of the 6 preceding days, smooths out the curve and allows more confidence in predicting the near future. In such a plot, we observe that when the number of new cases becomes constant, a horizontal line results. However, both this flattening and the later linear decline toward no new cases lag their respective transitions by a few days.
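The lag is easy to see in a small Python sketch of a trailing 7-day average. The function name `trailing_average` is an illustrative choice, the input series is invented, and averaging over fewer days at the very start of the series is an assumption of this sketch rather than a standard.

```python
def trailing_average(values, window=7):
    """Average each day's value with those of the (window - 1) preceding days.

    Assumption: for the first few days, average over however many days exist.
    """
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Invented series: 7 days at 100 new cases, then an abrupt jump to a plateau of 170.
daily = [100] * 7 + [170] * 14
smoothed = trailing_average(daily)
```

Here the daily count jumps to 170 on the eighth day, but the smoothed value that day is only 110, and it does not reach 170 until six days later, when the whole window lies inside the plateau.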
For completeness, below is the same 7-day moving-average plot of new daily cases using a semi-log axis. Certainly the curve in the higher ranges is smoothed out, and the numbers level off toward the horizontal when there is no change in the daily rate. The nature of a semi-log plot is that it yields straight lines only when there is no change at all or when the change is a constant percentage, so the linear decline appears as a curve. Nonetheless, a plateau and a decrease are determinable.
I appreciate those readers who stuck it out to the end of this display. More to the point, I would appreciate any feedback or advice on how to do this better. What mistakes of math or interpretation have I made? Are there standards or recommendations from the epidemiology community that I should know as I attempt to monitor what is happening in states that are opening up their economies widely?
Peter Hasselbacher, MD
Emeritus Professor of Medicine, UofL
May 5, 2020