
About Regional Matters

These posts examine local, regional and national data that matter to the Fifth District economy and our communities.

Regional Matters

April 23, 2020

Forecasting the COVID-19 Pandemic in the Fifth District

Updated Estimates and Forecasts

June 10, 2020: We have updated estimates and forecasts of infection and mortality rates in the United States and in the constituent states of the Fifth District. The current update is based on data up to June 6 and now also includes mortality projections for the latter. We reestimate the model with the recent data so that the projections reflect both the influence of revised model estimates and more data. A full set of graphs for U.S. and Fifth District forecasts is available online.

For the entire United States, we now project median fatalities of 184,000 with a 95 percent range of 175,000 to 193,000 by early September. The corresponding number of total infections is 3.1 million. Even three months out, we expect the United States to have 10,000 new cases and almost 600 deaths daily. Our forecast intervals have shifted slightly upwards, possibly on account of relaxed social distancing measures in many parts of the United States since early May. Otherwise, the model performs exceedingly well in that projections for daily infections and fatalities become more precise and data realizations continue to lie within error bands.

Over the same horizon, we project 3,400 cumulative fatalities in Virginia, 2,800 in North Carolina, 5,500 in Maryland, and 1,200 in South Carolina, with the District of Columbia and West Virginia below the 1,000 mark, although the per capita rate in the District of Columbia is exceedingly high. Although forecast intervals have tightened, the data flow for North and South Carolina has started to fall outside of the error bands from the previous forecast, which led to a considerable revision in the estimates. The pattern of new infections and deaths for these two states suggests the effect of early relaxation of lockdown measures, although the per capita rates are still comparatively low. We are developing a richer model for all states that allows forecasts to adjust to variables including the amount of social distancing. Preliminary results and further discussion are here.

The COVID-19 pandemic has affected the U.S. economy to a degree not seen since the Great Depression. The unemployment rate is likely to rise over the short run to 15 percent or more, while the likely collapse in GDP in the second quarter at an annualized rate of perhaps 25 percent is rarely seen outside of major calamities such as natural disasters and wars.

While the economic impact was and continues to be dramatic, policy sprang into action, ranging at first from stay-at-home orders and lockdowns to a host of financial measures. At the same time, the impact of the pandemic has varied across states, with Washington and California having been hit early in February, while New York more recently has become the epicenter of the crisis in the U.S.

At the time of writing, the localities in the Fifth Federal Reserve District (Maryland, West Virginia, Virginia, North Carolina, South Carolina, and Washington, D.C.) have seen only a mild incidence of infections, relatively speaking. As has become increasingly clear, one of the hallmarks of SARS-CoV-2, the coronavirus causing COVID-19, is a long incubation period. As a consequence, the onset of widespread symptoms in the infected appears delayed by several weeks when community spread occurs. It is therefore imperative for national, state, and local policymakers to develop a clear picture of the spread of the pandemic.

In this article, we provide estimates and forecasts for the evolution of the pandemic in the Fifth District and its constituent states. The forecasts are based on a simple statistical model for the number of infections over the course of the pandemic that has been developed by the authors of this article. The model’s key feature is that it almost exclusively relies on the statistical properties of the observed data so as to provide a picture of the crisis that is as unbiased as possible.

We pursue a different approach from many of the current projections advanced by economic forecasters in that we do not impose the specific relationships that are implied by theoretical models that draw from epidemiology. In doing so, we potentially overlook insights about the specific behavior of an infection that history has taught epidemiology. At the same time, we gain flexibility in modeling the epidemic and avoid the potential pitfalls of imposing strict behavior of the contagion. This seems particularly important since the specific coronavirus underlying this pandemic has novel features that are potentially inconsistent with existing theoretical models.

Our approach is flexible in that the data largely determine the implied dynamics and time paths for the infections. In spirit, this is akin to economic forecasting models, such as autoregressive models, which rely primarily on the past evolution of economic series rather than model assumptions in describing the data’s underlying statistical behavior and thereby extrapolating into the future—that is, forecasting.
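To make the comparison with autoregressive forecasting concrete, the following is a minimal sketch of a first-order autoregression fitted by least squares and then iterated forward. It is not the model used in this article; the function names and the made-up series are assumptions chosen purely for illustration.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1] (illustrative only)."""
    x = series[:-1]          # lagged values
    y = series[1:]           # current values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast_ar1(last_value, c, phi, steps):
    """Extrapolate by feeding each forecast back in as the next lag."""
    path = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        path.append(value)
    return path


# Made-up series generated from y[t] = 2 + 0.5 * y[t-1]; not real data.
series = [10.0]
for _ in range(9):
    series.append(2.0 + 0.5 * series[-1])

c, phi = fit_ar1(series)
forecast = forecast_ar1(series[-1], c, phi, steps=3)
```

The only inputs are past values of the series itself, which is the sense in which such models let the data, rather than theory, shape the forecast.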

Forecasting Model and Data

It is known from epidemiology that a pandemic follows a typical pattern. At first, the number of infections is low since not many people are infected. However, the growth rate rises sharply as each infected person creates a chain of new infections. There comes a point, though, when the virus runs out of susceptible hosts because they are already infected, are immune, or are simply not physically present because of social distancing. At this inflection point, the growth rate of infections falls until it eventually declines to zero.

We allow for these broad patterns in the evolution of an epidemic by imposing a flexible functional form on its path over time. Our statistical model describes the growth of infections as depending on the current and the lagged levels of the number of infections. The relationship between the explanatory variables is described by a flexible mathematical expression, which is designed to mimic the typical time path for the number of new cases and their total number. In contrast to theoretical epidemiological models, our specification has more leeway to go where the data tell it to and is not constrained by precise theoretical relationships that may be incorrect.
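As a rough illustration of growth that depends on the lagged level of infections, the sketch below uses a textbook logistic difference equation. This functional form, the parameter names, and the parameter values are assumptions made for the example; the specification actually estimated in this article is more flexible and is described in the linked documentation.

```python
def simulate_logistic(initial_cases, growth_rate, ceiling, days):
    """Cumulative cases where each day's new cases depend on yesterday's level.

    new_cases[t] = growth_rate * level[t-1] * (1 - level[t-1] / ceiling)
    This is a stand-in functional form, not the authors' specification.
    """
    cumulative = [initial_cases]
    new_cases = []
    for _ in range(days):
        level = cumulative[-1]
        increment = growth_rate * level * (1.0 - level / ceiling)
        new_cases.append(increment)
        cumulative.append(level + increment)
    return cumulative, new_cases


# Hypothetical parameter values chosen only to produce a readable curve.
cumulative, new_cases = simulate_logistic(
    initial_cases=100.0, growth_rate=0.2, ceiling=50_000.0, days=120
)

# The inflection point is the day on which new cases peak.
peak_day = max(range(len(new_cases)), key=new_cases.__getitem__)
```

In this toy version new cases rise, peak when the cumulative count is near half of the ceiling, and then fade toward zero, which is the broad shape described above.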

Our statistical model allows us to study the uncertainty of the forecast in a consistent manner. The precision of a forecast, or how tightly possible alternative forecast paths are concentrated around the most plausible path, is generally affected by two factors: first, the uncertainty of the model estimates in terms of overall fit and parameter estimates, since no statistical model fits precisely; and second, the extent to which the model may be subject to further disturbances or imprecision in data collection in the future. In our baseline forecast, we take both aspects into account to give a sense of how uncertain forecasts in a pandemic truly are, especially when the data flow is sparse at the beginning. Arguably, projections based on a theoretical model often give a sense of false precision.

We fit our model to the current observed infections data, after which the estimated model is used to forecast the evolution. The forecasts are based on forward simulations with all potential sources of uncertainty taken into account. We collect daily data from a variety of publicly available sources. The estimates are performed on these data up to and including April 20, 2020. A detailed description of the source data and the empirical model used can be found at this link.
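The idea of forward simulation with both sources of uncertainty can be sketched as follows. This is a deliberately simplified toy, not the procedure used for the forecasts in this article: the constant growth rule, the normal distributions, and all numerical values are assumptions. Each simulated path draws its own growth parameter (parameter uncertainty) and then receives a fresh random shock every day (future disturbances).

```python
import random


def simulate_endpoints(start_level, growth, growth_se, shock_sd,
                       days, n_paths, seed=0):
    """Simulate cumulative cases forward and return each path's final level."""
    rng = random.Random(seed)
    endpoints = []
    for _ in range(n_paths):
        # Parameter uncertainty: one draw of the growth rate per path.
        path_growth = rng.gauss(growth, growth_se)
        level = start_level
        for _ in range(days):
            # Future disturbances: a new shock to growth every day.
            shock = rng.gauss(0.0, shock_sd)
            # Cumulative counts cannot fall, so new cases are floored at zero.
            level += max(0.0, level * (path_growth + shock))
        endpoints.append(level)
    return endpoints


def percentile(values, pct):
    """Nearest-rank percentile of a list of numbers."""
    ordered = sorted(values)
    return ordered[round(pct / 100.0 * (len(ordered) - 1))]


# Hypothetical inputs, not estimates from the article.
endpoints = simulate_endpoints(1000.0, 0.02, 0.005, 0.01,
                               days=30, n_paths=2000, seed=42)
lower = percentile(endpoints, 2.5)
median = percentile(endpoints, 50)
upper = percentile(endpoints, 97.5)
```

The interval from `lower` to `upper` covers roughly 95 percent of the simulated paths, which is how a forecast band of the kind reported in the charts can be read.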

Benchmarking the Fifth District: Infections in the U.S.

The chart below shows the cumulative number of cases, i.e., infections, in the U.S. and the daily count of new cases as a percentage of the population. The estimates suggest that the U.S. is already past the inflection point, after which the number of new infections falls. The peak infection day seems to have occurred in early April, although the data show some degree of uncertainty, as evidenced by the wide 95 percent confidence bands around the model estimates. Moreover, recent incoming data have been more volatile.

In addition, the rate of new infections appears to decline rather sharply, which indicates that measures to suppress the spread of the pandemic could be working to some degree. We note that the uncertainty region widens immediately for a few months out, which reflects both the uncertainty about the dynamics of the pandemic and the uncertainty inherent in the data process. That is, the longer the pandemic lasts, the more precisely estimated the incidence of new cases becomes as the rate moves toward zero.

Forecast for the United States

Source: Authors' Calculations Using Publicly Available Data

However, there is still a long way to go, as the forecast suggests that the cumulative number of cases does not stabilize for another three months. The cumulative case load as a percentage of population after three months is forecast to reach 0.49 percent with a range of between 0.46 percent and 0.53 percent, which captures 95 percent of all forecasted possible paths.
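To translate a cumulative rate expressed as a percentage of population into a head count, multiply the rate by the population and divide by 100. The short sketch below applies this to the three rates quoted above, using a round hypothetical population of one million rather than an actual population figure; the function name is likewise an assumption.

```python
def cases_from_rate(rate_percent, population):
    """Convert a cumulative case rate in percent of population to a count."""
    return rate_percent / 100.0 * population


# Hypothetical population, used only to show the arithmetic.
hypothetical_population = 1_000_000

median_cases = cases_from_rate(0.49, hypothetical_population)
low_cases = cases_from_rate(0.46, hypothetical_population)
high_cases = cases_from_rate(0.53, hypothetical_population)
```

Under this assumed population, the median rate corresponds to about 4,900 cases, with the 95 percent range running from about 4,600 to about 5,300.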

Forecasting the Pandemic in the Fifth District

The following chart shows new cases and cumulative cases in the Fifth District. Broadly, the pattern is similar to the entire U.S., albeit at a smaller scale, both in absolute and relative terms. As a percentage of population, cumulative case numbers are about half of those for the nation, roughly a quarter of a percentage point by the end of July.

The forecasts suggest that the District had also reached the inflection point by early April, although the uncertainty is considerable given volatile and rising new case numbers over the last week. This is reflected in an extremely wide 95 percent forecast uncertainty interval for the first few projections out of the sample. It cannot be ruled out that the Fifth District has not reached the inflection point yet and that the number of new infections will keep rising. Although the estimated model does not regard this as likely – the median forecast path shows falling infection numbers – our empirical model does not offer a high enough degree of certainty that this is, in fact, the case.

Forecast for the Fifth District

Source: Authors' Calculations Using Publicly Available Data

The Path of the Pandemic in the States of the Fifth District

As we narrow the scale of the exercise, the number of data points available per state as well as the quality of the data declines. The estimates suggest that all localities are beyond the inflection point for new infections, albeit at a considerable degree of uncertainty. This observation has firmed over the course of the last week as more data points have become available.

Forecast for Individual States

Source: Authors' Calculations Using Publicly Available Data

During the second week of April, we could identify two groups in terms of forecast precision. The first group included Virginia, West Virginia, and Washington, D.C. The number of infections in these states did not allow for sufficient precision to conclude that the pandemic would follow the typical trajectory established above. The purely empirical estimates could not rule out the disease quickly burning out or spreading even more aggressively than it has in New York.

In the case of Virginia, the forecasted peak of new infections ranged from early April to mid-May. The second group was composed of North Carolina, South Carolina, and Maryland. These states yielded forecasts that exhibited a pattern much closer to the observed course of the pandemic in other jurisdictions. All three states appeared beyond the inflection point.

With more incoming data, the most extreme initial forecasts have not come to pass, and it appears likely that the coronavirus infections follow typical patterns. The recent data still imply considerable uncertainty, however. For instance, while the estimates suggest that North Carolina has already hit a peak, the volatility of new case numbers in the last few days signals an extremely wide range of uncertainty, and it cannot be ruled out that the estimated peak will shift into the future.

Finally, in per capita terms, the states of the Fifth District show a considerable range of infections three months out, ranging from as low as a 10th of a percentage point in West Virginia to 0.5 percent in Maryland in the middle of the pack and a 1 percent infection rate in more densely populated Washington, D.C.

Measuring the impact of the pandemic in per capita terms, as we have done so far in this article, helps highlight locality-specific aspects such as density of population, a more urban or rural environment, or even the strength of mitigation efforts. This does tend to hide the scale of the pandemic in terms of its human cost. We therefore report projections for overall numbers of infected persons over a six-month and 12-month horizon in Table 1. Overall, Maryland is expected to be the most affected state, with a median projected case number of 47,000, followed by Virginia and North Carolina.

Source: Authors' Calculations Using Publicly Available Data


The worldwide pandemic caused by SARS-CoV-2 has thrown daily life into turmoil for hundreds of millions of people, not to mention the costs in human life. It is paramount for public health and economic policymakers to get control of the spread of the virus. A key component of these efforts is understanding how the infection behaves over time. In this respect, our analysis offers a cautionary tale in that the picture of the evolution of the pandemic changes almost daily as new data come in.

In this article, we attempt to estimate the spread of the infection and project its likely path into the future for the Fifth Federal Reserve District and its constituent states. For this purpose, we develop a simple empirical model that allows us to characterize the uncertainty surrounding such forecasts. We find that there is considerable uncertainty in the near term, driven by the variability and the varying quality of the incoming data. Moreover, we find that forecasts change constantly as new data become available. Consequently, we argue that any forecast should be treated with skepticism and that policy should not be based upon rigid point estimates but should instead be focused on probable outcome ranges.

Forecasts from this simple empirical model of the coronavirus pandemic suggest that the Fifth District will be largely spared the scale of the outbreak in other localities in the U.S., chiefly New York, and around the world. Perhaps most importantly, in all Fifth District states the spread of the infection appears to be already beyond its peak in that the number of new infections is forecast to fall, suggesting a successful implementation of social distancing measures of various stringency.

The empirical approach taken in this article also allows for a transparent illustration of uncertainty. While the range of possible outcomes is still wide (and also reflects to some extent the dearth of data), worst case scenarios seem not to be within the range of likely outcomes. Any single forecast path should therefore be treated with some skepticism as the uncertainty is large. One caveat associated with our analysis is that we cannot model the implications of mitigations as we only take into account the observed pattern so far.


Views expressed are those of the authors and do not necessarily reflect those of the Federal Reserve Bank of Richmond or the Federal Reserve System.

Contact Us

Joseph Mengedoth (804) 697-2860