
BETS: The dangers of selection bias in early analyses of the coronavirus disease (COVID-19) pandemic

Accepted version
Peer-reviewed

Type

Article

Authors

Ju, N 
Bacallado, S 
Shah, RD 

Abstract

Coronavirus disease 2019 (COVID-19) has quickly grown from a regional outbreak in Wuhan, China to a global pandemic. Early estimates of the epidemic growth and incubation period of COVID-19 may have been biased due to sample selection. Using detailed case reports from 14 locations in and outside mainland China, we obtained 378 Wuhan-exported cases who left Wuhan before an abrupt travel quarantine. We developed a generative model, which we call BETS, for four key epidemiological events: Beginning of exposure, End of exposure, time of Transmission, and time of Symptom onset. We derived explicit formulas to correct for the sample selection and gave a detailed illustration of why some early and highly influential analyses of the COVID-19 pandemic were severely biased. All our analyses, regardless of which subsample and model were used, point to an epidemic doubling time of 2 to 2.5 days during the early outbreak in Wuhan. A Bayesian nonparametric analysis further suggests that about 5% of the symptomatic cases may not develop symptoms within 14 days of infection and that men may be much more likely than women to develop symptoms within 2 days of infection.

Keywords

Epidemiology, infectious disease, Bayesian nonparametrics, selection bias, epidemic growth, incubation period

Journal Title

Annals of Applied Statistics

Journal ISSN

1932-6157
1941-7330

Volume

15

Publisher

Institute of Mathematical Statistics

Rights

All rights reserved