Comparing COVID-19 risk factors in Brazil using machine learning: the importance of socioeconomic, demographic and structural factors.
The COVID-19 pandemic continues to have a devastating impact on Brazil. Brazil's social, health and economic crises are aggravated by strong societal inequities and persisting political disarray. This complex scenario motivates careful study of the clinical, socioeconomic, demographic and structural factors contributing to increased risk of mortality from SARS-CoV-2 in Brazil specifically. We consider the Brazilian SIVEP-Gripe catalog, a very rich respiratory infection dataset which allows us to estimate the importance of several non-laboratorial and socio-geographic factors on COVID-19 mortality. We analyze the catalog using machine learning algorithms to account for likely complex interdependence between metrics. The XGBoost algorithm achieved excellent performance, producing an AUC-ROC of 0.813 (95% CI 0.810-0.817), and outperforming logistic regression. Using our model we found that, in Brazil, socioeconomic, geographical and structural factors are more important than individual comorbidities. Particularly important factors were: The state of residence and its development index; the distance to the hospital (especially for rural and less developed areas); the level of education; hospital funding model and strain. Ethnicity is also confirmed to be more important than comorbidities but less than the aforementioned factors. In conclusion, socioeconomic and structural factors are as important as biological factors in determining the outcome of COVID-19. This has important consequences for policy making, especially on vaccination/non-pharmacological preventative measures, hospital management and healthcare network organization.