The Hard Problem of Prediction for Conflict Prevention
Interest in prevention is growing in several policy areas, which provides a strong motivation for better integrating machine learning into models of decision making. In this article we propose a framework to tackle conflict prevention. A key problem of conflict forecasting for prevention is that predicting the start of conflict in previously peaceful countries must overcome a low baseline risk. To make progress on this hard problem, we combine unsupervised with supervised machine learning. Specifically, the latent Dirichlet allocation (LDA) model is used to extract features from 4.1 million newspaper articles, and these features are then used in a random forest model to predict conflict. The output of the forecast model is then analyzed in a cost-minimization framework in which excessive intervention costs due to false positives can be traded off against the damage and destruction caused by conflict. News text is able to provide a useful forecast for the hard problem even when evaluated in such a cost-benefit framework. The aggregation into topics allows the forecast to rely on subtle signals from news which are positively or negatively related to conflict risk.
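The cost-minimization step can be illustrated with a minimal sketch. It is not the article's actual specification: the function names, the toy forecasts, the cost values, and the `effectiveness` parameter (the share of conflict damage an intervention is assumed to avert) are all hypothetical and chosen only to show how false-positive intervention costs are weighed against conflict damage when choosing an intervention threshold.

```python
from typing import Sequence


def total_cost(probs: Sequence[float], onsets: Sequence[int], threshold: float,
               cost_intervention: float, cost_conflict: float,
               effectiveness: float) -> float:
    """Total cost of intervening wherever forecast risk >= threshold.

    All cost parameters are illustrative assumptions, not values from the article.
    """
    cost = 0.0
    for p, y in zip(probs, onsets):
        if p >= threshold:
            # Every intervention is paid for, including false positives.
            cost += cost_intervention
            if y:
                # Assumed: intervention averts a share of the conflict damage.
                cost += (1.0 - effectiveness) * cost_conflict
        elif y:
            # Missed onset: full conflict damage is incurred.
            cost += cost_conflict
    return cost


def best_threshold(probs: Sequence[float], onsets: Sequence[int],
                   cost_intervention: float, cost_conflict: float,
                   effectiveness: float) -> float:
    """Return the candidate threshold with the lowest total cost."""
    # Candidate thresholds: each observed forecast, plus "never intervene".
    candidates = sorted(set(probs)) + [float("inf")]
    return min(
        candidates,
        key=lambda t: total_cost(probs, onsets, t, cost_intervention,
                                 cost_conflict, effectiveness),
    )


# Hypothetical forecasts (e.g. from a random forest) and observed conflict onsets.
probs = [0.02, 0.05, 0.10, 0.40, 0.70, 0.03]
onsets = [0, 0, 0, 1, 1, 0]

threshold = best_threshold(probs, onsets, cost_intervention=1.0,
                           cost_conflict=10.0, effectiveness=0.5)
print(threshold, total_cost(probs, onsets, threshold, 1.0, 10.0, 0.5))
```

With these toy numbers, lowering the threshold below the weakest true onset only adds intervention costs for false positives, while raising it above that onset leaves conflict damage unaddressed, so the minimum-cost threshold sits between the two.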