Essays on Tree-based Methods for Prediction and Causal Inference

O'Neill, Eoghan 

The first chapter of this thesis contains an application of causal forests to a residential electricity smart meter trial dataset. Household-specific estimates are obtained for the effect of a Time-of-Use pricing scheme on peak demand. The most and least responsive households differ in education, age, employment status, and past electricity consumption. The results suggest that past consumption information is more useful than pre-trial survey information, which includes building characteristics, household characteristics, and responses to appliance usage questions.

The second chapter explores new variations of Bayesian tree-based machine learning algorithms. Bayesian Additive Regression Trees (BART) (Chipman et al. 2010) and Bayesian Causal Forests (BCF) (Hahn et al. 2020) are state-of-the-art machine learning methods for prediction and causal inference. A number of existing implementations of BART make use of Markov chain Monte Carlo (MCMC) algorithms, which can be computationally expensive on high-dimensional datasets, do not always mix well, and offer limited scope for parallelization.

The second chapter introduces four variations of BART that do not rely on MCMC:

  1. An improved implementation of the existing method BART-BMA (Hernandez et al. 2018), which averages over sum-of-tree models found by a model search algorithm, performs well on high-dimensional datasets, and produces more interpretable output than other BART implementations because the output includes a comparatively small number of sum-of-tree models, each of which contains (under the default settings) 5 trees. Improvements are made to the model search algorithm, calculation of predictions, and credible intervals. The algorithm is entirely deterministic.

  2. A treatment effect estimation algorithm that combines the model structure of BCF with the implementation of BART-BMA (BCF-BMA). This method successfully accounts for confounding on observables using the BCF parameterization, while retaining the parsimonious model selection approach of BART-BMA.

  3. A simple alternative BART implementation algorithm that uses importance sampling of models (BART-IS). In contrast to existing MCMC and model-search-based approaches, BART-IS makes fast data-independent draws of many sum-of-tree models. The advantages of this approach are that it is straightforward to implement, fast, and trivially parallelizable.

  4. Bayesian Causal Forests using Importance Sampling (BCF-IS). This is a combination of the BCF model framework with the BART-IS implementation. BART-IS and BCF-IS exhibit comparable performance to BART-MCMC and BCF across a large number of simulated datasets.
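
As a rough illustration of the importance-sampling idea in item 3, the following Python sketch draws many partition models without looking at the data, weights each by its marginal likelihood, and averages their predictions. It is a simplified stand-in under stated assumptions, not the thesis implementation: each model is a single partition of one covariate rather than a sum of trees, the proposal is taken to equal the model prior (so the importance weights are proportional to the marginal likelihood), the noise variance `sigma2` is treated as known, and leaf means have a N(0, `tau2`) prior. All function names and parameter values are hypothetical.

```python
import numpy as np

def draw_partition(rng, max_splits=4):
    """Draw split points on [0, 1] without using the data (assumed prior)."""
    k = rng.integers(1, max_splits + 1)
    return np.sort(rng.uniform(0.0, 1.0, size=k))

def leaf_stats(splits, x, y):
    """Per-leaf observation counts and sums of y for a given partition."""
    leaf = np.searchsorted(splits, x)  # leaf index of each observation
    n_leaves = len(splits) + 1
    counts = np.bincount(leaf, minlength=n_leaves)
    sums = np.bincount(leaf, weights=y, minlength=n_leaves)
    return counts, sums

def log_marginal(counts, sums, sigma2, tau2):
    """Model-dependent part of log p(y | partition).

    Terms shared by every model are omitted because they cancel
    when the weights are normalized.
    """
    return np.sum(
        -0.5 * np.log1p(counts * tau2 / sigma2)
        + tau2 * sums**2 / (2.0 * sigma2 * (sigma2 + counts * tau2))
    )

def is_predict(x, y, x_new, n_draws=2000, sigma2=0.25, tau2=1.0, seed=0):
    """Importance-sampling average of predictions over prior model draws."""
    rng = np.random.default_rng(seed)
    log_w = np.empty(n_draws)
    preds = np.empty((n_draws, len(x_new)))
    for m in range(n_draws):  # draws are independent, so this loop could run in parallel
        splits = draw_partition(rng)
        counts, sums = leaf_stats(splits, x, y)
        log_w[m] = log_marginal(counts, sums, sigma2, tau2)
        # Posterior mean of each leaf parameter under the conjugate prior
        leaf_means = tau2 * sums / (sigma2 + counts * tau2)
        preds[m] = leaf_means[np.searchsorted(splits, x_new)]
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    w /= w.sum()
    return w @ preds, w

# Toy data: a step function plus Gaussian noise (illustrative only)
data_rng = np.random.default_rng(1)
x = data_rng.uniform(size=200)
y = np.where(x < 0.5, -1.0, 1.0) + 0.5 * data_rng.normal(size=200)

pred, w = is_predict(x, y, x_new=np.array([0.1, 0.9]))
```

Because every draw is made independently of the data and of the other draws, the loop body can be distributed across workers with no communication until the final weighted average, which is the sense in which such a scheme is trivially parallelizable.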

The second chapter also includes some illustrative applications. The methods can be extended to multiple treatments, multivariate outcomes, and panel data.

The third chapter of this thesis describes how the methods introduced in the second chapter can be generalized from regression and treatment effect estimation for continuous outcomes to a range of models with various link functions and outcome variables. As examples of how to apply the general approach, Logit-BART-BMA and Logit-BART-IS are introduced with illustrative applications.

Weeks, Melvyn
Machine Learning, Regression Trees, Bayesian Statistics, Causal Inference, Treatment Effects
Doctor of Philosophy (PhD)
Awarding Institution
University of Cambridge