Topics in conditional causal inference
With the growth of complex experimental designs and large-scale observational data, the causal questions arising in applications have become more targeted and precise. For example, one might ask whether a treatment is effective at a particular time point, or whether it is effective for a particular individual. To answer questions of this kind, this thesis studies conditional causal inference, which refers broadly to techniques for constructing or interpreting conditional distributions or expectations in order to draw inferences about a causal effect of interest. The thesis consists of five chapters. In Chapter 1, we first review the classical potential outcomes framework and some basic causal inference methods relevant to this thesis, then summarize the problems and methods studied in the following chapters. In Chapter 2, we consider testing causal effects in complex experimental designs via conditional randomization tests (CRTs). The CRTs we define are randomization tests that condition on a subset of treatment assignments tailored to the effect of interest. Because many potential outcomes are missing in complex designs, a single CRT is rarely powerful. We develop a general theory for constructing multiple jointly valid CRTs in arbitrary designs. Following this theory, we propose practical methods that collect and combine statistical evidence from different parts of an experiment to test a global effect of interest. Within this general framework of CRTs, we also connect and discuss randomization tests developed for different statistical problems in the literature, which may be of independent interest. The remaining three chapters concern the problem of estimating conditional average treatment effects (CATEs). CATEs quantify individual-level treatment effects by conditioning on individual covariates. In Chapter 3, we consider estimating CATEs in the presence of high-dimensional covariates. 
We propose a neural network-based dimensionality reduction method that transforms high-dimensional covariates into a low-dimensional, informative representation. Because neural network models are overparameterized and non-convex, we propose a sample-splitting and randomization method that allows the representation to be partially identified and estimated consistently. In Chapter 4, we consider estimating CATEs in the presence of imbalanced treated and control populations. We take a Bayesian approach to measure the overlap between the two populations and rebalance them by minimizing the posterior variances of counterfactual outcomes. We derive a PAC-Bayes generalization bound to show that this method is beneficial and consistent in estimating CATEs. In Chapter 5, we introduce a recursive partitioning method that can convert any black-box CATE estimates into interpretable subgroups. Our method uses a distribution-free technique called conformal prediction to quantify the uncertainties in CATE estimates, and then leverages these uncertainties to construct robust subgroups. This leads to better-identified subgroups and fewer false discoveries caused by random noise in the data. All the methods proposed in this thesis are evaluated using multiple simulations or datasets. Overall, the experimental results support our theories and demonstrate the advantages of our methods over baseline methods.
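
As a rough illustration of the distribution-free technique mentioned for Chapter 5, the sketch below shows generic split conformal prediction for a regression outcome. It is not the thesis's subgroup method: the function name `split_conformal_interval`, the least-squares base model, and the absolute-residual conformity score are all assumptions chosen for brevity. The idea is to fit a model on one split, rank the absolute residuals on a held-out calibration split, and use a finite-sample-corrected quantile of those residuals as the half-width of the interval.

```python
import numpy as np


def split_conformal_interval(x_train, y_train, x_cal, y_cal, x_new, alpha=0.1):
    """Split conformal interval around a least-squares fit (illustrative only).

    All names and the choice of base model are hypothetical; any regression
    model could be used in place of the linear fit.
    """
    # Fit a linear model with intercept on the training split.
    design = np.column_stack([np.ones(len(x_train)), x_train])
    coef, *_ = np.linalg.lstsq(design, y_train, rcond=None)

    def predict(x):
        return np.column_stack([np.ones(len(x)), x]) @ coef

    # Conformity scores: absolute residuals on the held-out calibration split.
    scores = np.sort(np.abs(y_cal - predict(x_cal)))
    n = len(scores)

    # Finite-sample-corrected rank ceil((n + 1) * (1 - alpha)); if it exceeds
    # n, the calibration set is too small and the interval is unbounded.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    q = scores[k - 1] if k <= n else np.inf

    pred = predict(x_new)
    return pred - q, pred + q
```

Under exchangeability of the calibration and new points, an interval built this way covers the new outcome with probability at least 1 - alpha, regardless of how well the base model fits; a poor fit only makes the interval wider.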