Weighting and moment conditions in Bayesian inference


Type

Thesis

Authors

Yiu, Andrew 

Abstract

The work presented in this thesis was motivated by the goal of developing Bayesian methods for "weighted" biomedical data. To be more specific, we are referring to probability weights, which are used to adjust for distributional differences between the sample and the population. Sometimes, these differences occur by design; data collectors can choose to implement an unequal probability sampling frame to optimize efficiency subject to constraints. If so, the probability weights are known and are traditionally equal to the inverse of the unit sampling probabilities. It is often the case, however, that the sampling mechanism is unknown. Methods that use estimated weights include so-called doubly robust estimators, which have become popular in causal inference.
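As a minimal sketch of the classical weighting idea mentioned here (not code from the thesis; the population and sampling scheme are invented for illustration), the following compares a naive sample mean with the inverse-probability-weighted (Horvitz-Thompson) estimator when units are sampled with unequal, outcome-dependent probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: outcome y, with sampling probability pi increasing in y
N = 100_000
y = rng.normal(loc=2.0, scale=1.0, size=N)
pi = np.clip(0.02 + 0.08 * (y - y.min()) / (y.max() - y.min()), 0.02, 0.10)

# Unequal probability sampling: unit i is selected with probability pi[i]
sampled = rng.random(N) < pi
y_s, pi_s = y[sampled], pi[sampled]

# Naive sample mean is biased upward, since large y is oversampled
naive_mean = y_s.mean()

# Weighting by the inverse of the sampling probability corrects the bias
ht_mean = np.sum(y_s / pi_s) / N                       # Horvitz-Thompson
hajek_mean = np.sum(y_s / pi_s) / np.sum(1.0 / pi_s)   # normalized (Hajek) variant
```

The Hájek variant normalizes by the sum of the weights rather than the known population size, which is often preferable when N is unknown.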

There is a lack of consensus regarding the role of probability weights in Bayesian inference. In some settings, it is reasonable to believe that conditioning on certain observed variables is sufficient to adjust for selection; the sampling mechanism is then deemed "ignorable" in a Bayesian analysis. In Chapter 2, we develop a Bayesian approach for case-cohort data that ignores the sampling mechanism and outperforms existing methods, including those that involve inverse probability weighting. Our approach showcases some key strengths of the Bayesian paradigm: namely, the marginalization of nuisance parameters, and the availability of sophisticated computational techniques from the MCMC literature. We analyse data from the EPIC-Norfolk cohort study to investigate the associations between saturated fatty acids and incident type 2 diabetes.

However, ignoring the sampling is not always beneficial. For a variety of popular problems, weighting offers the potential for increased robustness, efficiency and bias-correction. It is also of interest to consider settings where sampling is nonignorable, but weights are available only for the selected units. This is tricky to handle in a conventional Bayesian framework; one must either make ad hoc adjustments, or attempt to model the distribution of the weights. The latter is infeasible without additional untestable assumptions if the weights are not exact probability weights (e.g., due to trimming or calibration). By contrast, weighting methods are usually simple to implement in this context and are virtually model-free.

Chapters 3 and 4 develop approaches that are capable of combining weighting with Bayesian modelling. A key ingredient is to define target quantities as the solutions to moment conditions, as opposed to "true" components of parametric models. By doing so, the quantities coincide with the usual definitions if working model assumptions hold, but retain the interpretation of being projections if the assumptions are violated. This allows us to nonparametrically model the data-generating distribution and obtain the posterior of the target quantity implicitly. Crucially, our approaches still enable the user to directly specify their prior for the target quantity, in contrast to common nonparametric Bayesian models like Dirichlet processes.
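As an illustrative sketch of a moment-condition definition (notation ours, not the thesis's), a target quantity can be defined as a functional of the data-generating distribution P rather than as a parameter of a model assumed to contain P:

```latex
% Target quantity theta(P) defined implicitly by a moment condition:
%   theta(P) solves
\mathbb{E}_{P}\!\left[\, m(Z, \theta) \,\right] = 0 .

% Example: the least-squares coefficient as a projection,
%   m(Z, \theta) = X (Y - X^{\top}\theta), giving
\theta(P) \;=\; \mathbb{E}_{P}[X X^{\top}]^{-1}\, \mathbb{E}_{P}[X Y],
```

which equals the "true" regression coefficient when the linear model holds, and otherwise remains well-defined as the linear projection of Y on X under P.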

The scope of our methodology extends beyond our original motivations. In particular, we can tackle a whole class of problems that would ordinarily be handled using estimating equations and robust variance estimation. Such problems are often called semiparametric because we are interested in estimating a finite-dimensional parameter in the presence of an infinite-dimensional nuisance parameter. Chapter 4 studies examples such as linear regression with heteroscedastic errors, and quantile regression.
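To make the estimating-equation setting concrete, here is a minimal frequentist sketch (invented data; not the thesis's Bayesian method) of linear regression with heteroscedastic errors: the coefficient solves the moment condition E[X(Y - Xᵀβ)] = 0, and the sandwich (robust) variance estimator accounts for the unknown error variance function:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: linear mean, error variance growing with x
n = 2000
x = rng.uniform(0.0, 2.0, size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([1.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.5 + x, size=n)

# Solve the empirical moment condition sum_i X_i (y_i - X_i' beta) = 0
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Sandwich variance: A^{-1} B A^{-1} / n, with
#   A = (1/n) sum X_i X_i',  B = (1/n) sum r_i^2 X_i X_i'
resid = y - X @ beta_hat
A = X.T @ X / n
B = (X * (resid**2)[:, None]).T @ X / n
A_inv = np.linalg.inv(A)
V = A_inv @ B @ A_inv / n
se = np.sqrt(np.diag(V))  # robust standard errors
```

The infinite-dimensional nuisance here is the error variance function; the sandwich estimator delivers valid standard errors without modelling it.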

Date

2021-03-31

Advisors

Goudie, Robert
Tom, Brian

Keywords

Bayesian inference, Moment conditions, Weighted inference, Unequal probability sampling, Semiparametric statistics, Survey sampling

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Sponsorship

MRC (1939711)
MRC (unknown)