
Advances in Reinforcement Learning for Decision Support





Jarrett, Daniel 


On the level of decision support, most algorithmic problems encountered in machine learning are instances of pure prediction or pure automation tasks. This dissertation takes a holistic view of decision support, and begins by identifying four important problem classes that lie between the two extremes: exploration, mediation, interpretation, and generation, specifically with an eye towards the role of reinforcement learning in helping humans 'close the loop' in sequential decision-making. In particular, we focus on the problems of: exploring new environments without guidance, interpreting observed behavior from data, generating synthetic time series, and mediating between humans and machines. For each of these, we proffer novel mathematical formalisms, propose algorithmic solutions, and present empirical illustrations of their utility. In the first instance, we refine our notion of curiosity-driven exploration to separate epistemic knowledge from aleatoric variation in hindsight, propose an algorithmic framework that yields a simple and scalable generalization of curiosity that is robust to all types of stochasticity, and demonstrate state-of-the-art results on a popular benchmark. In the second instance, we formalize a unifying perspective on inverse decision modeling that generalizes existing work on imitation learning and reward learning while opening up a broader class of research problems in behavior representation, and instantiate an example for learning interpretable representations of boundedly rational decision-making. In the third instance, we propose a probabilistic generative model of time-series data that optimizes a local transition policy by reinforcement from a global energy model learned by contrastive estimation, draw a rich analogy between synthetic generation and sequential imitation, and verify that it yields useful samples on real-world datasets. In the fourth instance, we formalize the sequential problem of online decision mediation with abstentive feedback, propose an effective solution that seeks to trade off immediate loss terms against future improvements in generalization error, and illustrate its efficacy relative to applicable benchmark algorithms on a variety of metrics. In this way, this dissertation contributes and advances a broader perspective on machine learning for augmenting decision-making processes.





van der Schaar, Mihaela


Decision Support, Generative Modeling, Reinforcement Learning


Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge