Revisiting Generalization for Deep Learning: PAC-Bayes, Flat Minima, and Generative Models

Dziugaite, Gintare Karolina

doi:10.17863/CAM.40428

Repository URI

https://www.repository.cam.ac.uk/handle/1810/293273

Repository DOI

https://doi.org/10.17863/CAM.40428

Files

Thesis (2.98 MB)

Type

Thesis

Authors

Dziugaite, Gintare Karolina

Abstract

In this work, we construct generalization bounds to understand existing learning algorithms and propose new ones. Generalization bounds relate empirical performance to future expected performance. The tightness of these bounds varies widely and depends on the complexity of the learning task and the amount of data available, but also on how much information the bounds take into consideration. We are particularly concerned with data- and algorithm-dependent bounds that are quantitatively nonvacuous. We begin with an analysis of stochastic gradient descent (SGD) in supervised learning. By formalizing the notion of flat minima using PAC-Bayes generalization bounds, we obtain nonvacuous generalization bounds for stochastic classifiers based on SGD solutions. Despite strong empirical performance in many settings, SGD rapidly overfits in others. By combining nonvacuous generalization bounds and structural risk minimization, we arrive at an algorithm that trades off accuracy and generalization guarantees. We also study generalization in the context of unsupervised learning. We propose to use a two-sample test statistic for training neural network generator models and bound the gap between the population and the empirical estimate of the statistic.

Date

2018-12-12

Advisors

Ghahramani, Zoubin

Keywords

deep learning, statistical learning theory, generalization in neural networks, adversarial learning, PAC-Bayesian bounds, generative models

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Sponsorship

EPSRC

Collections

Theses - Engineering
