Accelerated Optimisation Algorithms for Machine Learning and Image Processing
Abstract
Over the past decade, the overlap between machine learning and image processing has grown so considerably that the two fields have become inseparable. Some of the most important problems facing science and society fall within this overlap, including the development of self-driving cars and the automated interpretation of medical images. Innovation in machine learning and image processing rests on two pillars: extraordinarily complex models and extremely large datasets. Training complex models on large datasets is challenging, and research into efficient algorithms for doing so has grown in parallel with the rise of machine learning and image processing. Stochastic optimisation algorithms have long been the methods of choice for training machine learning models. For reasons outlined in this thesis, they are natural choices for solving classical problems in machine learning, and their efficiency generally scales with the size of training datasets much better than that of other algorithms. Stochastic gradient methods have been studied extensively in restricted settings: stochastic gradient estimators are generally required to be unbiased, and the optimisation objectives must often be strongly convex with simple non-smooth regularisers (if any non-smoothness is permitted at all). As the applications of machine learning expand, especially within image processing, the associated optimisation problems become more complex, and researchers can no longer rely on existing theoretical guarantees established under such controlled conditions. New optimisation algorithms must be developed to solve problems at the intersection of machine learning and imaging efficiently. This thesis seeks to improve our understanding of stochastic optimisation algorithms and to extend these algorithms into new domains, with an emphasis on problems from image processing.
We introduce new optimisation algorithms and develop theoretical frameworks to study these methods without assuming unbiased estimators, strong convexity, or other restrictive assumptions that most existing algorithms require. With this flexibility, we introduce several algorithms that exhibit state-of-the-art efficiency on problems in machine learning and image processing. Many of the algorithms introduced in this thesis achieve theoretically optimal convergence rates on their respective problem classes. By solving large-scale problems more efficiently, we hope the algorithms in this thesis allow researchers to pursue even more complex problems and explore even larger datasets. With the theoretical frameworks developed here, we hope to provide the foundation for a better understanding of the behaviour of optimisation algorithms in machine learning and image processing.