Data Compression with Relative Entropy Coding

Abstract

Data compression forms the foundation of our Digital Age, enabling us to store, process and communicate more information than ever before. Over the last few years, machine learning-based compression algorithms have revolutionised the field and surpassed the performance of traditional methods. Moreover, machine learning has unlocked previously infeasible features for compression, such as providing guarantees for users’ privacy, or tailoring compression to specific data statistics (e.g., satellite images or audio recordings of animals) or to users’ audiovisual perception. This, in turn, has led to an explosion of theoretical work aiming to develop new fundamental theories, methods and algorithms better suited to machine learning-based compressors.

In this thesis, I contribute to this trend by investigating relative entropy coding, a mathematical framework that generalises classical source coding theory. Concretely, relative entropy coding deals with the efficient communication of uncertain or randomised information. One of its key advantages is that it extends compression methods to continuous spaces and can thus be integrated more seamlessly into modern machine learning pipelines than classical quantisation-based approaches. Furthermore, it is a natural foundation for developing advanced compression methods that are privacy-preserving or account for the perceptual quality of the reconstructed data.
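To make the contrast with quantisation concrete, below is a minimal sketch of one well-known relative entropy coding scheme, minimal random coding (Havasi et al., 2019), for a one-dimensional Gaussian target Q and prior P. The encoder and decoder share a pseudo-random seed as common randomness, and the encoder transmits only the index of a candidate sample, which costs roughly KL(Q||P) plus a constant number of bits. The function names and the fixed overhead parameter are illustrative choices, not notation from the thesis.

    import numpy as np
    from scipy.stats import norm

    def mrc_encode(mu_q, sigma_q, mu_p, sigma_p, kl_bits, seed=0, overhead=4.0):
        """Encode an (approximate) sample from Q = N(mu_q, sigma_q^2) relative
        to the shared prior P = N(mu_p, sigma_p^2) as a single integer index."""
        n = int(np.ceil(2.0 ** (kl_bits + overhead)))  # ~2^(KL + overhead) candidates
        rng = np.random.default_rng(seed)              # shared common randomness
        xs = rng.normal(mu_p, sigma_p, size=n)         # candidates drawn i.i.d. from P
        # Importance weights dQ/dP at each candidate, computed in the log domain.
        log_w = norm.logpdf(xs, mu_q, sigma_q) - norm.logpdf(xs, mu_p, sigma_p)
        probs = np.exp(log_w - np.logaddexp.reduce(log_w))
        return rng.choice(n, p=probs)  # index sampled proportionally to the weights

    def mrc_decode(idx, mu_p, sigma_p, kl_bits, seed=0, overhead=4.0):
        """Regenerate the same candidates from the shared seed and read one off."""
        n = int(np.ceil(2.0 ** (kl_bits + overhead)))
        xs = np.random.default_rng(seed).normal(mu_p, sigma_p, size=n)
        return xs[idx]

    # Example: Q = N(1, 0.5^2) against P = N(0, 1).
    kl_bits = (np.log(2.0) + (0.25 + 1.0) / 2.0 - 0.5) / np.log(2.0)  # KL(Q||P) in bits
    idx = mrc_encode(1.0, 0.5, 0.0, 1.0, kl_bits)
    sample = mrc_decode(idx, 0.0, 1.0, kl_bits)

Note that the entire pipeline operates on continuous values; no quantisation grid is ever introduced, which is what allows schemes of this kind to slot directly into machine learning pipelines.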

The thesis considers relative entropy coding at three conceptual levels. After introducing the basics of the framework, (1) I prove results that establish new, maximally tight fundamental limits on the communication and computational efficiency of relative entropy coding; (2) I use the theory of Poisson point processes to develop and analyse new relative entropy coding algorithms whose performance attains these theoretical optima; and (3) I showcase the strong practical performance of relative entropy coding by applying it to image, audio, video and protein data compression using small, energy-efficient, probabilistic neural networks called Bayesian implicit neural representations.
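As an illustration of point (2), here is a minimal sketch of an exact relative entropy coding algorithm built on a Poisson point process, following the Poisson functional representation of Li and El Gamal: the encoder simulates arrivals T_1 < T_2 < ... of a rate-1 Poisson process with i.i.d. marks X_i ~ P and selects the index minimising T_i / (dQ/dP)(X_i). The selected mark is distributed exactly according to Q, and the index can be transmitted in roughly KL(Q||P) + log(KL(Q||P) + 1) bits plus a small constant. The sketch assumes the density ratio dQ/dP is bounded by a known constant M, which guarantees termination; the Gaussian choices and function names are illustrative, not the thesis's notation.

    import numpy as np
    from scipy.stats import norm

    def pfr_encode(log_ratio, M, rng):
        """Return the index (and mark) minimising T_i / (dQ/dP)(X_i)."""
        t, i = 0.0, 0
        best_idx, best_score, best_x = 0, np.inf, None
        while True:
            i += 1
            t += rng.exponential(1.0)  # arrival times of a rate-1 Poisson process
            if t / M >= best_score:    # T_j >= t and dQ/dP <= M: no later arrival can win
                return best_idx, best_x
            x = rng.normal(0.0, 1.0)   # mark X_i ~ P (here P = N(0, 1))
            score = t / np.exp(log_ratio(x))
            if score < best_score:
                best_idx, best_score, best_x = i, score, x

    def pfr_decode(idx, rng):
        """Replay the shared randomness and return the idx-th mark."""
        for _ in range(idx):
            rng.exponential(1.0)
            x = rng.normal(0.0, 1.0)
        return x

    # Q = N(1, 0.5^2), P = N(0, 1); dQ/dP is bounded since sigma_Q < sigma_P.
    log_ratio = lambda x: norm.logpdf(x, 1.0, 0.5) - norm.logpdf(x, 0.0, 1.0)
    M = float(np.exp(log_ratio(4.0 / 3.0)))  # the density ratio peaks at x = 4/3 here
    idx, x_enc = pfr_encode(log_ratio, M, np.random.default_rng(0))
    x_dec = pfr_decode(idx, np.random.default_rng(0))  # equals x_enc; distributed as Q

Unlike the importance-sampling sketch above, which is only approximate for a finite candidate pool, this construction yields an exact sample from Q, at the cost of a random (but almost surely finite) search length.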

Finally, while most of the work in this thesis was motivated by practical data compression, many of the techniques I present are more general, with implications and applications in broader information theory as well as in the development and analysis of general-purpose sampling algorithms.

Date

2024-12-14

Advisors

Hernández Lobato, José Miguel

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Rights and licensing

Except where otherwise noted, this item's license is described as Attribution 4.0 International (CC BY 4.0)