Machine learning to extract information from noise: application to concrete and cancer detection
Abstract
The availability of large amounts of experimental and computational data, together with powerful computers, has enabled the use of machine learning to predict phenomena in the physical sciences. Machine learning is a class of methods that train a model on existing data and then use this model to predict the quantities of interest. While machine learning has been applied successfully in the physical sciences and beyond, it faces a challenge presented by noise in real-world experimental data. This noise leads to inaccuracy and uncertainty in machine learning predictions, limiting the application of machine learning to the physical sciences.
In this thesis, we develop a machine learning methodology that extracts information from noise in data, turning the negative impact of noise and uncertainty into an asset for making better predictions. We demonstrate the effectiveness of the methodology in two research areas: concrete and cancer detection.
First, we utilize the methodology in two studies of concrete. In one study, we propose two concrete mixes that were experimentally verified to exhibit improved environmental cost and carbonation. In the other study, we make predictions of long-term concrete strength that were experimentally verified.
Second, we utilize the methodology to extract information from the noise in DNA methylation patterns to detect acute myeloid leukemia, a type of blood cancer, at early stages. We achieve an area under the receiver operating characteristic curve above 0.8 for new, previously unseen patients. This means that, given one person who later develops the disease and one who remains healthy, the model assigns the higher risk to the former in more than 80% of such pairs, prior to the onset of the disease.
Our methodology is generic and can be applied in many fields, including information engineering, autonomous vehicles, and additive manufacturing, where the information embedded in noise can be exploited to accelerate development, understanding, and impact.
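The area under the receiver operating characteristic curve reported for the leukemia study can be read as a ranking probability: the fraction of (diseased, healthy) pairs in which the diseased case receives the higher score. The sketch below illustrates that reading only. The function name `pairwise_auc` and the risk scores are hypothetical, made up for illustration, and are not taken from the thesis or its data.

```python
def pairwise_auc(scores_pos, scores_neg):
    """Area under the ROC curve computed as a ranking probability.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical risk scores (illustrative values, not thesis results).
later_diagnosed = [0.9, 0.7, 0.6, 0.4]      # people who later develop the disease
stayed_healthy = [0.8, 0.5, 0.3, 0.2, 0.1]  # people who remain healthy

auc = pairwise_auc(later_diagnosed, stayed_healthy)
print(f"AUC = {auc:.2f}")
```

With these made-up scores, 16 of the 20 pairs are ranked correctly, giving an area of 0.8. A value of 0.5 would correspond to random ranking and 1.0 to perfect separation.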

