Machine learning to model health with multimodal mobile sensor data



The widespread adoption of smartphones and wearables has led to the accumulation of rich datasets, which could aid the understanding of behavior and health in unprecedented detail. At the same time, machine learning, and specifically deep learning, has reached impressive performance in a variety of prediction tasks, but its application to time-series data remains challenging. Existing models struggle to learn from this unique type of data due to noise, sparsity, long-tailed distributions of behaviors, lack of labels, and multimodality.

This dissertation addresses these challenges by developing new models that leverage multi-task learning for accurate forecasting, multimodal fusion for improved population subtyping, and self-supervision for learning generalized representations. We apply our proposed methods to the challenging real-world tasks of predicting mental health and cardio-respiratory fitness from sensor data.

First, we study the relationship between passive data collected from smartphones (movement and background audio) and momentary mood levels. Our new training pipeline, which combines different sensor data into a low-dimensional embedding and clusters longitudinal user trajectories to serve as the outcome, outperforms traditional approaches based solely on psychology questionnaires. Second, motivated by mood instability as a predictor of poor mental health, we propose encoder-decoder models for time-series forecasting that exploit the bi-modality of mood with multi-task learning.

Next, motivated by the success of general-purpose models in vision and language tasks, we propose a self-supervised neural network that is ready to use as a feature extractor for wearable data. To this end, we set heart rate responses as the supervisory signal for activity data, leveraging their underlying physiological relationship. We show that the resulting task-agnostic embeddings generalize to predicting structurally different downstream outcomes through transfer learning (e.g., BMI, age, energy expenditure), outperforming unsupervised autoencoders and biomarkers. Finally, we acknowledge that fitness is a strong predictor of overall health but can only be measured with expensive instruments (e.g., a VO2max test). We therefore develop models that enable accurate prediction of fine-grained fitness levels with wearables in the present and, more importantly, of their direction and magnitude almost a decade later.

All proposed methods are evaluated on large longitudinal datasets with tens of thousands of participants in the wild. The models developed and the insights drawn in this dissertation provide evidence for a better understanding of high-dimensional behavioral and physiological data with implications for large-scale health and lifestyle monitoring.

Mascolo, Cecilia
machine learning, self-supervised learning, mobile and wearable data, sensing, health
Doctor of Philosophy (PhD)
Awarding Institution
University of Cambridge
Engineering and Physical Sciences Research Council (2178667)
The Department of Computer Science and Technology at the University of Cambridge, through the EPSRC DTP Grant (EP/N509620/1), and the Embiricos Trust Scholarship of Jesus College Cambridge