Understanding Biology with Machine Learning: Compression, Intelligibility, and Dependency

Machine learning (ML) is increasingly used to interrogate biological systems whose complexity resists law-like, deductive explanation. As a result, embeddings, clusters, and attributions are often overinterpreted, dependencies are left implicit, and claims about explainability are insufficiently bounded. In this work, we present a framework for contextualizing how machine learning contributes to scientific understanding in biology via compression, qualitative intelligibility, and dependency modelling. Compression is achieved when inductive biases encode biological structure, reducing the effective hypothesis space and yielding representations aligned with known biology. Qualitative intelligibility is supported when high-dimensional measurements are mapped to human-graspable objects, such as embeddings, clusters, and trajectories, that enable accurate qualitative reasoning without exact calculation. Dependency modelling is realized when learned models make explicit the pattern of relations among system components and thereby guide prediction and intervention. We examine how these principles manifest in successful ML applications and discuss considerations that emerge from this framework. Overall, when viewed through these lenses, ML can transform predictive success into intervention-guiding knowledge in the life sciences.

Keywords

46 Information and Computing Sciences, 4602 Artificial Intelligence, Data Science, Networking and Information Technology R&D (NITRD), Machine Learning and Artificial Intelligence

Journal Title

Artificial Intelligence in the Life Sciences

Journal ISSN

2667-3185

Publisher

Elsevier

Publisher DOI

https://doi.org/10.1016/j.ailsci.2026.100161

Rights and licensing

Except where otherwise noted, this item's license is described as Attribution 4.0 International

Sponsorship

Horizon Europe UKRI Underwrite ERC (EP/X024733/1)
Royal Society (URF\R1\201461)

Collections

University of Cambridge Research Outputs (Articles and Conferences)