Topological Data Analysis for Unsupervised Feature Selection in Large Scale Spatial Omics Data Sets

Spatial transcriptomics studies are becoming increasingly large and commonplace, necessitating simultaneous analysis of a large number of spatially resolved variables. Correspondingly, a diverse range of methodologies have been proposed to compare the spatial expression structure of genes. Here, we apply persistent homology, a method from topological data analysis, to produce a continuous quantification of spatial structure in a given gene’s expression, and show how this can be used for downstream tasks such as spatially variable gene (SVG) identification. We explore the unique advantages of topology for this task, deriving biologically meaningful insights into kidney disease and myocardial infarction using public spatial transcriptomics data. We also show how the non-parametric nature of homology enables our methodology to extend naturally to other spatial omics modalities, demonstrating this on a spatial metabolomics sample. Our work showcases the advantages of using a continuous quantification of spatial structure over p-value-based approaches to SVG identification, the potential for developing unified methods for the analysis of different spatial omics modalities, and the utility of persistent homology in big data applications.

Description

Acknowledgements: We would like to thank those who worked to generate, organise and share the publicly available spatial transcriptomics datasets analysed in this paper.

Keywords

Spatial Transcriptomics, Persistent Homology, Topological Data Analysis, Spatially Variable Gene, Mass Spectrometry Imaging

Journal Title

Bulletin of Mathematical Biology

Journal ISSN

0092-8240
1522-9602

Volume Title

88

Publisher

Springer Science and Business Media LLC

Publisher DOI

https://doi.org/10.1007/s11538-026-01618-2

Rights and licensing

Except where otherwise noted, this item's license is described as http://creativecommons.org/licenses/by/4.0/

Sponsorship

Medical Research Council (G117871)

Collections

Jisc Publications Router

Published version

Peer-reviewed
