Entropy sorting of single cell RNA sequencing data reveals the inner cell mass in the human pre-implantation embryo
Accepted version
Peer-reviewed
Abstract
A major challenge in single cell gene expression analysis is to discern meaningful cellular heterogeneity from technical or biological noise. To address this challenge, we present Entropy Sorting (ES), a mathematical framework that distinguishes genes indicative of cell identity. ES achieves this in an unsupervised manner by quantifying whether observed correlations between features are more likely to have arisen by random chance or from a dependent relationship, without the need for any user-defined significance threshold. On synthetic data we demonstrate that ES removes noisy signals to reveal a higher resolution of gene expression patterns than commonly used feature selection methods. We then apply ES to human pre-implantation embryo scRNA-seq data. Previous studies failed to unambiguously identify the early inner cell mass (ICM), suggesting that the human embryo may diverge from the mouse paradigm. In contrast, ES resolves the ICM and reveals sequential lineage bifurcations as in the classical model. Entropy Sorting thus provides a powerful approach for maximising information extraction from high dimensional datasets such as scRNA-seq data.
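As a rough illustration of the idea of separating dependent from chance-level relationships between features, the sketch below computes the mutual information between two binarised expression vectors. This is not the Entropy Sorting algorithm described in the paper; it is a generic information-theoretic quantity chosen only to show the kind of comparison involved, and the function names and toy vectors are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(x, y):
    """Mutual information, in bits, between two equal-length binary sequences.

    Computed as H(X) + H(Y) - H(X, Y). A value near zero means the observed
    co-occurrence is what independence would predict; larger values indicate
    a dependent relationship.
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    joint = {}
    for a, b in zip(x, y):
        joint[(a, b)] = joint.get((a, b), 0) + 1
    px = [sum(c for (a, _), c in joint.items() if a == v) / n for v in (0, 1)]
    py = [sum(c for (_, b), c in joint.items() if b == v) / n for v in (0, 1)]
    pxy = [c / n for c in joint.values()]
    return entropy(px) + entropy(py) - entropy(pxy)


# Hypothetical toy data: 1 = gene detected in a cell, 0 = not detected.
gene_a = [1, 1, 1, 1, 0, 0, 0, 0]
gene_b = [1, 1, 1, 1, 0, 0, 0, 0]  # perfectly co-expressed with gene_a
gene_c = [1, 0, 1, 0, 1, 0, 1, 0]  # unrelated to gene_a

mi_dependent = mutual_information(gene_a, gene_b)
mi_independent = mutual_information(gene_a, gene_c)
```

In this toy case the co-expressed pair carries far more shared information than the unrelated pair. A raw mutual information value still has to be judged against some cut-off, which is the step the abstract says ES avoids.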
Journal ISSN
2213-6711
Sponsorship
Biotechnology and Biological Sciences Research Council (1943266)
Biotechnology and Biological Sciences Research Council (2489150)
Biotechnology and Biological Sciences Research Council (BB/P021573/1)
Biotechnology and Biological Sciences Research Council (BB/T007044/2)