
Quantifying expression variability in single-cell RNA sequencing data





Transcriptional noise is an intrinsic feature of cell populations and plays a driving role in mammalian development, tissue homoeostasis and immune function. While expression heterogeneity, a phenotypic readout of transcriptional noise, has been broadly studied in prokaryotic model systems or by profiling individual genes, few whole-transcriptome studies in mammalian systems have been reported. The development of single-cell RNA sequencing technologies introduced powerful tools to investigate transcriptional differences between individual cells, allowing the in-depth characterisation of expression variability. In this thesis, I computationally analysed single-cell RNA sequencing data to understand transcriptional variability and expanded a statistical model to avoid confounding effects when quantifying such variability. First, I profiled individual transcriptomes of CD4+ T cells, identifying a global decrease in transcriptional variability upon immune activation. By extending this analysis across two sub-species of mice, I identified an evolutionarily conserved set of immune response genes for which transcriptional variability increases during ageing. I used a Bayesian modelling framework to quantify mean expression and transcriptional variability, but owing to a strong confounding effect between these two parameters, variability analysis was restricted to genes that are similarly expressed across the tested conditions. To address this problem, I extended the computational framework to allow the parallel assessment of changes in mean expression and variability. Within this Bayesian framework, I introduced a joint prior linking mean expression and variability parameters, which allowed a residual over-dispersion to be measured for each gene. This measure allowed me to statistically assess changes in variability even for genes with differences in mean expression between conditions.
Finally, I applied the model to identify temporal changes in variability over the time-course of spermatogenesis. This unidirectional differentiation process involves several complex steps before mature sperm form from spermatogonial stem cells. When profiling changes in variability across this developmental time-course, I found that peaks in variability are caused by rapid changes in gene expression along the differentiation trajectory. This thesis provides a deeper understanding of technical and biological factors that drive transcriptional variability and offers a basis for future research to characterise its role in health and disease.
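
The idea of a residual over-dispersion, as described in the abstract, can be illustrated with a small simulation. The sketch below is not the Bayesian model developed in the thesis: it assumes a simple linear trend between log mean expression and log over-dispersion and fits it by ordinary least squares, and all names (`log_mu`, `log_delta`, `residual`) and numbers are hypothetical. It only shows why a per-gene residual from a mean-variability trend is no longer confounded with mean expression.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes = 500

# Simulated per-gene values (illustrative only, not data from the thesis).
log_mu = rng.uniform(0.0, 8.0, size=n_genes)            # log mean expression
trend = 2.0 - 0.5 * log_mu                               # assumed linear mean-variability trend
log_delta = trend + rng.normal(0.0, 0.3, size=n_genes)   # log over-dispersion

# Fit the trend by ordinary least squares. This stands in for the
# regression implied by a joint prior; the thesis model is Bayesian.
design = np.column_stack([np.ones(n_genes), log_mu])
coef, *_ = np.linalg.lstsq(design, log_delta, rcond=None)

# Residual over-dispersion: deviation of each gene from the fitted trend.
residual = log_delta - design @ coef

# Raw over-dispersion is strongly tied to the mean; the residual is not.
corr_raw = np.corrcoef(log_mu, log_delta)[0, 1]
corr_res = np.corrcoef(log_mu, residual)[0, 1]
print(f"correlation with log mean, raw over-dispersion: {corr_raw:.3f}")
print(f"correlation with log mean, residual:            {corr_res:.3f}")
```

Because the residual is uncorrelated with mean expression by construction, it can in principle be compared between conditions even for genes whose mean expression differs, which is the motivation stated in the abstract.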





Marioni, John


single-cell RNA sequencing, transcriptional noise, Bayesian statistics, spermatogenesis, immune system, ageing, T cells, regression


Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge
Funding was provided via the EMBL international PhD programme