A Leap of Fate: Fluid identity, functional modules, and flexible strategies for transcription factor-based reprogramming
Abstract
“The most important reason for going from one place to another is to see what’s in between, and they took great pleasure in doing just that.” – Norton Juster, The Phantom Tollbooth
Direct cell reprogramming is a technique used to convert a cell from one type to another, without a stable intermediate. This is commonly achieved through transcription factor (TF) over-expression in the source cell type, with TF selection based primarily on the target cell type. Historically, most trial TF panels designed for reprogramming have been selected on the basis of developmental literature or reports of prior trials. However, with advancing computational capacity and increasing availability of transcriptomic and regulatory datasets, quantitative predictive frameworks (e.g. Mogrify) are now feasible. This offers the opportunity to streamline TF selection and thus reprogramming.
Such methods may also open the possibility of new applications, including targeting secondary or synthetic cell types and conducting conversions within non-model species, for which prior literature may be limited. This thesis explores these possibilities, with a view to how flexibility in framework implementation can increase applicability. It also considers how these methodological trials can inform our theoretical model of cell identity, beyond the classic Waddington differentiation landscape, for the future manipulation of non-standard cell types. Overall, the thesis comprises five chapters, grouped by thematic relevance into three sections.
In Section I, the principle of integrating differential expression (DE) and regulatory network analysis to identify phenotype-critical TFs is illustrated, both outside (Chapter 1) and within (Chapter 2) the context of an automated predictive framework, i.e. Mogrify. Chapter 1 describes the application of DE analysis to an idiopathic pulmonary fibrosis (IPF) bulk RNA-seq dataset from mouse lung tissue to determine transcriptomic differences between diseased and healthy tissue and between resistant and susceptible strains. Regulatory enrichment analysis of the results shows that targets of the anti-fibrotic TF FOSL1 are overrepresented in the cross-strain differentially expressed gene (DEG) set, suggesting a possible effect of FOSL1 on IPF risk.
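The over-representation reasoning in Chapter 1 can be illustrated with a one-sided hypergeometric test. This is a minimal sketch on toy data, not necessarily the test used in the thesis; the function name, the gene identifiers and the set sizes are all hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, n_targets, n_degs, n_overlap):
    """Return P(X >= n_overlap) for a hypergeometric draw.

    X is the number of TF targets found when n_degs genes are drawn
    without replacement from a universe containing n_targets targets.
    """
    total = comb(universe_size, n_degs)
    upper = min(n_targets, n_degs)
    tail = sum(
        comb(n_targets, k) * comb(universe_size - n_targets, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy gene sets (hypothetical identifiers, for illustration only).
universe = {f"gene{i}" for i in range(100)}    # all tested genes
tf_targets = {f"gene{i}" for i in range(10)}   # assumed targets of one TF
degs = {f"gene{i}" for i in range(5, 20)}      # assumed DEG set

overlap = tf_targets & degs
p_value = hypergeom_enrichment_p(
    len(universe), len(tf_targets), len(degs), len(overlap)
)
print(len(overlap), p_value)
```

A small p-value here would indicate that the TF's targets appear in the DEG set more often than expected by chance; in practice such tests are run across many TFs and corrected for multiple testing.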
Chapter 2 describes the use of Mogrify to predict TFs for the conversion of fibroblasts and induced pluripotent stem cells (iPSCs) to erythrocytes and megakaryocytes. Despite the developmental similarity of these target cell types, the TF prediction sets were distinct. This indicates potential capacity of the framework to identify regulators responsible for differences between these related targets as well as between source and target cell types.
In Section II, innovation around the Mogrify framework and the handling of its predictions is introduced. Chapter 3 describes the transcriptomic characterisation, using both single-cell and bulk RNA-seq, of cells resulting from the reprogramming protocol designed in Chapter 2 and of cells resulting from a second, literature-based protocol. Crucially, the bulk data were also used as input to a secondary Mogrify implementation, providing the predictions for improvement of the reprogrammed cells that are detailed in this chapter. Additionally, Chapter 3 compares TF predictions from iPSC to blood-derived and synthetic myeloid cells, as well as from iPSC and synthetic myeloid cells toward blood-derived myeloid cells.
Chapter 4 handles the flexible implementation of Mogrify to predict TFs for the conversion of CHOK1 cells to hamster plasma cells. It describes the use of orthologue mapping as an adaptive measure to facilitate the input of CHOK1 and hamster data to Mogrify, which required either human or mouse annotation. Inter-run comparison is again detailed, as the single-cell CHOK1 to hamster plasma cell data is complemented with single-cell and bulk CHOK1 and mouse and human plasma cell data, as well as single-cell hamster B cell data. A co-expression analysis of the results of reprogramming suggests that this protocol may constitute an improvement on the existing method of XBP1 or PRDM1/BLIMP1 over-expression.
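The orthologue mapping step can be sketched as a relabelling of source-species gene identifiers with reference-species symbols before the data are passed to a framework that expects human or mouse annotation. The sketch below keeps only one-to-one orthologues, which is one simple policy and an assumption here rather than the procedure reported in the thesis. The hamster identifiers, the mapping table and the expression values are hypothetical.

```python
from collections import Counter


def map_to_orthologues(expression, orthologue_pairs):
    """Relabel genes with orthologue symbols, keeping one-to-one pairs only.

    expression: dict of source-species gene ID -> expression value.
    orthologue_pairs: list of (source gene ID, reference gene symbol).
    Returns (mapped expression dict, sorted list of unmapped source IDs).
    """
    source_counts = Counter(src for src, _ in orthologue_pairs)
    target_counts = Counter(ref for _, ref in orthologue_pairs)
    one_to_one = {
        src: ref
        for src, ref in orthologue_pairs
        if source_counts[src] == 1 and target_counts[ref] == 1
    }
    mapped = {
        one_to_one[gene]: value
        for gene, value in expression.items()
        if gene in one_to_one
    }
    unmapped = sorted(gene for gene in expression if gene not in one_to_one)
    return mapped, unmapped


# Hypothetical mapping table: two hamster IDs share the symbol Irf4,
# so neither is treated as a one-to-one orthologue.
pairs = [
    ("cg_0001", "Xbp1"),
    ("cg_0002", "Prdm1"),
    ("cg_0003", "Irf4"),
    ("cg_0004", "Irf4"),
]
hamster_expression = {"cg_0001": 12.0, "cg_0002": 3.5, "cg_0003": 7.0, "cg_0005": 1.0}

mapped, unmapped = map_to_orthologues(hamster_expression, pairs)
print(mapped, unmapped)
```

Dropping ambiguous and unmapped genes is conservative; alternatives such as summing expression across many-to-one orthologues retain more genes at the cost of blurring paralogue-specific signal.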
Chapter 5 attempts to address how a predictive framework might be adapted for the design of a truly synthetic cell type: a functionally defined target cell. It proposes the concept of a reprogramming dictionary, comprising functionally annotated and co-regulated gene modules. As proof-of-principle, this is piloted through the construction of co-expressed gene modules from the Tabula Sapiens, annotated on the basis of functional enrichment. The expression of these modules is also calculated and reported for a test lung cancer dataset. Many modules are noted to recombine relative to the reference dataset, including those not functionally enriched. Some cancer cells are noted to recombine normal cellular functions with tumorigenic ones, illustrating in reverse the dictionary's intended purpose of function-defined non-standard cell type design. Further development of the dictionary concept, and its forward application, is left as an open problem.
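Calculating module expression in a test dataset can be illustrated by scoring each cell with the mean expression of each module's genes. This is a minimal sketch under stated assumptions: the scoring rule (a plain mean, with absent genes counted as zero), the module names, the gene names and the values are hypothetical and are not taken from the thesis.

```python
from statistics import mean


def module_scores(cells, modules):
    """Score every cell on every module as the mean expression of its genes.

    cells: dict of cell ID -> dict of gene -> expression value.
    modules: dict of module name -> non-empty list of gene names.
    Genes absent from a cell are counted as zero expression.
    """
    return {
        cell_id: {
            name: mean(expression.get(gene, 0.0) for gene in genes)
            for name, genes in modules.items()
        }
        for cell_id, expression in cells.items()
    }


# Hypothetical modules and cells, for illustration only.
modules = {"module_1": ["G1", "G2"], "module_2": ["G3"]}
cells = {
    "cell_a": {"G1": 2.0, "G2": 4.0, "G3": 0.0},
    "cell_b": {"G1": 0.0, "G3": 6.0},
}

scores = module_scores(cells, modules)
print(scores)
```

Comparing which modules score highly together in a test dataset against their pattern in the reference is one way to detect the kind of module recombination described above.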
