Transcription Factor Binding Dynamics and Spatial Co-localization In Human Genome


Type

Thesis

Authors

Ma, Xiaoyan 

Abstract

Transcription factor (TF) binding has been studied extensively in relation to binding site affinity and chromosome modifications; however, the relationship between genome spatial organisation and TF binding is not well studied. Using the recently available high-resolution Hi-C contact map of human GM12878 lymphoblastoid cells, we computationally investigated the genome-wide spatial co-localization of TF binding sites, both among sites of the same TF and between sites of different TFs.

First, we observed a strong positive correlation between site occupancy and homotypic TF co-localization based on Hi-C contacts, consistent with our predictions from biophysical simulations of TF target search. This trend is more prominent at binding sites with weak binding sequences and within enhancers, suggesting that genome spatial organisation plays an essential role in determining binding site occupancy, especially for weak regulatory elements.

Furthermore, when investigating spatial co-localization between different TFs, we discovered two distinct co-localization networks of TFs in lymphoblastoid cells, one of which is enriched in lymphocyte-specific pathways and distal enhancer binding. These two TF networks are strongly biased towards either the A1 or the A2 chromosome subcompartment, but are nonetheless preserved within each subcompartment, indicating a potential causal link between cell-type-specific TF binding and chromosome subcompartment segregation. We called 40 pairs of significantly co-localized TFs according to the genome-wide Hi-C contact map. These pairs are enriched in previously reported physical interactions, thus linking the TF spatial network to co-functioning.

In addition to the main project above, I also worked on a side project to find computationally efficient ways of scaling binding site strength across different TFs based on Position Weight Matrices (PWMs). While common bioinformatics tools produce scores that can reflect the binding strength between a specific TF and the DNA, these scores are not directly comparable between different TFs. We provide two approaches for estimating a scaling parameter λ for the PWM score of each TF. The first approach takes a PWM and background genomic sequence as input to estimate λ for a specific TF; we applied it to show that the λ distributions of different TF families correspond with their DNA binding properties. The second approach can reliably convert λ between different PWMs of the same TF, which allows direct comparison of PWMs generated by different methods.
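
As a rough illustration of why raw PWM scores are not directly comparable between TFs, the following minimal sketch builds log-odds PWMs for two hypothetical TFs from toy count matrices and compares their maximum attainable scores. The count matrices, the uniform background, the pseudocount and all function names are assumptions made for this example and are not taken from the thesis. The sketch does not implement either of the λ estimation approaches described above.

```python
import math

BASES = "ACGT"


def log_odds_pwm(counts, background=None, pseudocount=1.0):
    """Turn per-position base counts into a log-odds PWM (log2 units).

    A uniform background is assumed unless one is supplied.
    """
    if background is None:
        background = {base: 0.25 for base in BASES}
    pwm = []
    for column in counts:
        total = sum(column[base] + pseudocount for base in BASES)
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background[base])
            for base in BASES
        })
    return pwm


def pwm_score(pwm, site):
    """Sum the log-odds contributions of each base in a candidate site."""
    if len(site) != len(pwm):
        raise ValueError("site length must match PWM length")
    return sum(column[base] for column, base in zip(pwm, site))


def max_score(pwm):
    """Highest score any sequence can reach under this PWM."""
    return sum(max(column.values()) for column in pwm)


# Toy counts for a hypothetical 4 bp motif (consensus ACGT).
toy_counts_short = [
    {"A": 17, "C": 1, "G": 1, "T": 1},
    {"A": 1, "C": 17, "G": 1, "T": 1},
    {"A": 1, "C": 1, "G": 17, "T": 1},
    {"A": 1, "C": 1, "G": 1, "T": 17},
]
# A second hypothetical TF with an 8 bp motif (consensus ACGTACGT).
toy_counts_long = toy_counts_short * 2

pwm_short = log_odds_pwm(toy_counts_short)
pwm_long = log_odds_pwm(toy_counts_long)

# The same raw score means different things for the two TFs: the best
# possible site of the short motif scores only half as much as the best
# possible site of the long motif.
print(max_score(pwm_short), max_score(pwm_long))
print(pwm_score(pwm_short, "ACGT"), pwm_score(pwm_long, "ACGTACGT"))
```

Because the attainable score range depends on motif length and information content, a raw score that marks the strongest possible site for one TF may correspond to a mediocre site for another, which is the motivation for a per-TF scaling parameter such as λ.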

Advisors

Adryan, Boris

Keywords

transcription factor, Hi-C contact map, chromatin organisation, Position Weight Matrix

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge
Sponsorship
CSC scholarship from the China Scholarship Council