Fine mapping chromatin contacts in capture Hi-C data.

Accepted version
Peer-reviewed

Authors

Eijsbouts, Christiaan Q 
Burren, Oliver S 
Newcombe, Paul J 

Abstract

BACKGROUND: Hi-C and capture Hi-C (CHi-C) are used to map physical contacts between chromatin regions in cell nuclei using high-throughput sequencing. Analysis typically proceeds by considering the evidence for contacts between each possible pair of fragments independently of other pairs. This can produce long runs of fragments which all appear to make contact with the same baited fragment of interest.

RESULTS: We hypothesised that these long runs could result from a smaller subset of direct contacts and propose a new method, based on a Bayesian sparse variable selection approach, which attempts to fine map these direct contacts. Our model is conceptually novel, exploiting the spatial pattern of counts in CHi-C data. Although we use only the CHi-C count data in fitting the model, we show that the fragments prioritised display biological properties that would be expected of true contacts: for bait fragments corresponding to gene promoters, we identify contact fragments with active chromatin and contacts that correspond to edges found in previously defined enhancer-target networks; conversely, for intergenic bait fragments, we identify contact fragments corresponding to promoters for genes expressed in that cell type. We show that long runs of apparently co-contacting fragments can typically be explained using a subset of direct contacts consisting of <10% of the number in the full run, suggesting that greater resolution can be extracted from existing datasets.

CONCLUSIONS: Our results appear largely complementary to those from a per-fragment analytical approach, suggesting that they provide an additional level of interpretation that may be used to increase resolution for mapping direct contacts in CHi-C experiments.

Description

Keywords

Bayesian statistics; Capture Hi-C; Chromatin conformation; Variable selection; CD4-Positive T-Lymphocytes; Chromatin; High-Throughput Nucleotide Sequencing; Macrophages; Models, Statistical; Promoter Regions, Genetic; Sequence Analysis, DNA

Journal Title

BMC Genomics

Conference Name

Journal ISSN

1471-2164

Volume Title

20

Publisher

Springer Science and Business Media LLC

Sponsorship

Wellcome Trust (107881/Z/15/Z)
Medical Research Council (MC_UU_00002/4)
This work was funded by the MRC (MC_UU_00002/4, MC_UU_00002/9) and the Wellcome Trust (WT107881).