Population- and individual-specific regulatory variation in Sardinia

Change log
Pala, M 
Zappala, Z 
Marongiu, M 
Li, X 
Davis, JR 

Genetic studies of complex traits have mainly identified associations with noncoding variants. To further determine the contribution of regulatory variation, we combined whole-genome and transcriptome data for 624 individuals from Sardinia to identify common and rare variants that influence gene expression and splicing. We identified 21,183 expression quantitative trait loci (eQTLs) and 6,768 splicing quantitative trait loci (sQTLs), including 619 new QTLs. We identified high-frequency QTLs and found evidence of selection near genes involved in malarial resistance and increased multiple sclerosis risk, reflecting the epidemiological history of Sardinia. Using family relationships, we identified 809 segregating expression outliers (median z score of 2.97), averaging 13.3 genes per individual. Outlier genes were enriched for proximal rare variants, providing a new approach to study large-effect regulatory variants and their relevance to traits. Our results provide insight into the effects of regulatory variants and their relationship to population history and individual genetic risk.

gene expression profiling, genome-wide association studies, population genetics, transcriptomics
Journal Title
Nature Genetics
Conference Name
Journal ISSN
Volume Title
Nature Publishing Group
European Commission Horizon 2020 (H2020) Societal Challenges (633964)
M.P. is supported by the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement 633964 (ImmunoAgeing). Z.Z. is supported by the National Science Foundation (NSF) GRFP (DGE- 114747) and by the Stanford Center for Computational, Evolutionary, and Human Genomics (CEHG). Z.Z., J.R.D., and G.T.H. also acknowledge support from the Stanford Genome Training Program (SGTP; NIH/NHGRI T32HG000044). J.R.D. is supported by the Stanford Graduate Fellowship. K.R.K. is supported by Department of Defense, Air Force Office of Scientific Research, National Defense Science and Engineering Graduate (NDSEQ) Fellowship 32 CFR 168a. S.J.S. is supported by the NIHR Cambridge Biomedical Research Centre. The SardiNIA project is supported in part by the intramural program of the National Institute on Aging through contract HHSN271201100005C to the Consiglio Nazionale delle Ricerche of Italy. The RNA sequencing was supported by the PB05 InterOmics MIUR Flagship grant; by the FaReBio2011 “Farmaci e Reti Biotecnologiche di Qualità” grant; and by Sardinian Autonomous Region (L.R. no. 7/2009) grant cRP3-154 to F. Cucca, who is also supported by the Italian Foundation for Multiple Sclerosis (FISM 2015/R/09) and by the Fondazione di Sardegna (ex Fondazione Banco di Sardegna, Prot. U1301.2015/AI.1157.BE Prat. 2015-1651). S.B.M. is supported by the US National Institutes of Health through R01HG008150, R01MH101814, U01HG007436, and U01HG009080. All of the authors would like to thank the CRS4 and the SCGPM for the computational infrastructure supporting this project.