
Aggregation tests identify new gene associations with breast cancer in populations with diverse ancestry.

Accepted version





Mueller, Stefanie H 
Lai, Alvina G 
Valkovskaya, Maria 
Michailidou, Kyriaki 
Bolla, Manjeet K 


BACKGROUND: Low-frequency variants play an important role in breast cancer (BC) susceptibility. Gene-based methods can increase power by combining multiple variants in the same gene and can help identify target genes.

METHODS: We evaluated the potential of gene-based aggregation in the Breast Cancer Association Consortium cohorts, which include 83,471 cases and 59,199 controls. Low-frequency variants were aggregated across the coding and regulatory regions of individual genes. Association results in European ancestry samples were compared to single-marker association results in the same cohort. Gene-based associations were also combined in a meta-analysis across individuals with European, Asian, African, and Latin American and Hispanic ancestry.

RESULTS: In European ancestry samples, 14 genes were significantly associated (q < 0.05) with BC. Of those, two genes, FMNL3 (P = 6.11 × 10⁻⁶) and AC058822.1 (P = 1.47 × 10⁻⁴), represent new associations. High FMNL3 expression has previously been linked to poor prognosis in several other cancers. Meta-analysis of samples with diverse ancestry discovered further associations, including the established candidate genes ESR1 and CBLB. In addition, literature review and database queries supported a biologically plausible link with cancer for the genes CBLB, FMNL3, FGFR2, LSP1, MAP3K1, and SRGAP2C.

CONCLUSIONS: Using extended gene-based aggregation tests that include coding and regulatory variation, we report the identification of plausible target genes for previously identified single-marker associations with BC, as well as the discovery of novel genes implicated in BC development. Including multi-ancestry cohorts in this study enabled the identification of disease associations that would otherwise have been missed, such as ESR1 (P = 1.31 × 10⁻⁵), demonstrating the importance of diversifying study cohorts.
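The q < 0.05 threshold in the results is a significance level adjusted for the false discovery rate across genes. The abstract does not state which adjustment procedure was used, so the following is only a minimal sketch, assuming a Benjamini-Hochberg adjustment of per-gene P values. The function name `bh_qvalues`, the gene labels, and the P values are hypothetical and for illustration only; they are not taken from the study.

```python
def bh_qvalues(pvalues):
    """Return Benjamini-Hochberg adjusted values, in the same order as the input.

    Assumption: Benjamini-Hochberg is used here for illustration; the study's
    actual q-value procedure is not given in the abstract.
    """
    m = len(pvalues)
    if m == 0:
        return []
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values never increase
    # as the rank decreases (the step-up monotonicity constraint).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


# Hypothetical per-gene P values, for illustration only (not study results).
example_p = {"gene_a": 0.001, "gene_b": 0.02, "gene_c": 0.03, "gene_d": 0.5}
example_q = dict(zip(example_p, bh_qvalues(list(example_p.values()))))

# Genes passing the q < 0.05 threshold in this toy example.
significant = [gene for gene, q in example_q.items() if q < 0.05]
```

In this toy input, a gene with a nominal P value of 0.03 still passes the threshold because the adjustment depends on its rank among all tested genes, not on the P value alone.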



Journal Title

Genome Med


BioMed Central
European Commission Horizon 2020 (H2020) Societal Challenges (633784)
European Commission Horizon 2020 (H2020) Societal Challenges (634935)
National Cancer Institute (U19CA148537)
National Cancer Institute (R01CA128978)
National Cancer Institute (U19CA148065)
Wellcome Trust (203477/Z/16/Z)
Cancer Research UK (A10710)
Cancer Research UK (A12014)
Cancer Research UK (A10118)
National Cancer Institute (P30CA023100)
Medical Research Council (G1000143)
Cancer Research UK (A16563)