Use of a Novel Nonparametric Version of DEPTH to Identify Genomic Regions Associated with Prostate Cancer Risk.

Accepted version
Peer-reviewed

Type

Article

Authors

MacInnis, Robert J 
Schmidt, Daniel F 
Makalic, Enes 
Severi, Gianluca 
FitzGerald, Liesel M 

Abstract

BACKGROUND: We have developed a genome-wide association study analysis method called DEPTH (DEPendency of association on the number of Top Hits) to identify genomic regions potentially associated with disease by considering overlapping groups of contiguous markers (e.g., SNPs) across the genome. DEPTH is a machine learning algorithm for feature ranking of ultra-high dimensional datasets, built from well-established statistical tools such as bootstrapping, penalized regression, and decision trees. Unlike marginal regression, which considers each SNP individually, the key idea behind DEPTH is to rank groups of SNPs in terms of their joint strength of association with the outcome. Our aim was to compare the performance of DEPTH with that of standard logistic regression analysis.

METHODS: We selected 1,854 prostate cancer cases and 1,894 controls from the UK for whom 541,129 SNPs were measured using the Illumina Infinium HumanHap550 array. Confirmation was sought using 4,152 cases and 2,874 controls, ascertained from the UK and Australia, for whom 211,155 SNPs were measured using the iCOGS Illumina Infinium array.

RESULTS: From the DEPTH analysis, we identified 14 regions associated with prostate cancer risk that had been reported previously, five of which would not have been identified by conventional logistic regression. We also identified 112 novel putative susceptibility regions.

CONCLUSIONS: DEPTH can reveal new risk-associated regions that would not have been identified using a conventional logistic regression analysis of individual SNPs.

IMPACT: This study demonstrates that the DEPTH algorithm could identify additional genetic susceptibility regions that merit further investigation.

Cancer Epidemiol Biomarkers Prev; 25(12); 1619-24. ©2016 AACR.
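
The abstract contrasts per-SNP marginal analysis with ranking overlapping groups of contiguous SNPs. The following is a minimal illustrative sketch of that general idea only, and it is not the published DEPTH algorithm: it substitutes a toy marginal statistic (absolute difference in mean allele count between cases and controls) for the penalized regression and decision trees that the abstract mentions. The function names, the window width and step, and the `top_k` and `n_boot` parameters are all hypothetical choices made for illustration.

```python
import random


def overlapping_windows(n_markers, width, step):
    """Return (start, end) index pairs for overlapping windows of contiguous markers."""
    if width >= n_markers:
        return [(0, n_markers)]
    return [(s, s + width) for s in range(0, n_markers - width + 1, step)]


def marginal_scores(genotypes, labels):
    """Toy per-SNP statistic: |mean allele count in cases - mean in controls|.

    This is an assumed stand-in, not the statistic used by DEPTH.
    """
    cases = [g for g, y in zip(genotypes, labels) if y == 1]
    controls = [g for g, y in zip(genotypes, labels) if y == 0]
    n_snps = len(genotypes[0])
    return [
        abs(sum(r[j] for r in cases) / len(cases)
            - sum(r[j] for r in controls) / len(controls))
        for j in range(n_snps)
    ]


def rank_windows(genotypes, labels, width=4, step=2, top_k=3, n_boot=50, seed=0):
    """Rank windows by how often bootstrap top hits fall inside them."""
    rng = random.Random(seed)
    windows = overlapping_windows(len(genotypes[0]), width, step)
    counts = [0] * len(windows)
    n = len(labels)
    for _ in range(n_boot):
        # Resample individuals with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        g = [genotypes[i] for i in idx]
        y = [labels[i] for i in idx]
        if sum(y) in (0, n):
            continue  # skip replicates that contain only one class
        scores = marginal_scores(g, y)
        top = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:top_k]
        # Credit each window with the number of top hits it contains.
        for w, (start, end) in enumerate(windows):
            counts[w] += sum(start <= j < end for j in top)
    order = sorted(range(len(windows)), key=counts.__getitem__, reverse=True)
    return [(windows[w], counts[w]) for w in order]
```

Calling `rank_windows(genotypes, labels)` with a list of per-individual allele-count rows and a matching list of 0/1 case labels returns the windows ordered from most to least frequently hit. Because windows overlap, a signal near a window boundary is still captured by a neighbouring window, which is the motivation for using overlapping rather than disjoint groups.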

Keywords

Australia; Genetic Predisposition to Disease; Genome-Wide Association Study; Humans; Machine Learning; Male; Middle Aged; Polymorphism, Single Nucleotide; Prostatic Neoplasms; United Kingdom

Journal Title

Cancer Epidemiol Biomarkers Prev

Journal ISSN

1055-9965
1538-7755

Volume

25

Publisher

American Association for Cancer Research (AACR)

Sponsorship

National Cancer Institute (U19CA148537)
National Health and Medical Research Council Australia (Grant ID: 1033452, Senior Principal Research Fellowship, Senior Research Fellowship), Cancer Research UK (Grant IDs: C5047/A7357, C1287/A10118, C1287/A5260, C5047/A3354, C5047/A10692, C16913/A6135 and C16913/A6835), Prostate Research Campaign UK (now Prostate Cancer UK), The Institute of Cancer Research and The Everyman Campaign, The National Cancer Research Network UK, The National Cancer Research Institute (NCRI) UK, National Institute for Health Research funding to the NIHR Biomedical Research Centre at The Institute of Cancer Research and The Royal Marsden NHS Foundation Trust, Prostate Cancer Research Program of Cancer Council Victoria from The National Health and Medical Research Council, Australia (Grant IDs: 126402, 209057, 251533, 396414, 450104, 504700, 504702, 504715, 623204, 940394, 614296), VicHealth, Cancer Council Victoria, The Prostate Cancer Foundation of Australia, The Whitten Foundation, PricewaterhouseCoopers, Tattersall’s