Correlation-Adjusted Regression Survival Scores for High-Dimensional Variable Selection

Published version
Peer-reviewed

Type

Article

Authors

Welchowski, Thomas 
Schmid, Matthias 

Abstract

Background: The development of classification methods for personalized medicine is highly dependent on the identification of predictive genetic markers. In survival analysis it is often necessary to discriminate between influential and non-influential markers. It is common to perform univariate screening using Cox scores, which quantify the associations between survival and each of the markers to provide a ranking. Since Cox scores do not account for dependencies between the markers, their use is suboptimal in the presence of highly correlated markers.

Methods: As an alternative to the Cox score, we propose the correlation-adjusted regression survival (CARS) score for right-censored survival outcomes. By removing the correlations between the markers, the CARS score quantifies the associations between the outcome and the set of “de-correlated” marker values. Estimation of the scores is based on inverse probability weighting, which is applied to log-transformed event times. For high-dimensional data, estimation is based on shrinkage techniques.

Results: The consistency of the CARS score is proven under mild regularity conditions. In simulations with high correlations, survival models based on CARS score rankings achieved higher areas under the precision-recall curve than competing methods. Two example applications on prostate and breast cancer confirmed these results. CARS scores are implemented in the R package carSurv.

Conclusions: In research applications involving high-dimensional genetic data, the use of CARS scores for marker selection is a favorable alternative to Cox scores even when correlations between covariates are low. Having a straightforward interpretation and low computational requirements, CARS scores are an easy-to-use screening tool in personalized medicine research.
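
The estimation idea outlined in the Methods part of the abstract can be illustrated with a small sketch. It is not the estimator from the article and not the carSurv API; it only mirrors the three ingredients the abstract names. The assumptions are: inverse probability of censoring weights come from a Kaplan-Meier estimate of the censoring distribution, the marginal associations are weighted correlations between the markers and the log event times, the marker correlation matrix is computed from all observations, and shrinkage is a fixed blend with the identity matrix controlled by a hypothetical parameter `shrink`. All function names are made up for this illustration.

```python
import numpy as np


def censoring_survival(time, event):
    """Kaplan-Meier estimate of the censoring survival G(t-) at each observed time.

    Censoring (event == 0) plays the role of the "event" here. Ties between
    events and censorings are handled naively (simplifying assumption).
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    g_minus = np.ones(len(time))
    surv = 1.0
    for t in np.unique(time):  # np.unique returns sorted values
        at_t = time == t
        g_minus[at_t] = surv  # value just before t
        at_risk = np.sum(time >= t)
        censored = np.sum(at_t & (event == 0))
        surv *= 1.0 - censored / at_risk
    return g_minus


def weighted_corr(x, y, w):
    """Correlation of x and y under weights w that sum to one."""
    mx, my = np.sum(w * x), np.sum(w * y)
    cov = np.sum(w * (x - mx) * (y - my))
    return cov / np.sqrt(np.sum(w * (x - mx) ** 2) * np.sum(w * (y - my) ** 2))


def cars_like_scores(X, time, event, shrink=0.1):
    """Illustrative correlation-adjusted scores (assumes at least two markers)."""
    X = np.asarray(X, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)

    # Inverse probability weights: censored observations get weight zero.
    w = event / censoring_survival(time, event)
    w = w / w.sum()

    # Weighted marginal correlation of each marker with the log event time.
    log_t = np.log(time)
    marginal = np.array(
        [weighted_corr(X[:, j], log_t, w) for j in range(X.shape[1])]
    )

    # Marker correlation matrix, shrunk toward the identity (assumed scheme).
    P = np.corrcoef(X, rowvar=False)
    P = (1.0 - shrink) * P + shrink * np.eye(P.shape[0])

    # De-correlate with the inverse matrix square root of P.
    vals, vecs = np.linalg.eigh(P)
    P_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return P_inv_sqrt @ marginal


# Simulated example: only marker 0 drives survival; marker 1 is correlated with it.
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
X[:, 1] = 0.8 * X[:, 0] + 0.6 * rng.normal(size=n)
event_time = np.exp(1.0 * X[:, 0] + rng.normal(scale=0.5, size=n))
censor_time = rng.exponential(scale=5.0, size=n)
time = np.minimum(event_time, censor_time)
event = (event_time <= censor_time).astype(int)

scores = cars_like_scores(X, time, event)
print(np.round(scores, 3))
```

Ranking markers by the absolute value of such scores gives a screening order. Setting `shrink=1.0` replaces the correlation matrix with the identity, so the scores reduce to the weighted marginal correlations, the analogue of screening without any correlation adjustment.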

Keywords

biomarker discovery; breast cancer; multigene signature; personalized medicine; prostate cancer; survival modeling; Aged; Aged, 80 and over; Biomarkers, Tumor; Breast Neoplasms; Computer Simulation; Female; Genetic Markers; Humans; Male; Middle Aged; Proportional Hazards Models; Prostatic Neoplasms; Regression Analysis; Survival Analysis; Watchful Waiting

Journal Title

Statistics in Medicine

Journal ISSN

1097-0258

Volume Title

38

Publisher

John Wiley & Sons Inc.

Sponsorship

Wellcome Trust (204623/Z/16/Z)
Medical Research Council (MC_UU_00002/7)
This research was supported by the Deutsche Forschungsgemeinschaft (Project SCHM 2966/1-2), Wellcome Trust and the Royal Society (Grant Number 204623/Z/16/Z) and the UK Medical Research Council (Grant Number MC_UU_00002/7).