Accuracy of Gene Scores when Pruning Markers by Linkage Disequilibrium.

Published version
Peer-reviewed

Type

Article

Authors

Dudbridge, Frank 
Newcombe, Paul J 

Abstract

OBJECTIVE: Gene scores are often used to model the combined effects of genetic variants. When variants are in linkage disequilibrium, it is common to prune all variants except the most strongly associated. This avoids duplicating information but discards information when variants have independent effects. However, joint modelling of correlated variants increases the sampling error in the gene score. In recent applications, joint modelling has offered only small improvements in accuracy over pruning. We aimed to quantify the relationship between pruning and joint modelling in relation to sample size.
METHODS: We derived the coefficient of determination R² for a gene score constructed from pruned markers, and for one constructed from correlated markers with jointly estimated effects.
RESULTS: Pruned scores tend to have slightly lower R² than jointly modelled scores, but the differences are small at sample sizes up to 100,000. If the proportion of correlated variants is high, joint modelling can obtain modest improvements asymptotically.
CONCLUSIONS: The small gains observed to date from joint modelling can be explained by sample size. As studies become larger, joint modelling will be useful for traits affected by many correlated variants, but the improvements may remain small. Pruning remains a useful heuristic for current studies.

Keywords

Genetic Markers; Genetic Variation; Humans; Linkage Disequilibrium; Models, Genetic; Quantitative Trait, Heritable; Sample Size

Journal Title

Hum Hered

Journal ISSN

0001-5652
1423-0062

Volume

80

Publisher

S. Karger AG