Update on the predictability of tall stature from DNA markers in Europeans.
Published version
Abstract
Predicting adult height from DNA has important implications in forensic DNA phenotyping. In 2014, we introduced a prediction model consisting of 180 height-associated SNPs based on data from 10,361 Northwestern Europeans enriched with tall individuals (770 individuals > 1.88 standard deviations), which yielded moderate accuracy (AUC = 0.75 for binary prediction of tall stature and R² = 0.12 for quantitative prediction of adult height). Here, we provide an update on DNA-based height predictability, considering an enlarged list of subsequently published height-associated SNPs and using data from the same set of 10,361 Europeans. A prediction model based on the full set of 689 SNPs showed improved accuracy relative to previous models for both tall stature (AUC = 0.79) and quantitative height (R² = 0.21). A feature selection analysis identified a subset of the 412 most informative SNPs, and the corresponding prediction model retained most of the accuracy (AUC = 0.76 and R² = 0.19) achieved with the full model. Overall, our study empirically exemplifies that the accuracy of predicting human appearance phenotypes with very complex underlying genetic architectures, such as adult height, can be improved by increasing the number of phenotype-associated DNA variants. Our work also demonstrates that careful sub-selection allows for a considerable reduction in the number of DNA predictors while achieving prediction accuracy similar to that of the full set. This is forensically relevant because of restrictions on the number of SNPs that can be analyzed simultaneously with forensically suitable DNA technologies in the current era of targeted massively parallel sequencing in forensic genetics.
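The two accuracy measures quoted in the abstract can be illustrated with a minimal sketch. It assumes a simple weighted allele sum as the predictor, which is not necessarily the model used in the study, and it runs on simulated genotypes only. The effect sizes, sample size, noise level, and 10% "tall" cutoff are all hypothetical choices made for illustration, so the resulting numbers say nothing about the published results.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve: the probability that a randomly chosen
    positive (label 1) scores higher than a randomly chosen negative
    (label 0), with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def r_squared(predicted, observed):
    """Squared Pearson correlation between predicted and observed values."""
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov * cov / (var_p * var_o)


# --- Simulated data (all values below are hypothetical) ---
random.seed(0)
N_PEOPLE, N_SNPS = 2000, 50

# Hypothetical per-SNP effect sizes.
effects = [random.gauss(0.0, 1.0) for _ in range(N_SNPS)]

# Genotypes coded as 0, 1 or 2 copies of the effect allele (frequency 0.5).
genotypes = [
    [(random.random() < 0.5) + (random.random() < 0.5) for _ in range(N_SNPS)]
    for _ in range(N_PEOPLE)
]

# Predictor: weighted allele sum per person.
scores = [sum(b * g for b, g in zip(effects, row)) for row in genotypes]

# Simulated "height": genetic score plus non-genetic noise.
heights = [s + random.gauss(0.0, 10.0) for s in scores]

# Binary phenotype: the tallest 10% are labelled "tall" (hypothetical cutoff).
cutoff = sorted(heights)[int(0.9 * N_PEOPLE)]
tall = [1 if h >= cutoff else 0 for h in heights]

sim_auc = auc(scores, tall)          # binary prediction of tall stature
sim_r2 = r_squared(scores, heights)  # quantitative prediction of height
print(f"AUC = {sim_auc:.2f}, R2 = {sim_r2:.2f}")
```

In this sketch AUC evaluates how well the score ranks tall individuals above the rest, while R² measures the share of variance in height that the score explains. That distinction is why the abstract reports the two figures separately.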
Journal ISSN
1878-0326