Accounting for the Multiple Natures of Missing Values in Label-Free Quantitative Proteomics Data Sets to Compare Imputation Strategies.

Published version
Peer-reviewed

Repository DOI


Abstract

Missing values are a genuine issue in label-free quantitative proteomics. Recent works have surveyed the different statistical methods to conduct imputation, compared them on real or simulated data sets, and recommended a list of missing value imputation methods for proteomics application. Although insightful, these comparisons do not account for two important facts: (i) depending on the proteomics data set, the missingness mechanism may be of different natures and (ii) each imputation method is devoted to a specific type of missingness mechanism. As a result, we believe that the question at stake is not to find the most accurate imputation method in general but instead the most appropriate one. We describe a series of comparisons that support our views: For instance, we show that a supposedly "under-performing" method (i.e., giving baseline average results), if applied at the "appropriate" time in the data-processing pipeline (before or after peptide aggregation) on a data set with the "appropriate" nature of missing values, can outperform a blindly applied, supposedly "better-performing" method (i.e., the reference method from the state-of-the-art). This leads us to formulate a few practical guidelines regarding the choice and the application of an imputation method in a proteomics context.

Description

Journal Title

J Proteome Res

Conference Name

Journal ISSN

1535-3893
1535-3907

Volume Title

15

Publisher

American Chemical Society (ACS)

Rights and licensing

Except where otherwise noted, this item's license is described as Attribution 4.0 International

Sponsorship

European Commission (262067)
Biotechnology and Biological Sciences Research Council (BB/L002817/1)
This work was supported by the following funding: ANR-2010-GENOM-BTV-002-01 (Chloro-Types), ANR-10-INBS-08 (ProFI project, “Infrastructures Nationales en Biologie et Santé”, “Investissements d’Avenir”), EU FP7 program (Prime-XS project, Contract no. 262067), the Prospectom project (Mastodons 2012 CNRS challenge), and the BBSRC Strategic Longer and Larger grant (Award BB/L002817/1).