Deep Learning Enables Fast and Accurate Imputation of Gene Expression.

Accepted version
Peer-reviewed

Type

Article

Authors

Viñas, Ramon 
Gamazon, Eric R 
Liò, Pietro 

Abstract

A question of fundamental biological significance is to what extent the expression of a subset of genes can be used to recover the full transcriptome, with important implications for biological discovery and clinical application. To address this challenge, we propose two novel deep learning methods, PMI and GAIN-GTEx, for gene expression imputation. To increase the applicability of our approach, we leverage data from GTEx v8, a reference resource that has generated a comprehensive collection of transcriptomes from a diverse set of human tissues. We show that our approaches compare favorably to several standard and state-of-the-art imputation methods in terms of predictive performance and runtime in two case studies and two imputation scenarios. In a comparison conducted on protein-coding genes, PMI attains the highest performance in inductive imputation, whereas GAIN-GTEx outperforms the other methods in in-place imputation. Furthermore, our results indicate strong generalization on RNA-Seq data from three cancer types across varying levels of missingness. Our work can facilitate a cost-effective integration of large-scale RNA biorepositories into genomic studies of disease, with high applicability across diverse tissue types.

Keywords

GTEx, RNA-seq, deep learning, gene expression, generative adversarial networks, imputation, machine learning, transcriptomics

Journal Title

Front Genet

Journal ISSN

1664-8021

Volume Title

12

Publisher

Frontiers Media SA

Rights

All rights reserved

Sponsorship

Medical Research Council (MC_PC_17213)
"La Caixa" foundation
National Institutes of Health
Armstrong Trust Fund
MICA: Mental Health Data Pathfinder
NHS Foundation Trust
Microsoft