How much do model organism phenotypes contribute to the computational identification of human disease genes?

Accepted version
Peer-reviewed

Type

Article

Authors

Alghamdi, Sarah 
Hoehndorf, Robert 

Abstract

Computing phenotypic similarity has been shown to be useful for identifying new disease genes and for supporting rare disease diagnosis. Genotype–phenotype data from orthologous genes in model organisms can compensate for the lack of human data and greatly increase genome coverage. Work over the past decade has demonstrated the power of cross-species phenotype comparisons, and several cross-species phenotype ontologies have been developed for this purpose. The relative contribution of different model organisms to the computational identification of disease-associated genes has not yet been fully explored. We use methods based on phenotype ontologies to semantically relate phenotypes resulting from loss-of-function mutations in different model organisms to disease-associated phenotypes in humans. Semantic machine learning methods are used to measure how much different model organisms contribute to the identification of known human gene–disease associations. We find that mouse genotype–phenotype data is the most important dataset in the identification of human disease genes by semantic similarity and machine learning over phenotype ontologies. Data from other model organisms does not improve identification over that obtained by using the mouse data alone, and therefore does not contribute significantly to this task. Our work has implications for the future development of integrated phenotype ontologies, as well as for the use of model organism phenotypes in human genetic variant interpretation.

Journal Title

Disease Models and Mechanisms

Journal ISSN

1754-8403
1754-8411

Publisher

Company of Biologists

Sponsorship

King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No. URF/1/3790-01-01, URF/1/4355-01-01, and FCC/1/1976-34-01.