PheneBank: a literature-based database of phenotypes.

Accepted version
Peer-reviewed

Abstract

MOTIVATION: Significant effort has been spent by curators to create coding systems for phenotypes such as the Human Phenotype Ontology, as well as disease-phenotype annotations. We aim to support the discovery of literature-based phenotypes and integrate them into the knowledge discovery process. RESULTS: PheneBank is a Web-portal for retrieving human phenotype-disease associations that have been text-mined from the whole of Medline. Our approach exploits state-of-the-art machine learning for concept identification by utilizing an expert-annotated rare disease corpus from the PMC Text Mining subset. Evaluation of the system for entities is conducted on a gold-standard corpus of rare disease sentences and for associations against the Monarch Initiative data. AVAILABILITY AND IMPLEMENTATION: The PheneBank Web-portal is freely available at http://www.phenebank.org. Annotated Medline data is available from Zenodo at DOI: 10.5281/zenodo.1408800. Semantic annotation software is freely available for non-commercial use at GitHub: https://github.com/pilehvar/phenebank. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Journal Title

Bioinformatics

Journal ISSN

1367-4803
1367-4811

Publisher

Oxford University Press (OUP)

Rights and licensing

Except where otherwise noted, this item's license is described as All rights reserved
Sponsorship
Engineering and Physical Sciences Research Council (EP/M005089/1)
Medical Research Council (MR/M025160/1)