PheneBank: a literature-based database of phenotypes.
dc.contributor.author | Pilehvar, Mohammad Taher | |
dc.contributor.author | Bernard, Adam | |
dc.contributor.author | Smedley, Damian | |
dc.contributor.author | Collier, Nigel | |
dc.date.accessioned | 2021-10-25T23:31:14Z | |
dc.date.available | 2021-10-25T23:31:14Z | |
dc.date.issued | 2021-11-12 | |
dc.identifier.issn | 1367-4803 | |
dc.identifier.uri | https://www.repository.cam.ac.uk/handle/1810/329879 | |
dc.description.abstract | MOTIVATION: Significant effort has been spent by curators to create coding systems for phenotypes such as the Human Phenotype Ontology (HPO), as well as disease-phenotype annotations. We aim to support the discovery of literature-based phenotypes and integrate them into the knowledge discovery process. RESULTS: PheneBank is a Web-portal for retrieving human phenotype-disease associations that have been text-mined from the whole of Medline. Our approach exploits state-of-the-art machine learning for concept identification by utilising an expert-annotated rare disease corpus from the PMC Text Mining subset. Evaluation of the system for entities is conducted on a gold-standard corpus of rare disease sentences and for associations against the Monarch initiative data. AVAILABILITY: The PheneBank Web-portal is freely available at http://www.phenebank.org. Annotated Medline data is available from Zenodo at DOI: 10.5281/zenodo.1408800. Semantic annotation software is freely available for non-commercial use at GitHub: https://github.com/pilehvar/phenebank. SUPPLEMENTARY INFORMATION: Supplementary data is available at Bioinformatics online. | |
dc.description.sponsorship | Medical Research Council (grant MR/M025160/1). | |
dc.language | eng | |
dc.publisher | Oxford University Press (OUP) | |
dc.rights | All rights reserved | |
dc.rights.uri | http://www.rioxx.net/licenses/all-rights-reserved | |
dc.title | PheneBank: a literature-based database of phenotypes. | |
dc.type | Article | |
prism.publicationDate | 2021 | |
prism.publicationName | Bioinformatics | |
dc.identifier.doi | 10.17863/CAM.77324 | |
dcterms.dateAccepted | 2021-11-02 | |
rioxxterms.versionofrecord | 10.1093/bioinformatics/btab740 | |
rioxxterms.version | AM | |
rioxxterms.licenseref.uri | http://www.rioxx.net/licenses/all-rights-reserved | |
rioxxterms.licenseref.startdate | 2021-11-12 | |
dc.contributor.orcid | Collier, Nigel [0000-0002-7230-4164] | |
dc.identifier.eissn | 1367-4811 | |
rioxxterms.type | Journal Article/Review | |
pubs.funder-project-id | Engineering and Physical Sciences Research Council (EP/M005089/1) | |
pubs.funder-project-id | Medical Research Council (MR/M025160/1) | |
cam.issuedOnline | 2021-11-12 | |
cam.orpheus.success | 2021-10-25 - Embargo set during processing via Fast-track | |
rioxxterms.freetoread.startdate | 2022-10-27 |
This item appears in the following Collection(s)
- Cambridge University Research Outputs: Research outputs of the University of Cambridge