LION LBD: a literature-based discovery system for cancer biology.
Authors
Pyysalo, Sampo
Ali, Imran
Haselwimmer, Stefan
Shah, Tejas
Guo, Yufan
Högberg, Johan
Stenius, Ulla
Korhonen, Anna
Publication Date
2019-05-01
Journal Title
Bioinformatics
ISSN
1367-4803
Publisher
Oxford University Press (OUP)
Volume
35
Issue
9
Pages
1553-1561
Language
eng
Type
Article
Physical Medium
Print
Citation
Pyysalo, S., Baker, S., Ali, I., Haselwimmer, S., Shah, T., Young, A., Guo, Y., et al. (2019). LION LBD: a literature-based discovery system for cancer biology. Bioinformatics, 35 (9), 1553-1561. https://doi.org/10.1093/bioinformatics/bty845
Abstract
MOTIVATION: The overwhelming size and rapid growth of the biomedical literature make it impossible for scientists to read all studies related to their work, potentially leading to missed connections and wasted time and resources. Literature-based discovery (LBD) aims to alleviate these issues by identifying implicit links between disjoint parts of the literature. While LBD has been studied in depth since its introduction three decades ago, there has been limited work making use of recent advances in biomedical text processing methods in LBD.
RESULTS: We present LION LBD, a literature-based discovery system that enables researchers to navigate published information and supports hypothesis generation and testing. The system is built with a particular focus on the molecular biology of cancer using state-of-the-art machine learning and natural language processing methods, including named entity recognition and grounding to domain ontologies covering a wide range of entity types and a novel approach to detecting references to the hallmarks of cancer in text. LION LBD implements a broad selection of co-occurrence based metrics for analyzing the strength of entity associations, and its design allows real-time search to discover indirect associations between entities in a database of tens of millions of publications while preserving the ability of users to explore each mention in its original context in the literature. Evaluations of the system demonstrate its ability to identify undiscovered links and rank relevant concepts highly among potential connections.
AVAILABILITY AND IMPLEMENTATION: The LION LBD system is available via a web-based user interface and a programmable API, and all components of the system are made available under open licenses from the project home page http://lbd.lionproject.net.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Keywords
Algorithms, Databases, Factual, Humans, Natural Language Processing, Neoplasms, Publications
Sponsorship
Cancer Research UK Cambridge Institute Core Grant (C14303/A17197)
Funder references
Medical Research Council (MR/M013049/1)
Cancer Research UK (C14303/A17197)
Medical Research Council (MR/R010013/1)
Identifiers
External DOI: https://doi.org/10.1093/bioinformatics/bty845
This record's URL: https://www.repository.cam.ac.uk/handle/1810/285662