Show simple item record

dc.contributor.author: Crichton, Gamal Kashaka Omari
dc.date.accessioned: 2019-06-24T08:16:29Z
dc.date.available: 2019-06-24T08:16:29Z
dc.date.issued: 2019-07-20
dc.date.submitted: 2019-02-18
dc.identifier.uri: https://www.repository.cam.ac.uk/handle/1810/293886
dc.description.abstract: Literature-based Discovery (LBD) uses information from explicit statements in literature to generate new or unstated knowledge. Automated LBD can thus facilitate hypothesis testing and generation from large collections of publications to support and accelerate scientific research, which is adversely affected by publication explosion and knowledge fragmentation. Existing methods, however, use methodologies which are inadequate for capturing the complex information available in scientific literature and are prone to proposing spurious discoveries or an abundance of low-quality ones. To be capable of solving these problems, automated LBD needs to accurately glean the extensive information present in literature, cope with the dynamic nature of scientific knowledge and place high-quality proposals at the top of ranked outputs. Recent advances in Natural Language Processing (NLP) allow for deep textual analysis to obtain a wide coverage of information present in text and can adapt easily to recognising new biomedical entities and terms. Similarly, recent advances in graph processing have made it possible to do in-depth analysis on information represented as graphs, such as published biomedical connections, to facilitate high-quality knowledge discovery. Both of these advances utilise neural networks extensively. This work used neural networks in a bid to advance automated LBD in three ways: 1) improving biomedical Named Entity Recognition (NER) to extract entities from unstructured text by using multi-task learning across multiple biomedical datasets; 2) improving knowledge discovery from realistic, random- and time-sliced biomedical graphs using link prediction; and 3) improving the ranking of published discoveries on open and closed LBD instances by scoring the strength of connection paths using neural models.
Excitingly, the latter approaches outperformed those used by the state-of-the-art LION LBD system, indicating that their integration into it would provide better support to cancer researchers using it. The results from this work show that it is feasible to use neural networks to improve LBD in different ways. They also demonstrate that neural networks are versatile enough to be applied to improve traditional as well as non-traditional LBD. The principal implication of these findings is that neural biomedical knowledge discovery, especially LBD, is presently useful in addition to being a potentially rich field for further study.
dc.description.sponsorship: Cambridge Commonwealth, European & International Trust
dc.language.iso: en
dc.rights: All rights reserved
dc.subject: Literature-based Discovery
dc.subject: LBD
dc.subject: Neural networks
dc.subject: Named Entity Recognition
dc.subject: NER
dc.subject: Multi-task Learning
dc.subject: LION LBD
dc.subject: knowledge discovery
dc.subject: Natural Language Processing
dc.subject: NLP
dc.subject: Machine Learning
dc.subject: Deep Learning
dc.subject: Biomedical NLP
dc.subject: Biomedical Knowledge Discovery
dc.subject: Link Prediction
dc.subject: Language Technology Laboratory
dc.title: Improving Automated Literature-based Discovery with Neural Networks: Neural biomedical Named Entity Recognition, Link Prediction and Discovery
dc.type: Thesis
dc.type.qualificationlevel: Doctoral
dc.type.qualificationname: Doctor of Philosophy (PhD)
dc.publisher.institution: University of Cambridge
dc.publisher.department: Theoretical and Applied Linguistics
dc.date.updated: 2019-06-19T11:29:34Z
dc.identifier.doi: 10.17863/CAM.40995
dc.contributor.orcid: Crichton, Gamal Kashaka Omari [0000-0002-3036-0811]
dc.publisher.college: St. Edmund's College
dc.type.qualificationtitle: PhD Computational Linguistics and Knowledge Discovery
cam.supervisor: Korhonen, Anna
cam.thesis.funding: false