
A Neural Classification Method for Supporting the Creation of BioVerbNet

Published version
Peer-reviewed

Type

Article

Authors

Chiu, HW 
Pyysalo, Sampo 
Stenius, Ulla 

Abstract

Background: VerbNet, an extensive computational verb lexicon for English, has proved useful for supporting a wide range of Natural Language Processing tasks requiring information about the behaviour and meaning of verbs. Biomedical text processing and mining could benefit from a similar resource. We take the first step towards the development of BioVerbNet: a VerbNet specifically aimed at describing verbs in the area of biomedicine. Because VerbNet-style classification is extremely time-consuming, we start from a small manual classification of biomedical verbs and apply a state-of-the-art neural representation model, specifically developed for class-based optimization, to expand the classification with new verbs, using all the PubMed abstracts and the full articles in the PubMed Central Open Access subset as data.

Results: Direct evaluation of the resulting classification against BioSimVerb (verb similarity judgement data in biomedicine) shows promising results when representation learning is performed using verb class-based contexts. Human validation by linguists and biologists reveals that the automatically expanded classification is highly accurate. Because the expanded classification includes novel, valid member verbs and classes, our method can be used to facilitate cost-effective development of BioVerbNet.

Conclusion: This work constitutes the first effort to apply a state-of-the-art architecture for neural representation learning to biomedical verb classification. While we discuss future optimization of the method, our promising results suggest that the automatic classification released with this article can be used to readily support application tasks in biomedicine.

Keywords

verb lexicon, representation learning

Journal Title

Journal of Biomedical Semantics

Journal ISSN

2041-1480

Volume Title

10

Publisher

BioMed Central

Sponsorship

Medical Research Council (MR/M013049/1)
European Research Council (648909)
ESRC (1804172)
ESRC (ES/J500033/1)
This work is supported by the Medical Research Council [grant number MR/M013049/1], the ERC Consolidator Grant LEXICAL [grant number 648909], the ESRC Doctoral Fellowship [grant number ES/J500033/1] and the Defense Advanced Research Projects Agency [DARPA 15-18-CwC-FP-032].