Bio-SimVerb


Description

Evaluation Dataset: The words in Bio-SimVerb (verbs) and Bio-SimLex (nouns) were collected from a pre-processed subset of PubMed Central Open Access (PMC). The tokens and POS tags in this resource were generated with the BLLIP constituency parser, trained on a biomedical corpus. The resource covers over 1.4M full articles with more than 388M parsed sentences.

For details, see Section 3 of the paper: Bio-SimVerb and Bio-SimLex: wide-coverage evaluation sets of word similarity in biomedicine

Version

Software / Usage instructions

For usage guidelines, see: https://github.com/cambridgeltl/bio-simverb
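
Word-similarity evaluation sets of this kind are commonly used by correlating a model's similarity scores with the human ratings. The sketch below illustrates that general idea only; it is not taken from the repository above. It assumes each entry can be read as a (word1, word2, rating) triple and that word vectors are available as a plain dictionary. The function names, the toy vectors, and the toy ratings are all hypothetical and are not drawn from Bio-SimVerb or Bio-SimLex.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def evaluate(pairs, vectors):
    """Correlate model similarities with gold ratings.

    Pairs containing a word missing from `vectors` are skipped.
    Returns (spearman_rho, number_of_pairs_used).
    """
    gold, predicted = [], []
    for word1, word2, rating in pairs:
        if word1 in vectors and word2 in vectors:
            gold.append(rating)
            predicted.append(cosine(vectors[word1], vectors[word2]))
    return spearman(gold, predicted), len(gold)


# Hypothetical toy data: made-up ratings and 2-d vectors, for illustration only.
toy_pairs = [
    ("inhibit", "suppress", 9.0),
    ("bind", "attach", 8.0),
    ("inhibit", "bind", 1.0),
    ("suppress", "attach", 2.0),
    ("phosphorylate", "bind", 3.0),  # skipped: no vector for "phosphorylate"
]
toy_vectors = {
    "inhibit": [1.0, 0.0],
    "suppress": [0.9, 0.1],
    "bind": [0.0, 1.0],
    "attach": [0.2, 0.9],
}

rho, used = evaluate(toy_pairs, toy_vectors)
print(f"Spearman rho = {rho:.3f} over {used} pairs")
```

In practice the pairs would be read from the released files and the vectors from a trained embedding model; the actual file layout should be checked against the guidelines linked above.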

Keywords

Publisher

Rights and licensing

Except where otherwise noted, this item's license is described as Attribution 4.0 International (CC BY 4.0)

Sponsorship
European Research Council (648909)
Medical Research Council (MR/M013049/1)