Bio-SimVerb
Repository URI
Repository DOI
Change log
Description
Evaluation Dataset: The words sampled for Bio-SimVerb (verbs) and Bio-SimLex (nouns) are drawn from a pre-processed PubMed Central Open Access subset (PMC). POS tags and tokens in this resource were generated with the BLLIP constituency parser, trained on a biomedical corpus. The resource covers over 1.4M full articles with more than 388M parsed sentences.
For details, see Section 3 of the paper: Bio-SimVerb and Bio-SimLex: wide-coverage evaluation sets of word similarity in biomedicine
Version
Software / Usage instructions
For usage guidelines, see: https://github.com/cambridgeltl/bio-simverb
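As a rough illustration of how a word-similarity evaluation set of this kind is commonly used, the sketch below scores a set of word embeddings by the Spearman rank correlation between gold similarity ratings and the model's cosine similarities. It is not the authors' evaluation script. The input shapes are assumptions made for this example: pairs are taken to be (word1, word2, gold_score) tuples and embeddings a plain dict from word to vector. The function names (average_ranks, spearman, evaluate) are hypothetical, and the actual file format should be checked in the GitHub repository.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def cosine(u, v):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def evaluate(pairs, vectors):
    """Score embeddings against gold ratings.

    pairs   -- iterable of (word1, word2, gold_score); assumed layout
    vectors -- dict mapping word -> list of floats; assumed layout
    Pairs with an out-of-vocabulary word are skipped.
    Returns (spearman_rho, number_of_pairs_scored).
    """
    gold, predicted = [], []
    for word1, word2, score in pairs:
        if word1 in vectors and word2 in vectors:
            gold.append(score)
            predicted.append(cosine(vectors[word1], vectors[word2]))
    return spearman(gold, predicted), len(gold)
```

Reporting the number of scored pairs alongside the correlation makes it visible when a model's vocabulary covers only part of the evaluation set.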
Keywords
Publisher
Rights and licensing
Except where otherwise noted, this item's license is described as Attribution 4.0 International (CC BY 4.0)
Sponsorship
European Research Council (648909)
Medical Research Council (MR/M013049/1)