Bio-SimVerb
datacite.contributor.supervisor | Korhonen, Anna | |
datacite.issupplementto.doi | 10.1186/s12859-018-2039-z | |
datacite.issupplementto.url | https://www.repository.cam.ac.uk/handle/1810/276650 | |
dc.contributor.author | Chiu, HW | |
dc.contributor.author | Pyysalo, Sampo | |
dc.contributor.author | Vulic, Ivan | |
dc.contributor.author | Korhonen, Anna | |
dc.date.accessioned | 2018-06-05T14:33:44Z | |
dc.date.available | 2018-06-05T14:33:44Z | |
dc.description | Evaluation dataset: the samples (words) in Bio-SimVerb (verbs) and Bio-SimLex (nouns) were collected from a pre-processed PubMed Central Open Access subset (PMC). POS tags and tokens in this resource were generated using the BLLIP constituency parser, trained on a biomedical corpus. The resource covers over 1.4M full articles with more than 388M parsed sentences. For details, see Section 3 of the paper "Bio-SimVerb and Bio-SimLex: wide-coverage evaluation sets of word similarity in biomedicine". | |
dc.format | For guidelines, refer to: https://github.com/cambridgeltl/bio-simverb | |
dc.identifier.doi | 10.17863/CAM.18370 | |
dc.identifier.uri | https://www.repository.cam.ac.uk/handle/1810/276631 | |
dc.rights | Attribution 4.0 International (CC BY 4.0) | |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
dc.subject | Bio-SimLex | |
dc.title | Bio-SimVerb | |
dc.type | Dataset | |
dcterms.format | zip | |
pubs.funder-project-id | European Research Council (648909) | |
pubs.funder-project-id | Medical Research Council (MR/M013049/1) | |
rioxxterms.licenseref.uri | https://creativecommons.org/licenses/by/4.0/ | |
rioxxterms.type | Other |