
Bio-SimVerb

datacite.contributor.supervisor: Korhonen, Anna
datacite.issupplementto.doi: 10.1186/s12859-018-2039-z
datacite.issupplementto.url: https://www.repository.cam.ac.uk/handle/1810/276650
dc.contributor.author: Chiu, HW
dc.contributor.author: Pyysalo, Sampo
dc.contributor.author: Vulic, Ivan
dc.contributor.author: Korhonen, Anna
dc.date.accessioned: 2018-06-05T14:33:44Z
dc.date.available: 2018-06-05T14:33:44Z
dc.description: Evaluation dataset: the words in Bio-SimVerb (verbs) and Bio-SimLex (nouns) are collected from a pre-processed PubMed Central Open Access subset (PMC). POS tags and tokens in this resource are generated using the BLLIP constituency parser, trained on a biomedical corpus. The resource covers over 1.4M full articles with more than 388M parsed sentences. For details, see Section 3 of the paper "Bio-SimVerb and Bio-SimLex: wide-coverage evaluation sets of word similarity in biomedicine".
dc.format: For format guidelines, refer to https://github.com/cambridgeltl/bio-simverb
dc.identifier.doi: 10.17863/CAM.18370
dc.identifier.uri: https://www.repository.cam.ac.uk/handle/1810/276631
dc.rights: Attribution 4.0 International (CC BY 4.0)
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Bio-SimLex
dc.title: Bio-SimVerb
dc.type: Dataset
dcterms.format: zip
pubs.funder-project-id: European Research Council (648909)
pubs.funder-project-id: Medical Research Council (MR/M013049/1)
rioxxterms.licenseref.uri: https://creativecommons.org/licenses/by/4.0/
rioxxterms.type: Other

Files

Original bundle
Name: datasets.zip
Size: 62.96 KB
Format: ZIP file
Licence: https://creativecommons.org/licenses/by/4.0/
License bundle
Name: DepositLicenceAgreement.pdf
Size: 417.78 KB
Format: Adobe Portable Document Format