Show simple item record

dc.contributor.author | Faggionato, C | en
dc.contributor.author | Meelen, Marieke | en
dc.date.accessioned | 2020-03-03T00:30:51Z
dc.date.available | 2020-03-03T00:30:51Z
dc.date.issued | 2019-01-01 | en
dc.identifier.isbn | 9789544520557 | en
dc.identifier.issn | 1313-8502
dc.identifier.uri | https://www.repository.cam.ac.uk/handle/1810/302933
dc.description.abstract | This paper presents a full procedure for the development of a segmented, POS-tagged and chunk-parsed corpus of Old Tibetan. As an extremely low-resource language, Old Tibetan poses non-trivial problems in every step towards the development of a searchable treebank. We demonstrate, however, that a carefully developed, semi-supervised method of optimising and extending existing tools for Classical Tibetan, as well as creating specific ones for Old Tibetan, can address these issues. We thus also present the very first Tibetan Treebank in a variety of formats to facilitate research in the fields of NLP, historical linguistics and Tibetan Studies.
dc.rights | All rights reserved
dc.title | Developing the old Tibetan treebank | en
dc.type | Conference Object
prism.endingPage | 312
prism.publicationDate | 2019 | en
prism.publicationName | International Conference Recent Advances in Natural Language Processing, RANLP | en
prism.startingPage | 304
prism.volume | 2019-September | en
dc.identifier.doi | 10.17863/CAM.50008
rioxxterms.versionofrecord | 10.26615/978-954-452-056-4_035 | en
rioxxterms.version | AM
rioxxterms.licenseref.uri | http://www.rioxx.net/licenses/all-rights-reserved | en
rioxxterms.licenseref.startdate | 2019-01-01 | en
dc.contributor.orcid | Meelen, Marieke [0000-0003-0395-8372]
rioxxterms.type | Conference Paper/Proceeding/Abstract | en
rioxxterms.freetoread.startdate | 2020-01-01

