dc.contributor.author: Gritta, Milan
dc.contributor.author: Pilehvar, Mohammad
dc.contributor.author: Collier, Nigel
dc.date.accessioned: 2018-09-20T12:02:49Z
dc.date.available: 2018-09-20T12:02:49Z
dc.date.issued: 2018
dc.identifier.isbn: 9781948087322
dc.identifier.uri: https://www.repository.cam.ac.uk/handle/1810/280425
dc.description.abstract: The purpose of text geolocation is to associate geographic information contained in a document with a set (or sets) of coordinates, either implicitly by using linguistic features and/or explicitly by using geographic metadata combined with heuristics. We introduce a geocoder (location mention disambiguator) that achieves state-of-the-art (SOTA) results on three diverse datasets by exploiting the implicit lexical clues. Moreover, we propose a new method for systematic encoding of geographic metadata to generate two distinct views of the same text. To that end, we introduce the Map Vector (MapVec), a sparse representation obtained by plotting prior geographic probabilities, derived from population figures, on a World Map. We then integrate the implicit (language) and explicit (map) features to significantly improve a range of metrics. We also introduce an open-source dataset for geoparsing of news events covering global disease outbreaks and epidemics to help future evaluation in geoparsing.
dc.publisher: Association for Computational Linguistics
dc.title: Which Melbourne? Augmenting geocoding with maps
dc.type: Conference Object
prism.endingPage: 1296
prism.publicationDate: 2018
prism.publicationName: ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
prism.startingPage: 1285
prism.volume: 1
dc.identifier.doi: 10.17863/CAM.27796
dcterms.dateAccepted: 2018-04-20
rioxxterms.versionofrecord: 10.18653/v1/p18-1119
rioxxterms.licenseref.uri: http://www.rioxx.net/licenses/all-rights-reserved
rioxxterms.licenseref.startdate: 2018-01-01
dc.contributor.orcid: Gritta, Milan [0000-0003-0014-7275]
dc.contributor.orcid: Collier, Nigel [0000-0002-7230-4164]
rioxxterms.type: Conference Paper/Proceeding/Abstract
pubs.funder-project-id: Natural Environment Research Council (1649558)
pubs.funder-project-id: Engineering and Physical Sciences Research Council (EP/M005089/1)
pubs.funder-project-id: NERC (via Cranfield University) (NE/M009009/1)
cam.issuedOnline: 2018-07-12
pubs.conference-name: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
pubs.conference-start-date: 2018-07
pubs.conference-finish-date: 2018-07
rioxxterms.freetoread.startdate: 2019-09-12