Which Melbourne? Augmenting geocoding with maps
Publication Date
2018-01-01
Journal Title
ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
ISBN
9781948087322
Volume
1
Pages
1285-1296
Type
Conference Object
Citation
Gritta, M., Pilehvar, M., & Collier, N. (2018). Which Melbourne? Augmenting geocoding with maps. ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), 1, 1285-1296. https://doi.org/10.18653/v1/p18-1119
Abstract
The purpose of text geolocation is to associate geographic information contained in a document with a set (or sets) of coordinates, either implicitly by using linguistic features and/or explicitly by using geographic metadata combined with heuristics. We introduce a geocoder (location mention disambiguator) that achieves state-of-the-art (SOTA) results on three diverse datasets by exploiting the implicit lexical clues. Moreover, we propose a new method for systematic encoding of geographic metadata to generate two distinct views of the same text. To that end, we introduce the Map Vector (MapVec), a sparse representation obtained by plotting prior geographic probabilities, derived from population figures, on a World Map. We then integrate the implicit (language) and explicit (map) features to significantly improve a range of metrics. We also introduce an open-source dataset for geoparsing of news events covering global disease outbreaks and epidemics to help future evaluation in geoparsing.
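The Map Vector idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `map_vector`, the 2-degree cell size, the sum-to-one normalisation, and the population values are all assumptions made for illustration. It only shows the general technique of plotting population-derived priors for candidate locations onto a flattened world grid.

```python
from typing import List, Tuple

# (latitude, longitude, population) for one candidate location
Candidate = Tuple[float, float, float]

def map_vector(candidates: List[Candidate], cell_deg: float = 2.0) -> List[float]:
    """Plot population-based priors on a world grid and flatten it.

    cell_deg is a hypothetical grid resolution, not a value taken
    from the paper.
    """
    rows = int(180 / cell_deg)   # latitude bands from -90 to +90
    cols = int(360 / cell_deg)   # longitude bands from -180 to +180
    vec = [0.0] * (rows * cols)
    for lat, lon, population in candidates:
        # Clamp so that lat = 90 or lon = 180 falls in the last cell.
        row = min(int((lat + 90.0) / cell_deg), rows - 1)
        col = min(int((lon + 180.0) / cell_deg), cols - 1)
        vec[row * cols + col] += population
    total = sum(vec)
    if total > 0:
        # Normalise so the non-zero cells form a prior over the map.
        vec = [v / total for v in vec]
    return vec

# Two candidates for the mention "Melbourne"; the populations are
# illustrative placeholders, not real figures.
vec = map_vector([(-37.8, 145.0, 4_000_000), (28.1, -80.6, 80_000)])
```

Almost every cell in the resulting vector is zero, which matches the abstract's description of a sparse representation; the more populous candidate receives most of the prior mass.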
Sponsorship
NERC (1649558)
EPSRC (EP/M005089/1)
NERC (via Cranfield University) (NE/M009009/1)
Identifiers
External DOI: https://doi.org/10.18653/v1/p18-1119
This record's URL: https://www.repository.cam.ac.uk/handle/1810/280425
Rights
Licence:
http://www.rioxx.net/licenses/all-rights-reserved