Which Melbourne? Augmenting Geocoding with Maps

Accepted version
Peer-reviewed

Abstract

The purpose of text geolocation is to associate geographic information contained in a document with a set (or sets) of coordinates, either implicitly by using linguistic features and/or explicitly by using geographic metadata combined with heuristics. We introduce a geocoder (location mention disambiguator) that achieves state-of-the-art (SOTA) results on three diverse datasets by exploiting the implicit lexical clues. Moreover, we propose a new method for systematic encoding of geographic metadata to generate two distinct views of the same text. To that end, we introduce the Map Vector (MapVec), a sparse representation obtained by plotting prior geographic probabilities, derived from population figures, on a World Map. We then integrate the implicit (language) and explicit (map) features to significantly improve a range of metrics. We also introduce an open-source dataset for geoparsing of news events covering global disease outbreaks and epidemics to help future evaluation in geoparsing.
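
The idea of plotting population-derived priors on a world map can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `map_vector`, the 2-degree cell size, the normalisation of cell values to sum to one, and the population figures are all assumptions chosen for illustration. The sketch divides the globe into a latitude/longitude grid, adds each candidate location's population to the cell it falls in, and flattens the grid into a sparse vector.

```python
def map_vector(candidates, cell_deg=2.0):
    """Build a flattened world-map grid of population-based priors.

    candidates: iterable of (lat, lon, population) tuples, one per
    candidate location for the place names mentioned in a text.
    cell_deg: grid cell size in degrees (an assumed value).
    """
    n_rows = int(180 / cell_deg)
    n_cols = int(360 / cell_deg)
    vec = [0.0] * (n_rows * n_cols)

    for lat, lon, population in candidates:
        # Shift coordinates to non-negative ranges, then bucket into cells.
        # min() keeps lat = 90 and lon = 180 inside the last row/column.
        row = min(int((lat + 90.0) / cell_deg), n_rows - 1)
        col = min(int((lon + 180.0) / cell_deg), n_cols - 1)
        vec[row * n_cols + col] += population

    # Normalise so the non-zero cells form a prior distribution
    # (an assumed normalisation scheme).
    total = sum(vec)
    if total > 0:
        vec = [v / total for v in vec]
    return vec


# Two candidates for "Melbourne"; populations are rough, illustrative figures.
candidates = [
    (-37.81, 144.96, 4_900_000),  # Melbourne, Australia
    (28.08, -80.61, 80_000),      # Melbourne, Florida, USA
]
vec = map_vector(candidates)
```

Almost every entry of `vec` is zero, which is what makes the representation sparse; the few non-zero cells mark where the candidate locations lie and how much prior weight each receives.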

Journal Title

Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Conference Name

Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Journal ISSN

Volume Title

1

Publisher

Association for Computational Linguistics (ACL)

Rights and licensing

Except where otherwise noted, this item's license is described as All rights reserved

Sponsorship

Natural Environment Research Council (1649558)
Engineering and Physical Sciences Research Council (EP/M005089/1)
NERC (via Cranfield University) (NE/M009009/1)
Medical Research Council (MR/M025160/1)