Microblogging
Social media
Computer science
Data science
Geography
World Wide Web
Authors
Laura Rocco,Federico Dassereto,Michela Bertolotto,Davide Buscaldi,Barbara Catania,Giovanna Guerrini
Identifier
DOI:10.1080/13658816.2020.1764003
Abstract
Many solutions exist for coarse geolocation of users at the time they post a message. However, many important applications, like traffic monitoring and event detection, need finer geolocation at the level of city neighborhoods, i.e., at a sub-city level. Data-driven approaches often do not guarantee good accuracy and efficiency, due to the higher number of sub-city level positions to be estimated and the low availability of large, balanced training sets. We claim that external information sources overcome the limitations of data-driven approaches in achieving good accuracy for sub-city level geolocation, and we present a knowledge-driven approach that achieves good results once the reference area of a message is known. Our algorithm, called Sherloc, exploits toponyms in the message, extracts their semantics from a geographic gazetteer, and embeds them into a metric space that captures the semantic distance among them. We identify the toponyms semantically closest to a message and then cluster them with respect to their spatial locations. Sherloc requires no prior training, can infer the location at sub-city level with high accuracy, and is not limited to geolocating on a fixed spatial grid.
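The pipeline outlined in the abstract (embed toponyms, pick the semantically closest ones, cluster them spatially) can be illustrated with a small sketch. This is not Sherloc's implementation: the gazetteer entries, the three-dimensional embedding vectors, the approximate coordinates, the cosine-similarity ranking, the greedy distance-threshold clustering, and all function names (`closest_toponyms`, `cluster_by_distance`, `geolocate`) are assumptions chosen only to make the idea concrete.

```python
import math

# Hypothetical toy gazetteer: toponym -> (embedding vector, (lat, lon)).
# Vectors and coordinates are illustrative values, not real gazetteer data.
GAZETTEER = {
    "duomo":    ((1.0, 0.1, 0.0), (45.4641, 9.1919)),
    "galleria": ((0.9, 0.2, 0.1), (45.4659, 9.1900)),
    "la scala": ((0.8, 0.3, 0.0), (45.4674, 9.1895)),
    "san siro": ((0.1, 1.0, 0.2), (45.4781, 9.1240)),
    "linate":   ((0.0, 0.2, 1.0), (45.4494, 9.2783)),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def haversine_km(p, q):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def closest_toponyms(message_toponyms, k):
    """Rank gazetteer entries by similarity to the message's mean embedding."""
    vectors = [GAZETTEER[t][0] for t in message_toponyms if t in GAZETTEER]
    if not vectors:
        return []
    mean = tuple(sum(dim) / len(vectors) for dim in zip(*vectors))
    ranked = sorted(GAZETTEER, key=lambda t: cosine(mean, GAZETTEER[t][0]),
                    reverse=True)
    return ranked[:k]


def cluster_by_distance(names, radius_km):
    """Greedy clustering: join the first cluster with a member within radius."""
    clusters = []
    for name in names:
        point = GAZETTEER[name][1]
        for cluster in clusters:
            if any(haversine_km(point, GAZETTEER[m][1]) <= radius_km
                   for m in cluster):
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def geolocate(message_toponyms, k=3, radius_km=1.0):
    """Return the centroid (lat, lon) of the largest spatial cluster, or None."""
    candidates = closest_toponyms(message_toponyms, k)
    if not candidates:
        return None
    largest = max(cluster_by_distance(candidates, radius_km), key=len)
    points = [GAZETTEER[n][1] for n in largest]
    return (sum(p[0] for p in points) / len(points),
            sum(p[1] for p in points) / len(points))
```

Calling `geolocate(["duomo"])` on this toy data selects the three entries nearest to "duomo" in the embedding space and returns the centroid of their spatial cluster. Because the estimate is a centroid of gazetteer points rather than a cell index, the output is not tied to a fixed spatial grid, which mirrors the property claimed in the abstract.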