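Geolocation
Computer science
Consistency (knowledge bases)
Database
Data mining
Data consistency
Sensor fusion
Data quality
Information retrieval
World Wide Web
Operations management
Artificial intelligence
Economics
Metric (unit)
Authors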
Li Han,Pei Zhang,Zhanfeng Wang,Fei Du,Ye Kuang,Ying An
Identifier
DOI:10.1109/iscc.2017.8024680
Abstract
Database-driven IP geolocation is a convenient and common way to determine the geographic location of an IP address. However, users often find it difficult to determine which provider is reliable enough for their own scenarios. In this paper, we tackle this challenge from a data fusion perspective. We first evaluate the degree of consistency of data entries among 5 free geolocation databases and employ it as an indicator for data quality assessment. We find that, for a given provider, this indicator varies by geographic scope and granularity. We are therefore able to evaluate data quality for different parts and dimensions within a database. A data fusion method utilizing the data consistency degree and quota-based votes is then proposed and analyzed. Over 40 million IP geolocation ground-truth records in China, i.e., more than 10% of the total address space allocated to China, are used to verify the effectiveness and advantage of the proposed method. This work provides insights into the comprehensive utilization of the characteristics of multiple databases for data entry fusion in the absence of sufficient a priori knowledge.
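The abstract does not give the formulas behind the consistency degree or the quota-based votes, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes (hypothetically) that a database's consistency degree is the average fraction of other databases that agree with it on each IP, and that each database receives a vote quota proportional to that degree. The names `consistency_degree`, `fuse`, `quota`, and the sample records are all invented for illustration.

```python
from collections import Counter, defaultdict


def consistency_degree(records):
    """Assumed definition: for each database, average over IPs the
    fraction of *other* databases that report the same location."""
    agree = defaultdict(float)
    seen = defaultdict(int)
    for answers in records.values():
        for db, loc in answers.items():
            others = [l for d, l in answers.items() if d != db]
            if not others:
                continue  # nothing to compare against for this IP
            agree[db] += sum(l == loc for l in others) / len(others)
            seen[db] += 1
    return {db: agree[db] / seen[db] for db in seen}


def fuse(answers, weights, quota=10):
    """Assumed voting rule: each database casts round(quota * weight)
    votes for its own answer; the location with most votes wins."""
    votes = Counter()
    for db, loc in answers.items():
        votes[loc] += round(quota * weights.get(db, 0.0))
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical entries from three made-up databases A, B, C.
records = {
    "1.0.0.1": {"A": "Beijing", "B": "Beijing", "C": "Shanghai"},
    "1.0.0.2": {"A": "Wuhan", "B": "Wuhan", "C": "Wuhan"},
    "1.0.0.3": {"A": "Changsha", "B": "Guangzhou", "C": "Guangzhou"},
}
weights = consistency_degree(records)
fused = fuse({"A": "Changsha", "B": "Guangzhou"}, weights)
```

In this toy data, database B agrees with its peers more often than A does, so when the two disagree on an entry, B's larger vote quota decides the fused result. Computing the weights separately per region or per granularity level (country, province, city) would mirror the abstract's observation that the indicator varies along those dimensions.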