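Contiguity
Cluster analysis
Data mining
Hierarchical clustering
Computer science
Homogeneity (statistics)
Linkage (software)
Scalability
Artificial intelligence
Machine learning
Database
Biochemistry
Chemistry
Gene
Operating system
Identifier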
DOI:10.1080/13658810701674970
Abstract
Regionalization is to divide a large set of spatial objects into a number of spatially contiguous regions while optimizing an objective function, which is normally a homogeneity (or heterogeneity) measure of the derived regions. This research proposes and evaluates a family of six hierarchical regionalization methods. The six methods are based on three agglomerative clustering approaches, including the single linkage, average linkage (ALK), and the complete linkage (CLK), each of which is constrained with spatial contiguity in two different ways (i.e. the first‐order constraining and the full‐order constraining). It is discovered that both the Full‐Order‐CLK and the Full‐Order‐ALK methods significantly outperform existing methods across four quality evaluations: the total heterogeneity, region size balance, internal variation, and the preservation of data distribution. Moreover, the proposed algorithms are efficient and can find the solution in O(n^2 log n) time. With such data scalability, for the first time it is possible to effectively regionalize large data sets that have 10 000 or more spatial objects. A detailed comparison and evaluation of the six methods are carried out with the 2004 US presidential election data.
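
The idea of constraining agglomerative clustering with spatial contiguity can be illustrated with a minimal Python sketch. This is not the paper's algorithm: it is a naive illustration that merges only spatially adjacent clusters, measures their distance with complete linkage over all member pairs, and stops when a requested number of regions remains. It makes no attempt to reach the O(n^2 log n) bound quoted above. The function name, the one-dimensional attribute values, and the edge-list input format are all assumptions made for this example.

```python
def constrained_complete_linkage(values, edges, k):
    """Merge spatially contiguous clusters until k regions remain.

    values: one attribute value per spatial object (hypothetical 1-D data).
    edges:  pairs (i, j) of objects that are spatially contiguous.
    k:      number of regions to stop at.
    """
    # Each object starts as its own cluster, keyed by object index.
    clusters = {i: {i} for i in range(len(values))}
    # Cluster-level adjacency, initialised from the object adjacency.
    neighbors = {i: set() for i in clusters}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    def dist(c1, c2):
        # Complete linkage: largest attribute difference over all member pairs.
        return max(abs(values[i] - values[j])
                   for i in clusters[c1] for j in clusters[c2])

    while len(clusters) > k:
        # Only contiguous cluster pairs are candidates for merging.
        candidates = [(dist(a, b), a, b)
                      for a in clusters for b in neighbors[a] if a < b]
        if not candidates:
            break  # the contiguity graph is disconnected; nothing left to merge
        _, a, b = min(candidates)
        clusters[a] |= clusters.pop(b)
        # Everything that touched b now touches the merged cluster a.
        for n in neighbors.pop(b):
            neighbors[n].discard(b)
            if n != a:
                neighbors[n].add(a)
                neighbors[a].add(n)
    return sorted(sorted(c) for c in clusters.values())


# Five objects on a line; object 4 resembles objects 0 and 1 in value
# but is contiguous only with object 3.
values = [1.0, 1.2, 5.0, 5.1, 1.1]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(constrained_complete_linkage(values, edges, 2))  # [[0, 1], [2, 3, 4]]
```

In the toy data, object 4 is closest in attribute value to objects 0 and 1, yet it ends up in the region with objects 2 and 3 because that is the only cluster it is contiguous with. Unconstrained clustering would group it by value alone, without regard to adjacency.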