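Cluster analysis
Merge (version control)
Computer science
Nearest-neighbor chain algorithm
Graph
Outlier
k-nearest neighbors algorithm
Single-linkage clustering
Pattern recognition (psychology)
Algorithm
Complete-linkage clustering
Data point
Correlation clustering
Data mining
CURE data clustering algorithm
Artificial intelligence
Canopy clustering algorithm
Theoretical computer science
Information retrieval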
Authors
Yan Wang, Yan Ma, Hui Huang, Bin Wang, D. P. Acharjya
Identifier
DOI:10.1016/j.is.2022.102124
Abstract
Numerous graph-based clustering algorithms that rely on k-nearest neighbors (KNN) have been proposed. However, their performance tends to be affected by factors such as cluster shape, cluster density, and outliers. To address these issues, we present a split–merge clustering algorithm based on the KNN graph (SMKNN). It rests on the idea that two adjacent clusters can be merged if the data points located in the connection layers of the two clusters tend to be consistent in distribution. In Stage 1, a KNN graph is constructed. In Stage 2, subgraphs are obtained by removing the pivot points from the KNN graph, where the pivot points are determined by the magnitude of the local distance ratio of the data points. In Stage 3, the adjacent cluster pairs with maximum similarity are merged, where the similarity between two clusters is defined using two concepts: external connection edges and internal connection edges. In experiments on ten synthetic and eight real data sets, we compared the accuracy of SMKNN with that of two traditional algorithms, two density-based algorithms, nine graph-based algorithms, and four neural-network-based algorithms. The experimental results demonstrate good performance of the proposed clustering method.
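As a rough illustration of Stages 1 and 2 only, the following Python sketch builds a KNN graph by brute force and then labels the subgraphs that remain once a set of pivot points is removed. It is not the authors' implementation. The function names `knn_graph` and `split_subgraphs` are hypothetical, the graph is treated as undirected, and the pivot points are passed in by the caller because the abstract does not define the local distance ratio used to select them. The Stage 3 merge is omitted for the same reason.

```python
import math


def knn_graph(points, k):
    """Stage 1 (sketch): for each point, list the indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Brute-force Euclidean distances; sorted() is stable, so ties keep index order.
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(others[:k])
    return neighbors


def split_subgraphs(neighbors, pivots):
    """Stage 2 (sketch): label connected components after removing pivot nodes.

    `pivots` is supplied by the caller (an assumption of this sketch).
    Pivot points receive the label -1.
    """
    n = len(neighbors)
    removed = set(pivots)

    # Build an undirected adjacency list that skips every edge touching a pivot.
    adjacency = [set() for _ in range(n)]
    for i in range(n):
        if i in removed:
            continue
        for j in neighbors[i]:
            if j not in removed:
                adjacency[i].add(j)
                adjacency[j].add(i)

    # Depth-first search assigns one label per connected component.
    labels = [-1] * n
    current = 0
    for start in range(n):
        if start in removed or labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adjacency[u]:
                if labels[v] == -1:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels


# Two groups on a line, joined only through the bridge point at index 3.
points = [(0, 0), (1, 0), (2, 0), (5, 0), (8, 0), (9, 0), (10, 0)]
print(split_subgraphs(knn_graph(points, 2), pivots=[3]))
# -> [0, 0, 0, -1, 1, 1, 1]
```

In this toy data set the bridge point is the only link between the two groups, so removing it splits the graph into two subgraphs. With an empty pivot list, the same call returns a single component.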