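Euclidean distance
Cluster analysis
Similarity (geometry)
Pattern recognition (psychology)
Outlier
Mathematics
Standard deviation
Benchmark (surveying)
Distance measures
k-nearest neighbors algorithm
Feature (linguistics)
Algorithm
k-medians clustering
Distance matrices in phylogeny
Point (geometry)
Artificial intelligence
Computer science
Fuzzy clustering
Statistics
CURE data clustering algorithm
Combinatorics
Linguistics
Philosophy
Geometry
Geodesy
Image (mathematics)
Geography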
Authors
Juanying Xie, Xinglin Liu, Mingzhao Wang
Identifier
DOI:10.1016/j.ins.2023.119788
Abstract
The DPC (Clustering by fast search and find of Density Peaks) algorithm and its variants typically employ Euclidean distance, which overlooks the differing contributions of individual features to similarity and hence to the clustering result. To address this limitation, this paper proposes a standard deviation weighted distance as an enhancement of the Euclidean distance. The weighted distance accounts for the specific contribution of each feature to the distance (similarity) between data points. Using it, the local density ρ_i and distance δ_i of point i are defined so as to capture the local pattern around point i as fully as possible. Outliers are also defined in terms of the weighted distance. A divide-and-conquer assignment strategy is then proposed, based on the weighted distance, semi-supervised learning, and the mutual K-nearest-neighbor assumption. The result is the SFKNN-DPC (Standard deviation weighted distance and Fuzzy weighted K-Nearest Neighbors based Density Peak Clustering) algorithm, which aims to uncover the hidden clusters within a dataset effectively. Extensive experiments on benchmark datasets demonstrate the superiority of SFKNN-DPC over DPC, its variants, and other benchmark clustering algorithms. Moreover, statistical significance tests indicate that SFKNN-DPC differs significantly from its counterparts.
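To make the idea of a feature-weighted distance concrete, the following is a minimal sketch and not the authors' method. The abstract does not state the weighting formula, so the scheme here is an assumption: each feature's weight is its population standard deviation divided by the sum of all features' standard deviations. The function names (`std_weights`, `weighted_distance`, `dpc_quantities`) are hypothetical. The local density and distance follow the cutoff-kernel definitions of the original DPC algorithm; SFKNN-DPC defines its own ρ_i and δ_i, which the abstract does not spell out.

```python
import math
from statistics import pstdev


def std_weights(points):
    """ASSUMED weighting (not taken from the paper): each feature's weight
    is its population standard deviation, normalized so weights sum to 1."""
    columns = list(zip(*points))
    stds = [pstdev(col) for col in columns]
    total = sum(stds)
    if total == 0:
        # All features are constant: fall back to equal weights.
        return [1.0 / len(stds)] * len(stds)
    return [s / total for s in stds]


def weighted_distance(a, b, weights):
    """Euclidean distance with a per-feature weight on each squared difference."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def dpc_quantities(points, cutoff):
    """Local density rho and distance delta as in the ORIGINAL DPC algorithm
    (cutoff kernel), computed with the weighted distance above."""
    n = len(points)
    w = std_weights(points)
    dist = [[weighted_distance(points[i], points[j], w) for j in range(n)]
            for i in range(n)]
    # rho_i: number of other points closer than the cutoff distance.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        # delta_i: distance to the nearest point of strictly higher density;
        # points with no denser neighbor get their largest distance instead.
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


if __name__ == "__main__":
    sample = [(0.0, 5.0), (2.0, 5.0), (4.0, 5.0)]
    print(std_weights(sample))
    print(dpc_quantities(sample, cutoff=2.5))
```

In the sample data the second feature is constant, so under the assumed scheme it receives zero weight and the distance depends only on the first feature. Points with both high ρ and high δ are the candidates for cluster centers in DPC.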