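Cluster analysis
k-nearest neighbors algorithm
Pattern recognition (psychology)
Kernel density estimation
Fuzzy clustering
Fuzzy logic
Kernel (algebra)
Density estimation
Cluster (spacecraft)
Probability density function
Computer science
Flame clustering
Sample (material)
Mathematics
Data mining
Algorithm
Nearest-neighbor chain algorithm
Artificial intelligence
CURE data clustering algorithm
Canopy clustering algorithm
Statistics
Physics
Estimator
Combinatorics
Thermodynamics
Programming language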
Authors
Jia Zhao, Gang Wang, Jeng-Shyang Pan, Tanghuai Fan, Ivan Lee
Identifier
DOI: 10.1016/j.patcog.2023.109406
Abstract
Uneven density data are data in which sample density differs between clusters. The local density used by the density peaks clustering algorithm (DPC) does not account for these between-cluster density differences, which can lead to incorrect selection of cluster centers. In addition, the DPC allocation strategy tends to assign samples that belong to sparse clusters to dense clusters, which reduces clustering efficiency. In this study, we propose a density peaks clustering algorithm based on fuzzy and weighted shared neighbor for uneven density datasets (DPC-FWSN). First, a nearest neighbor fuzzy kernel function is obtained by combining K-nearest neighbor and fuzzy neighborhood. Then, local density is redefined using the nearest neighbor fuzzy kernel function. The redefined local density better characterizes the sample distribution by balancing the contribution of sample density in dense and sparse areas, which avoids the situation in which a sparse cluster has no cluster center. Finally, an allocation strategy based on weighted shared neighbor similarity is proposed to optimize sample allocation at the boundary of sparse clusters. DPC-FWSN is compared with IDPC-FA, FKNN-DPC, FNDPC, DPCSA and DPC on uneven density datasets, complex morphology datasets and real datasets. The clustering results demonstrate that DPC-FWSN effectively handles datasets with uneven density distribution.
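
To make the density imbalance described in the abstract concrete, the following is a minimal sketch of the two quantities used by baseline DPC: a Gaussian-kernel local density and the distance to the nearest sample of higher density. It is not the DPC-FWSN method, whose fuzzy kernel and weighted shared neighbor similarity are not specified in the abstract. The function names, the toy points, and the cutoff `d_c = 0.5` are assumptions chosen only for illustration.

```python
import math

def pairwise_distances(points):
    """Euclidean distance matrix for a list of equal-length tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

def local_density(dist, d_c):
    """Baseline DPC density with a Gaussian kernel:
    rho_i = sum over j != i of exp(-(d_ij / d_c)^2)."""
    n = len(dist)
    return [
        sum(math.exp(-((dist[i][j] / d_c) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]

def relative_distance(dist, rho):
    """delta_i: distance to the nearest sample with strictly higher density.
    A sample with no denser sample gets its largest distance instead."""
    n = len(dist)
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return delta

# Hypothetical toy data: a dense cluster (spacing 0.1) and a sparse one (spacing 1.0).
points = [
    (0.0, 0.0), (0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1),       # dense
    (10.0, 10.0), (11.0, 10.0), (9.0, 10.0), (10.0, 11.0), (10.0, 9.0), # sparse
]
dist = pairwise_distances(points)
rho = local_density(dist, d_c=0.5)  # d_c is an assumed global cutoff
delta = relative_distance(dist, rho)

for p, r, d in zip(points, rho, delta):
    print(p, round(r, 4), round(d, 4))
```

With a single global cutoff, the center of the sparse cluster (index 5) receives a lower density than every sample of the dense cluster, including its boundary samples. This is the kind of between-cluster imbalance that the redefined local density is intended to reduce.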