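Computer science
Cluster analysis
Cluster (spacecraft)
Algorithm
Determining the number of clusters in a data set
k-medians clustering
Data mining
Artificial intelligence
CURE data clustering algorithm
Pattern recognition (psychology)
Correlation clustering
Programming language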
Authors
Tong Wang, Yuping Wang, Delong Liu
Source
Journal: IEEE Transactions on Knowledge and Data Engineering
[Institute of Electrical and Electronics Engineers]
Date: 2023-04-01
Volume (issue): pages: 35 (4): 3419-3432
Citations: 7
Identifiers
DOI: 10.1109/tkde.2021.3138962
Abstract
Imbalanced data clustering is a challenging problem in machine learning. The main difficulty is caused by the imbalance in both cluster size and data density distribution. To address this problem, we propose a novel clustering algorithm called LDPI based on local-density peaks in this study. First, an initial sub-cluster construction scheme is designed based on a 3-dimensional (3-D) decision graph that can easily detect the initial sub-cluster centers and identify the noise points. Second, a sub-cluster updating strategy is designed, which can automatically identify the false sub-cluster centers and update the initial sub-clusters. Third, a sub-cluster merging scheme is designed, which merges the updated initial sub-clusters into final clusters. Consequently, the proposed algorithm has three advantages: 1) It does not require any input parameters; 2) It can automatically determine the cluster centers and number of clusters; 3) It is suitable for imbalanced datasets and datasets with arbitrary shapes and distributions. The effectiveness of LDPI is demonstrated experimentally and the superiority of LDPI is identified by comparison with 5 state-of-the-art algorithms.
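The abstract does not define the axes of the 3-D decision graph or any of LDPI's formulas, so the following is only a minimal sketch of the general density-peaks idea that such decision graphs build on, not the authors' method. It computes, for each point, a local density `rho` and the distance `delta` to the nearest point of higher density; points with large `delta` are candidate centers. The function name `density_peak_scores`, the k-nearest-neighbour density estimate, the parameter `k`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def density_peak_scores(X, k=5):
    """Return (rho, delta, nearest_higher) for each row of X.

    rho:   local density, taken here as the inverse mean distance to the
           k nearest neighbours (an illustrative assumption, not LDPI's definition)
    delta: distance to the nearest point with strictly higher density;
           for the globally densest point, the largest distance to any point
    nearest_higher: index of that higher-density point, or -1 if none exists
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Full pairwise Euclidean distance matrix (fine for small n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # After sorting each row, column 0 is the point itself (distance 0).
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    rho = 1.0 / (knn.mean(axis=1) + 1e-12)

    delta = np.empty(n)
    nearest_higher = np.full(n, -1)
    for i in range(n):
        higher = np.flatnonzero(rho > rho[i])
        if higher.size == 0:
            delta[i] = dist[i].max()
        else:
            j = higher[np.argmin(dist[i, higher])]
            delta[i] = dist[i, j]
            nearest_higher[i] = j
    return rho, delta, nearest_higher


# Hypothetical imbalanced data: a large sparse blob and a small dense blob.
rng = np.random.default_rng(0)
big = rng.normal(loc=(0.0, 0.0), scale=1.0, size=(200, 2))
small = rng.normal(loc=(10.0, 10.0), scale=0.3, size=(20, 2))
X = np.vstack([big, small])

rho, delta, nearest_higher = density_peak_scores(X, k=5)

# The two largest delta values mark one candidate center per blob.
candidates = np.argsort(delta)[-2:]
```

In this sketch the candidate centers are read off by taking the largest `delta` values, which presupposes knowing how many clusters to look for. According to the abstract, LDPI avoids that kind of input by detecting, updating, and merging sub-clusters automatically; those steps are not reproduced here.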