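Cluster analysis
k-medians clustering
Computer science
Initialization
Fuzzy clustering
Determining the number of clusters in a data set
Single-linkage clustering
Data mining
Complete-linkage clustering
Weighting
Centroid
Correlation clustering
Algorithm
CURE data clustering algorithm
Cluster (spacecraft)
Fuzzy logic
Variable (mathematics)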
FLAME clustering
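Partition (number theory)
Pattern recognition (psychology)
Mathematics
Artificial intelligence
Combinatorics
Radiology
Mathematical analysis
Medicine
Programming language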
Authors
Imran Khan, Zongwei Luo, Joshua Zhexue Huang, Waseem Shahzad
Source
Journal: IEEE Transactions on Knowledge and Data Engineering
[Institute of Electrical and Electronics Engineers]
Date: 2020-09-01
Volume (issue): 32 (9): 1838-1853
Citations: 36
Identifier
DOI: 10.1109/tkde.2019.2911582
Abstract
One of the most significant problems in cluster analysis is determining the number of clusters in unlabeled data, which is a required input for most clustering algorithms. Some methods have been developed to address this problem. However, little attention has been paid to algorithms that are insensitive to the initialization of cluster centers and that use variable weights to recover the number of clusters. To fill this gap, we extend the standard fuzzy k-means clustering algorithm. The extended algorithm automatically determines the number of clusters by iteratively calculating the weights of all variables and the membership value of each object in all clusters. Two new steps are added to the fuzzy k-means clustering process. The first introduces a penalty term that makes the clustering process insensitive to the initial cluster centers. The second uses a formula to iteratively update the variable weights in each cluster based on the current partition of the data. Experimental results on real-world and synthetic datasets show that the proposed algorithm determines the correct number of clusters when initialized with different numbers of cluster centroids. We also tested the proposed algorithm on gene data to determine a subset of important genes.
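For orientation, the standard fuzzy k-means baseline that the abstract says is being extended can be sketched roughly as follows. This is not the authors' algorithm: the abstract does not give the penalty term or the variable-weight update formula, so both are left out. The function name `fuzzy_k_means` and the parameters `m` (fuzzifier), `n_iter`, `tol`, and `seed` are illustrative assumptions, and the sketch uses plain Euclidean distance with no variable weights.

```python
import math
import random


def fuzzy_k_means(points, k, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Baseline fuzzy k-means (fuzzy c-means) sketch; hypothetical helper.

    points: list of equal-length numeric tuples
    k:      number of clusters, fixed in advance (the baseline does not infer it)
    m:      fuzzifier, must be > 1
    Returns (centers, memberships); memberships[i][j] is the degree to which
    point i belongs to cluster j.
    """
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct points drawn from the data.
    centers = [list(p) for p in rng.sample(points, k)]
    dim = len(points[0])
    exponent = 2.0 / (m - 1.0)
    memberships = []

    for _ in range(n_iter):
        # Membership update: u_ij is proportional to d_ij^(-2/(m-1)),
        # normalised so each point's memberships sum to 1.
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if any(d == 0.0 for d in dists):
                # Point coincides with a center: split membership among those centers.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(hits)
                memberships.append([h / total for h in hits])
            else:
                inv = [d ** (-exponent) for d in dists]
                total = sum(inv)
                memberships.append([v / total for v in inv])

        # Center update: mean of the points weighted by u_ij^m.
        new_centers = []
        for j in range(k):
            weights = [u[j] ** m for u in memberships]
            total = sum(weights)
            if total == 0.0:
                # No point has any membership in this cluster: keep the old center.
                new_centers.append(centers[j])
                continue
            new_centers.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / total
                 for d in range(dim)]
            )

        # Stop once no center moves more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break

    return centers, memberships
```

Note that `k` has to be supplied by the caller here, which is exactly the limitation the abstract targets; the paper's penalty term and per-cluster variable weights would modify the two update steps above.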