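Initialization
Computer science
Cluster analysis
Data mining
Fuzzy clustering
Data stream clustering
CURE data clustering algorithm
Canopy clustering algorithm
Correlation clustering
Kernel (algebra)
Determining the number of clusters in a data set
Artificial intelligence
Mathematics
Combinatorics
Programming language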
Authors
Runhai Jiao,Shaolong Liu,Wu Wen,Biying Lin
Source
Journal: Kybernetes
[Emerald (MCB UP)]
Date: 2016-09-05
Volume (issue), pages: 45 (8): 1273-1291
Citations: 3
Identifier
DOI: 10.1108/k-08-2015-0209
Abstract
Purpose: The sheer volume of big data makes traditional clustering algorithms, which are usually designed to process an entire data set at once, impractical. This paper focuses on incremental clustering, which divides the data into a series of data chunks so that only a small amount of data needs to be clustered at a time. Few studies of incremental clustering address two problems: optimizing the cluster center initialization for each data chunk, and selecting multiple passing points for each cluster.
Design/methodology/approach: Optimizing the initial cluster centers improves the clustering quality for each data chunk, which in turn improves the quality of the final clustering result. Selecting multiple passing points for each cluster means that more accurate information is passed down, which further improves the final result. A method is proposed to solve these two problems and is applied in an algorithm based on the streaming kernel fuzzy c-means (stKFCM) algorithm.
Findings: Experimental results show that the proposed algorithm achieves higher accuracy and better performance than the stKFCM algorithm.
Originality/value: This paper addresses the problem of improving the performance of incremental clustering by optimizing cluster center initialization and selecting multiple passing points. The paper analyzes the performance of the proposed scheme and demonstrates its effectiveness.
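To make the chunk-by-chunk idea in the abstract concrete, here is a minimal Python sketch. It is not the authors' algorithm: it uses plain (non-kernel) weighted fuzzy c-means, passes a single weighted point per cluster (the cluster center) rather than multiple passing points, and takes the initial centers as an argument instead of optimizing them. The function names `memberships`, `weighted_fcm`, and `incremental_fcm` and all parameters are hypothetical, chosen only for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memberships(points, centers, m):
    """Fuzzy memberships u[i][k] of point i in cluster k (each row sums to 1)."""
    eps = 1e-12  # avoids division by zero when a point coincides with a center
    u = []
    for p in points:
        inv = [(1.0 / (sq_dist(p, c) + eps)) ** (1.0 / (m - 1.0)) for c in centers]
        total = sum(inv)
        u.append([v / total for v in inv])
    return u


def weighted_fcm(points, weights, centers, m=2.0, iters=50):
    """Weighted fuzzy c-means with a fixed number of iterations."""
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for k in range(len(centers)):
            coef = [w * (row[k] ** m) for w, row in zip(weights, u)]
            total = sum(coef)
            new_centers.append(tuple(
                sum(cf * p[d] for cf, p in zip(coef, points)) / total
                for d in range(dim)
            ))
        centers = new_centers
    return centers, memberships(points, centers, m)


def incremental_fcm(chunks, init_centers, m=2.0, iters=50):
    """Cluster the chunks one at a time, carrying a weighted summary forward.

    Assumption: each cluster is summarized by ONE passing point (its center),
    weighted by the total membership mass assigned to that cluster.
    """
    centers = list(init_centers)
    carried_pts, carried_w = [], []
    for chunk in chunks:
        points = list(chunk) + carried_pts
        weights = [1.0] * len(chunk) + carried_w
        centers, u = weighted_fcm(points, weights, centers, m, iters)
        carried_pts = list(centers)
        carried_w = [
            sum(w * row[k] for w, row in zip(weights, u))
            for k in range(len(centers))
        ]
    return centers, carried_w
```

In this sketch only the current chunk plus one summary point per cluster is held in memory at any time, and the summary weights always add up to the number of points seen so far. A call such as `incremental_fcm([chunk_a, chunk_b], [(0.0,), (10.0,)])` takes each chunk as a list of coordinate tuples.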