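DBSCAN
Maxima and minima
Cluster (spacecraft)
Computer science
Set (abstract data type)
k-means clustering
Algorithm
Noise (video)
Preprocessor
Simplicity (philosophy)
Mathematics
Cluster analysis
Artificial intelligence
Correlation clustering
CURE data clustering algorithm
Image (mathematics)
Philosophy
Mathematical analysis
Programming language
Epistemology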
Identifier
DOI:10.1016/j.jocs.2021.101445
Abstract
The k-means method divides a set of N objects into k clusters, each represented by the mean of its objects. The algorithm is simple, has linear time complexity, and converges quickly to a local minimum. However, it requires the number of clusters k in advance, which presupposes some knowledge of the data, and it requires a choice of initial centers; both k and the initial centers affect the quality of the final result and the number of iterations. Many papers have tried either to detect a suitable value for k or to introduce a better method for selecting the initial centers, but not both. This research introduces a method that detects a near-optimal value for k together with near-optimal initial centers. The proposed method adds a preprocessing step that obtains the number of clusters and the initial centers before k-means is applied. The idea is to obtain initial clusters with a density-based method, which does not require the number of clusters in advance, compute the mean of the objects in each cluster, and pass this knowledge to k-means. As the experimental results show, this improves the quality of the final result. The preprocessing step uses DBSCAN (Density-Based Spatial Clustering of Applications with Noise), so the paper concentrates on DBSCAN and k-means. The proposed method will converge to global minima, which improves the quality of the final result. It requires the two input parameters of DBSCAN, and its time complexity is O(n log n), the same as that of DBSCAN.
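The pipeline the abstract describes (run DBSCAN, take the mean of each resulting cluster, then start k-means from those means) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, and the values `eps=0.8` and `min_pts=3` are all assumptions chosen for illustration. The neighbor search here is brute force, so this sketch runs in O(n²); the O(n log n) bound quoted above would need a spatial index. Points that DBSCAN labels as noise are simply left out when the initial centers are computed, which is also an assumption.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (brute force)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, with -1 meaning noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels


def mean(group):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(group) for c in zip(*group))


def initial_centers(points, labels):
    """One center per DBSCAN cluster; noise points (label -1) are ignored."""
    k = max(labels) + 1
    return [mean([p for p, l in zip(points, labels) if l == c]) for c in range(k)]


def kmeans(points, centers, max_iter=100):
    """Lloyd's iterations starting from the given centers."""
    if not centers:
        raise ValueError("DBSCAN found no clusters; adjust eps or min_pts")
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        new_centers = [mean(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


# Toy data (hypothetical): two 3x3 grids of points plus one isolated point.
blob_a = [(0.5 * x, 0.5 * y) for x in range(3) for y in range(3)]
blob_b = [(10 + 0.5 * x, 10 + 0.5 * y) for x in range(3) for y in range(3)]
points = blob_a + blob_b + [(5.0, 5.0)]

labels = dbscan(points, eps=0.8, min_pts=3)
centers = initial_centers(points, labels)  # k is len(centers), found by DBSCAN
final_centers = kmeans(points, centers)
print(len(centers), centers, final_centers)
```

In this sketch k is never supplied by the caller: it falls out of the DBSCAN step as the number of clusters found, and the only inputs are the two DBSCAN parameters, which matches the abstract's description of the method's requirements.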