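Computer science
Document clustering
Cluster analysis
Unstructured data
Brown clustering
tf–idf
Hierarchical clustering
Artificial intelligence
Data stream clustering
Preprocessor
Clustering high-dimensional data
Correlation clustering
Consensus clustering
CURE data clustering algorithm
Data mining
Big data
Term (time)
Physics
Quantum mechanics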
Authors
Bankapalli Jyothi,L. Sumalatha,Suneetha Eluri
Abstract
Document clustering is a technique for splitting a collection of textual content into clusters or groups. Spectral clustering is now widely used in the machine learning domain. A selection of text mining algorithms can capture the diverse features of unstructured content, yielding rich descriptions. The main aim of this article is to develop a novel approach to clustering unstructured text data, based on an enhanced natural language processing technique. The proposed model has three stages: preprocessing, feature extraction, and clustering. First, the unstructured data is preprocessed using punctuation and stop word removal, stemming, and tokenization. Next, features are extracted with word2vec, using the continuous bag-of-words model, and with term frequency-inverse document frequency. The extracted features are then grouped by hierarchical clustering, with the cut-off distance optimized by the improved sensing area-based electric fish optimization (FISA-EFO). A deep neural network tuned by the same algorithm is used to improve the clustering model. The results indicate that the model achieves better clustering accuracy than other clustering techniques when handling unstructured text data.
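The three stages named in the abstract can be illustrated with a small, standard-library-only sketch. This is not the authors' implementation: it omits stemming, word2vec, the FISA-EFO optimizer, and the deep neural network, and it uses a hand-picked cut-off distance where the paper optimizes it. The stop-word list, the function names (`preprocess`, `tfidf_vectors`, `agglomerative`), the smoothed IDF formula, average linkage, and cosine distance are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "in", "on", "to"}


def preprocess(text):
    """Lowercase, drop punctuation, tokenize, and remove stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return one L2-normalized sparse TF-IDF vector (a dict) per document."""
    tokenized = [preprocess(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF (assumed variant): ln((1 + n) / (1 + df)) + 1.
        vec = {t: (c / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1)
               for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine_distance(u, v):
    """1 - cosine similarity for two unit-length sparse vectors."""
    if len(u) > len(v):
        u, v = v, u
    return 1.0 - sum(w * v.get(t, 0.0) for t, w in u.items())


def agglomerative(vectors, cutoff):
    """Average-linkage clustering that stops merging above the cut-off distance."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters under average linkage.
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist > cutoff:
            break  # remaining clusters are farther apart than the cut-off
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    labels = [0] * len(vectors)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels


docs = [
    "The cat sat on the mat.",
    "A cat and a kitten sat on a mat.",
    "Stock markets fell sharply today.",
    "Markets and stock prices fell again.",
]
# cutoff=0.8 is a hand-set value; the paper tunes this distance with FISA-EFO.
labels = agglomerative(tfidf_vectors(docs), cutoff=0.8)
print(labels)  # [0, 0, 1, 1]
```

The cut-off controls how many clusters survive: a lower value stops merging earlier and leaves more, smaller clusters, which is why the abstract treats it as the quantity to optimize.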