A hierarchical consensus learning model for deep multi-view document clustering

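Computer science; Artificial intelligence; Cluster analysis; Hierarchical clustering; Consensus clustering; Deep learning; Machine learning; Data mining; Fuzzy clustering; Canopy clustering algorithm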

Authors

Ruina Bai, Ruizhang Huang, Yanping Chen, Yongbin Qin, Yong Xu, Qinghua Zheng

Source

Journal: Information Fusion [Elsevier BV]
Date: 2024-06-05 Volume/Issue: 111: 102507-102507 Citations: 4

Identifiers

DOI: 10.1016/j.inffus.2024.102507

Abstract

Document clustering, a fundamental task in natural language processing, aims to divide large collections of documents into meaningful groups based on their similarities. Multi-view document clustering (MvDC) has emerged as a promising approach, leveraging information from diverse views to improve clustering accuracy and robustness. However, existing multi-view clustering methods suffer from two issues: (1) a lack of inter-relations across documents during consensus semantic learning; (2) the neglect of consensus structure mining in multi-view document clustering. To address these issues, we propose a Hierarchical Consensus Learning model for Multi-view Document Clustering, termed MvDC-HCL. Our model incorporates two key modules. The Data-oriented Consensus Semantic Learning (CSeL) module focuses on learning consensus semantics across various views by leveraging a hybrid contrastive consensus objective. The Task-oriented Consensus Structure Clustering (CStC) module employs a gated fusion network and clustering-driven structure contrastive learning to mine consensus structures effectively. Specifically, the CSeL module constructs a contrastive consensus learning objective based on intra-sample and inter-sample relationships in multi-view data, aiming to optimize the view semantic representations obtained by the semantic learner. This facilitates consistent semantic learning across various views of the same sample and consistent relationship learning among samples from different views. Then, the learned view semantic representations are fed into the fusion network of CStC to obtain fused sample semantic representations. Together with the view semantic representations, sample-level and view-level clustering structures are derived for consensus structure mining. Additionally, CStC introduces clustering-driven objectives to guide consensus structure mining and achieve consistent clustering results.
By hierarchically extracting implicit consensus semantics and structures within multi-view document data and tasks, MvDC-HCL significantly enhances clustering performance. Through comprehensive experiments, we demonstrate that the proposed model consistently outperforms state-of-the-art methods. Our code is publicly available at https://github.com/m22453/MvDC_HCRL.
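
The abstract names three ingredients without giving formulas: an intra-sample contrastive term, an inter-sample relationship term, and a gated fusion of view representations. The sketch below illustrates one plausible reading of each, using only the Python standard library. It is not the authors' implementation. The InfoNCE-style form of the intra-sample loss, the similarity-matrix matching used for the inter-sample loss, the softmax gate, the temperature `tau`, and all function names are assumptions made for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def intra_sample_loss(view_a, view_b, tau=0.5):
    """Assumed InfoNCE-style term: document i in view A should be closest
    to document i in view B; other documents in view B act as negatives."""
    a = [normalize(v) for v in view_a]
    b = [normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [dot(anchor, other) / tau for other in b]
        top = max(logits)  # subtract the max for numerical stability
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - logits[i]
    return total / len(a)


def inter_sample_loss(view_a, view_b):
    """Assumed relationship term: mean squared gap between the two views'
    document-to-document cosine similarity matrices."""
    a = [normalize(v) for v in view_a]
    b = [normalize(v) for v in view_b]
    n = len(a)
    gaps = [
        (dot(a[i], a[j]) - dot(b[i], b[j])) ** 2
        for i in range(n)
        for j in range(n)
    ]
    return sum(gaps) / len(gaps)


def gated_fusion(sample_views, gate_scores):
    """Assumed gate: softmax over per-view scores, then a weighted sum of
    one document's view representations. In a trained model the scores
    would come from a learned layer; here they are passed in directly."""
    top = max(gate_scores)
    exps = [math.exp(s - top) for s in gate_scores]
    gates = [e / sum(exps) for e in exps]
    dim = len(sample_views[0])
    return [sum(g * v[k] for g, v in zip(gates, sample_views)) for k in range(dim)]
```

Under these assumptions, a hybrid objective could be formed as a weighted sum of `intra_sample_loss` and `inter_sample_loss` over every pair of views, with the fused vectors from `gated_fusion` passed on to a clustering head. The weighting and the clustering-driven objectives are not specified in the abstract, so they are left out here.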
