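Imputation (statistics)
Missing data
Computer science
Cluster analysis
Data mining
Embedding
Consistency (knowledge bases)
Machine learning
Artificial intelligence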
Authors
Xingfeng Li, Yinghui Sun, Quansen Sun, Jian Dai, Zhenwen Ren
Identifier
DOI:10.1145/3581783.3612483
Abstract
In practical scenarios, multi-view data are often partially missing, for example when registration information is absent in social network analysis, which gives rise to incomplete multi-view clustering (IMVC). Filling in missing data quickly and accurately is vital to improving IMVC and remains a significant challenge. Existing IMVC methods typically use all observed data to fill in missing data, resulting in high complexity and poor imputation quality due to a lack of guidance from a consistent distribution. To overcome these limitations, we propose a novel Distribution Consistency based Fast Anchor Imputation for Incomplete Multi-view Clustering (DCFAI-IMVC) method. Specifically, to eliminate the interference of redundant and fraudulent features in the original space, incomplete data are first projected into a consensus latent space, where we dynamically learn a small number of anchors to achieve fast and high-quality imputation. Then, we employ global distribution information of the observed embedding representations to further ensure that the learned anchors and the observed embedding representations follow a consistent distribution. Finally, a tensor low-rank constraint is imposed on the bipartite graphs to explore the high-order correlations hidden in the data. DCFAI-IMVC has linear complexity in the number of samples, which gives it great potential for large-scale IMVC tasks. Extensive experiments on multiple public datasets validate the effectiveness, superiority, and efficiency of DCFAI-IMVC against recent advances.
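The anchor idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes two views whose latent embeddings are already available, replaces the paper's learned anchors with plain k-means centroids, and omits the distribution-consistency term and the tensor low-rank constraint entirely. All function names (`sq_dists`, `learn_paired_anchors`, `anchor_affinity`, `impute_view2`) and parameters are hypothetical. It only shows why imputing from a small anchor set costs time linear in the number of samples when the number of anchors is fixed.

```python
import numpy as np


def sq_dists(X, A):
    """Squared Euclidean distances between rows of X (n, d) and rows of A (m, d)."""
    return ((X[:, None, :] - A[None, :, :]) ** 2).sum(axis=-1)


def learn_paired_anchors(Z1, Z2, m, iters=20):
    """Hypothetical stand-in for anchor learning on fully observed samples.

    Z1, Z2: latent embeddings of the same samples in view 1 and view 2.
    Returns anchors A1 (view 1) and A2 (view 2), paired row by row.
    """
    # Deterministic farthest-point initialisation in view 1.
    chosen = [0]
    min_d = sq_dists(Z1, Z1[[0]]).ravel()
    for _ in range(m - 1):
        nxt = int(min_d.argmax())
        chosen.append(nxt)
        min_d = np.minimum(min_d, sq_dists(Z1, Z1[[nxt]]).ravel())
    A1 = Z1[chosen].astype(float)

    # Plain Lloyd iterations (k-means), not the paper's joint optimisation.
    for _ in range(iters):
        labels = sq_dists(Z1, A1).argmin(axis=1)
        for j in range(m):
            if np.any(labels == j):
                A1[j] = Z1[labels == j].mean(axis=0)

    # Pair each non-empty view-1 cluster with the view-2 mean of its members.
    labels = sq_dists(Z1, A1).argmin(axis=1)
    keep = [j for j in range(m) if np.any(labels == j)]
    A2 = np.stack([Z2[labels == j].mean(axis=0) for j in keep])
    return A1[keep], A2


def anchor_affinity(Z, A, k=2):
    """Row-stochastic sample-to-anchor weights over the k nearest anchors."""
    d2 = sq_dists(Z, A)
    k = min(k, A.shape[0])
    idx = np.argsort(d2, axis=1)[:, :k]
    near = np.take_along_axis(d2, idx, axis=1)
    # Shift by the smallest distance in each row for numerical stability.
    w = np.exp(-(near - near[:, :1]))
    w /= w.sum(axis=1, keepdims=True)
    B = np.zeros(d2.shape)
    np.put_along_axis(B, idx, w, axis=1)
    return B


def impute_view2(Z1_incomplete, A1, A2, k=2):
    """Fill missing view-2 embeddings as affinity-weighted mixes of view-2 anchors."""
    return anchor_affinity(Z1_incomplete, A1, k) @ A2
```

Each pass compares n samples against m anchors, so the cost in this sketch is O(n m d) rather than the O(n^2 d) of comparing every sample with every other observed sample.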