Computer science
Imputation (statistics)
Data mining
Machine learning
Missing data
Authors
Yangyang Wu, Xiaoye Miao, Zi-ang Nan, Jinshan Zhang, HE Jian-hu, Jianwei Yin
Source
Journal: IEEE Transactions on Knowledge and Data Engineering
[Institute of Electrical and Electronics Engineers]
Date: 2024-04-19
Volume (issue), pages: 36 (11): 6029-6041
Identifier
DOI:10.1109/tkde.2024.3387439
Abstract
Multi-view data with incomplete information hinder effective data analysis. Existing multi-view imputation methods, which learn the mapping between a complete view and a completely missing view, cannot handle the typical case of multi-view data with missing feature information. In this paper, we propose a unified generative imputation model named UGit, based on optimal transport theory, to simultaneously impute the missing features/values of all incomplete views. This imputation is conditional on all the observed values from the multi-view data. UGit consists of two modules: a unified multi-view generator (UMG) and a masking energy discriminator (MED). To impute missing features across all views effectively and efficiently, the generator UMG employs a unified autoencoder in conjunction with a cross-view attention mechanism to learn the data distribution from all observed multi-view data. The discriminator MED leverages a novel masking energy divergence function to make UGit differentiable for imputation accuracy enhancement. Extensive experiments on several real-world multi-view data sets demonstrate that UGit speeds up model training by 4.28x, with an average accuracy gain of more than 41%, compared to state-of-the-art approaches.
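The abstract gives no implementation details, so the following is only a minimal sketch of one idea it states: imputation fills the missing values of every view while remaining conditional on the observed ones. It uses the common mask-and-merge convention from generative imputation, in which observed entries are kept unchanged and generated values are used only where data are missing. The names merge_imputed and impute_views, the use of None as the missing marker, and the hard-coded "generated" values (a stand-in for a generator's output) are all assumptions made for illustration; none of this is UGit's actual code, and the autoencoder, cross-view attention, and masking energy discriminator are not modeled here.

```python
from typing import List, Optional

Row = List[Optional[float]]  # None marks a missing value (assumed convention)


def merge_imputed(observed: Row, generated: List[float]) -> List[float]:
    """Keep each observed entry; use the generated value only where it is missing."""
    if len(observed) != len(generated):
        raise ValueError("observed and generated rows must have the same length")
    return [g if x is None else x for x, g in zip(observed, generated)]


def impute_views(views: List[Row], generated_views: List[List[float]]) -> List[List[float]]:
    """Apply the merge to every view of one sample, so all views are completed together."""
    if len(views) != len(generated_views):
        raise ValueError("expected one generated row per view")
    return [merge_imputed(v, g) for v, g in zip(views, generated_views)]


# One sample with two incomplete views.
views = [[1.0, None, 3.0], [None, 0.5]]
# Hypothetical generator output; a real model would produce these values.
generated = [[0.9, 2.1, 3.2], [0.4, 0.6]]

completed = impute_views(views, generated)
print(completed)
```

In this sketch the observed entries 1.0, 3.0, and 0.5 pass through untouched, and only the two missing positions take generated values.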