Imputation (statistics)
Missing data
Econometrics
Statistics
Value (mathematics)
Computer science
Mathematics
Authors
Long Chen, Changan Yuan, Yuan Ge, Xiaofeng Zhu
Abstract
With the rapid proliferation of social data, missing values have become increasingly common. Datasets containing missing values not only consume storage space but also cannot be used directly, resulting in substantial waste of resources. Generative missing value imputation methods, which use generative models to produce values for the missing components directly from the observed data values, have demonstrated notable efficacy in recent years. This paper introduces a novel generative method for missing value imputation based on a denoising diffusion model, termed the Conditional Diffusion Model for Missing Value Imputation (CDMVI). Specifically, CDMVI trains a conditional diffusion model on complete data samples and then uses the trained model to impute missing values in datasets. During the training stage, a subset of features is randomly selected from the complete data samples, and varying levels of random noise are introduced as condition inputs to the noise predictor within the diffusion model. In the imputation stage, the missing segments of the data are initially replaced with random noise, which serves as a guide for the diffusion model to generate complete samples. Experimental evaluations across multiple datasets demonstrate the competitive performance of the proposed CDMVI method.
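The two stages described in the abstract can be illustrated with a minimal NumPy sketch. This is only one possible reading of the abstract, not the authors' implementation: the function names `make_training_example` and `init_imputation`, the Bernoulli(0.5) feature selection, the linear noise schedule, and the choice to add noise to the selected subset while leaving the remaining features clean are all assumptions made for illustration. The noise predictor network itself is omitted.

```python
import numpy as np

def make_training_example(x, rng, num_steps=100, beta_min=1e-4, beta_max=0.02):
    """Build one training tuple from a complete sample x (1-D array).

    Hypothetical helper: the selection rule and noise schedule are assumptions.
    """
    d = x.shape[0]
    # Randomly select a subset of features to play the role of "missing" values.
    target_mask = rng.random(d) < 0.5
    # Linear beta schedule (assumed; the abstract does not specify one).
    betas = np.linspace(beta_min, beta_max, num_steps)
    alpha_bar = np.cumprod(1.0 - betas)
    # Draw a random diffusion step, i.e. a random noise level.
    t = int(rng.integers(0, num_steps))
    eps = rng.standard_normal(d)
    # Standard forward diffusion: scale the clean signal and add Gaussian noise.
    x_noisy = np.sqrt(alpha_bar[t]) * x + np.sqrt(1.0 - alpha_bar[t]) * eps
    # Selected features are noised; the rest stay clean as observed context.
    model_input = np.where(target_mask, x_noisy, x)
    # A noise predictor would be trained to recover eps on the selected features.
    return model_input, target_mask, t, eps

def init_imputation(x_partial, observed_mask, rng):
    """Replace missing entries with Gaussian noise before reverse diffusion."""
    noise = rng.standard_normal(x_partial.shape)
    return np.where(observed_mask, x_partial, noise)
```

In this sketch, a trained noise predictor would take `model_input`, `target_mask`, and `t` and be fitted to predict `eps` on the selected features; at imputation time, the output of `init_imputation` would be the starting point for iterative denoising.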