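Imputation (statistics)
Missing data
Computer science
Data mining
Time series
Data modeling
Regression
Raw data
Statistics
Machine learning
Mathematics
Database
Programming language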
Authors
Yang Hu,Ze Yang,Wenchang Hou
Source
Journal: IEEE Transactions on Knowledge and Data Engineering
[Institute of Electrical and Electronics Engineers]
Date: 2023-03-01
Volume (issue): 35 (3), pages 2837-2846
Identifier
DOI:10.1109/tkde.2021.3109115
Abstract
Missing data are widespread in both raw and processed data and imply a loss of information. In many cases, missing values must be imputed accurately before the data can be used further. This paper addresses an extreme case that is common in industrial processes: time series that vary with operating conditions, in which long runs of data are consecutively missing and only short segments remain. First, to make full use of the information in the remaining data, a similar-conditions screening scheme is provided, which improves imputation accuracy. Then, multiple receding imputation methods based on Gaussian process regression (GPR) and long short-term memory (LSTM) neural networks are proposed, from which generic multiple combination imputation and bidirectional imputation structures are derived. Finally, the proposed methods are applied to wind power data with extreme missingness, whose values depend on wind speed, and their imputation performance is carefully compared. Simulation results show that these methods are effective at imputing missing data in this extreme case, laying a foundation for data-driven applications.
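To make the idea of condition-dependent imputation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a synthetic wind-speed/power series, a hand-written GPR posterior mean with an RBF kernel, and a simple nearest-condition rule standing in for the similar-conditions screening. The function names (`screen_similar`, `gpr_mean`, `impute`), the parameter values, and the sigmoid power curve are all hypothetical. The receding, combination, and bidirectional structures named in the abstract are not reproduced here.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1-D arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def screen_similar(cond_obs, cond_query, k):
    """Indices of the k observed samples whose condition is closest to the query.

    Hypothetical stand-in for a similar-conditions screening step.
    """
    return np.argsort(np.abs(cond_obs - cond_query))[:k]


def gpr_mean(x_train, y_train, x_query, length_scale=1.0, noise=1e-2):
    """Posterior mean of a GP with RBF kernel, fitted to mean-centered targets."""
    mu = y_train.mean()
    K = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    alpha = np.linalg.solve(K, y_train - mu)
    return mu + rbf_kernel(x_query, x_train, length_scale) @ alpha


def impute(cond, target, k=20, length_scale=2.0, noise=1e-2):
    """Fill NaNs in `target` by regressing on `cond` over screened samples."""
    filled = target.copy()
    obs = ~np.isnan(target)
    cond_obs, y_obs = cond[obs], target[obs]
    for i in np.flatnonzero(~obs):
        idx = screen_similar(cond_obs, cond[i], min(k, len(y_obs)))
        filled[i] = gpr_mean(cond_obs[idx], y_obs[idx], cond[i:i + 1],
                             length_scale, noise)[0]
    return filled


# Synthetic example (assumed data, not from the paper).
rng = np.random.default_rng(0)
t = np.arange(500)
wind_speed = 8.0 + 4.0 * np.sin(t / 20.0) + rng.normal(0.0, 0.3, t.size)
power_true = 1.0 / (1.0 + np.exp(-(wind_speed - 8.0))) + rng.normal(0.0, 0.02, t.size)

# Extreme case: one long consecutive gap, only short segments remain.
power_obs = power_true.copy()
missing = np.zeros(t.size, dtype=bool)
missing[50:450] = True
power_obs[missing] = np.nan

filled = impute(wind_speed, power_obs)
mae = np.mean(np.abs(filled[missing] - power_true[missing]))
print(f"MAE on the missing block: {mae:.4f}")
```

The sketch regresses power on wind speed only and ignores temporal order, so it shows why screening by operating condition can help when few samples remain, but it says nothing about the accuracy reported in the paper.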