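Missing data
Outlier
Imputation (statistics)
Data mining
Computer science
Anomaly detection
Latent variable
Censoring (clinical trials)
Statistics
Mathematics
Pattern recognition (psychology)
Artificial intelligence
Authors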
Jinwook Rhyu,Dragana Bozinovski,Alexis B. Dubs,Naresh Mohan,Elizabeth M. Cummings Bende,Andrew J. Maloney,M. J. Nieves,José Sangerman,Amos E. Lu,Moo Sun Hong,Anastasia Artamonova,Rui Wen Ou,Paul W. Barone,James C. Leung,Jacqueline M. Wolfrum,Anthony J. Sinskey,Stacy L. Springs,Richard D. Braatz
Identifier
DOI:10.1016/j.compchemeng.2023.108448
Abstract
The majority of algorithms used for data imputation are based on latent variable methods. The presence of outliers in process data, however, distorts the estimated latent relations among variables, resulting in an inaccurate estimation of missing values. This article proposes an approach for automatically detecting outliers using T² and Q contributions and estimating missing data using various general-purpose algorithms while reducing the impact of outliers. The software is validated using biomanufacturing data from the production of a monoclonal antibody by Chinese hamster ovary cells in a perfusion bioreactor for five missingness cases: missing completely at random, sensor drop-out, multi-rate, patterned, and censoring. Based on the normalized root mean squared error and three proposed metrics corresponding to feasibility, plausibility, and rapidity, respectively, matrix completion methods are the most effective, except for the censoring case, in which probabilistic principal component analysis-based methods are the most effective.
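The T² and Q statistics and the normalized root mean squared error named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' software: the function names `pca_t2_q` and `nrmse` are hypothetical, the data are synthetic and complete (no missing entries), the number of principal components is fixed by hand, and normalizing the RMSE by the standard deviation of the true values is an assumption, since the abstract does not state the normalization.

```python
import numpy as np

def pca_t2_q(X, n_components):
    """Hotelling's T^2, Q (squared prediction error), and per-variable
    Q contributions from a PCA model fitted to complete data X."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0, ddof=1)
    Z = (X - mu) / sigma                          # autoscale each variable
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T                       # loadings (variables x components)
    lam = s[:n_components] ** 2 / (Z.shape[0] - 1)  # variance of each score
    T = Z @ P                                     # scores
    t2 = np.sum(T ** 2 / lam, axis=1)             # distance inside the model plane
    E = Z - T @ P.T                               # residual outside the model plane
    q_contrib = E ** 2                            # per-variable share of Q
    q = q_contrib.sum(axis=1)
    return t2, q, q_contrib

def nrmse(true, est):
    """RMSE divided by the standard deviation of the true values
    (assumed normalization, for illustration only)."""
    true = np.asarray(true, dtype=float)
    est = np.asarray(est, dtype=float)
    return np.sqrt(np.mean((true - est) ** 2)) / np.std(true)

# Synthetic example: six variables driven by two latent factors.
rng = np.random.default_rng(0)
latent = rng.standard_normal((200, 2))
W = np.array([[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])
X_clean = latent @ W + 0.1 * rng.standard_normal((200, 6))

X = X_clean.copy()
X[10, 2] += 8.0                                   # inject one gross outlier

t2, q, q_contrib = pca_t2_q(X, n_components=2)
suspect_row = int(np.argmax(q))                   # sample with largest Q
suspect_var = int(np.argmax(q_contrib[suspect_row]))  # variable driving it
print(suspect_row, suspect_var)
```

In this sketch the sample with the largest Q is flagged, and its per-variable contributions point to the variable that breaks the latent correlation structure. A practical implementation would compare T² and Q against control limits rather than take the maximum, and would need to handle missing entries when fitting the model.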