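Missing data
Imputation (statistics)
Data mining
Cluster analysis
Computer science
Singular value decomposition
Robustness (evolution)
Pattern recognition (psychology)
Algorithm
Artificial intelligence
Machine learning
Biology
Gene
Biochemistry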
Authors
Olga G. Troyanskaya, Michael Cantor, Gavin Sherlock, Pat Brown, Trevor Hastie, Robert Tibshirani, David Botstein, Russ B. Altman
Source
Journal: Bioinformatics
[Oxford University Press]
Date: 2001-06-01
Volume (issue): 17 (6): 520-525
Citations: 4187
Identifier
DOI:10.1093/bioinformatics/17.6.520
Abstract
We present a comparative study of several methods for the estimation of missing values in gene microarray data. We implemented and evaluated three methods: a Singular Value Decomposition (SVD) based method (SVDimpute), weighted K-nearest neighbors (KNNimpute), and row average. We evaluated the methods using a variety of parameter settings and over different real data sets, and assessed the robustness of the imputation methods to the amount of missing data over the range of 1-20% missing values. We show that KNNimpute appears to provide a more robust and sensitive method for missing value estimation than SVDimpute, and both SVDimpute and KNNimpute surpass the commonly used row average method (as well as filling missing values with zeros). We report results of the comparative experiments and provide recommendations and tools for accurate estimation of missing microarray data under a variety of conditions.
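
The abstract names the methods but does not spell out how they work. The sketch below illustrates two of them, row average and a weighted K-nearest-neighbors fill, on a small gene-by-experiment matrix. It is not the authors' implementation: the function names, the use of NaN to mark missing entries, the root-mean-square distance over jointly observed columns, and the inverse-distance weighting are all assumptions made for illustration.

```python
import math


def row_average_impute(rows):
    """Fill each missing entry (NaN) with the mean of the observed values in its row."""
    filled = []
    for row in rows:
        observed = [v for v in row if not math.isnan(v)]
        # Assumption: a row with no observed values stays NaN.
        mean = sum(observed) / len(observed) if observed else math.nan
        filled.append([mean if math.isnan(v) else v for v in row])
    return filled


def knn_impute(rows, k=2, eps=1e-12):
    """Fill each missing entry from the k most similar rows that have that column observed."""
    filled = [list(row) for row in rows]
    for i, target in enumerate(rows):
        for j, value in enumerate(target):
            if not math.isnan(value):
                continue
            candidates = []
            for r, other in enumerate(rows):
                # A neighbor must be a different row with column j observed.
                if r == i or math.isnan(other[j]):
                    continue
                # Compare only columns observed in both rows (column j drops out
                # automatically because it is NaN in the target row).
                shared = [(a, b) for a, b in zip(target, other)
                          if not math.isnan(a) and not math.isnan(b)]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            if not candidates:
                continue  # no usable neighbor: leave the entry as NaN
            candidates.sort()
            nearest = candidates[:k]
            # Assumed weighting: closer rows contribute more (inverse distance).
            weights = [1.0 / (d + eps) for d, _ in nearest]
            filled[i][j] = sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
    return filled


# Hypothetical toy matrix: 3 genes (rows) x 3 experiments (columns).
data = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, math.nan],
    [10.0, 20.0, 30.0],
]
by_row_mean = row_average_impute(data)
by_knn = knn_impute(data, k=1)
```

On this toy matrix the second gene matches the first exactly in the two observed experiments, so the nearest-neighbor fill borrows that gene's value, while the row average can only use the second gene's own two measurements. This is the intuition behind the abstract's comparison, although the toy example says nothing about performance on real data.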