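Outlier
Missing data
Statistics
Standard deviation
Reliability (semiconductor)
Medicine
Sample (material)
Statistical power
Data mining
Econometrics
Power (physics)
Computer science
Mathematics
Chromatography
Quantum mechanics
Physics
Chemistry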
Authors
Sang Gyu Kwak, Jong Hae Kim
Identifier
DOI: 10.4097/kjae.2017.70.4.407
Abstract
Missing values and outliers are frequently encountered while collecting data. The presence of missing values reduces the data available to be analyzed, compromising the statistical power of the study, and eventually the reliability of its results. In addition, it causes a significant bias in the results and degrades the efficiency of the data. Outliers significantly affect the process of estimating statistics (e.g., the average and standard deviation of a sample), resulting in overestimated or underestimated values. Therefore, the results of data analysis are considerably dependent on the ways in which the missing values and outliers are processed. In this regard, this review discusses the types of missing values, ways of identifying outliers, and dealing with the two.
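The abstract's points about missing values and outliers can be illustrated with a small sketch. The code below is not taken from the review; it is a minimal illustration that assumes listwise deletion for missing values and the common 1.5 × IQR rule for flagging outliers. The sample data and the helper names (`drop_missing`, `summarize`, `iqr_outliers`) are hypothetical.

```python
import statistics

def drop_missing(values):
    """Listwise deletion: keep only observed (non-None) values."""
    return [v for v in values if v is not None]

def summarize(values):
    """Return the sample mean and sample standard deviation."""
    return statistics.mean(values), statistics.stdev(values)

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule, k=1.5)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical measurements; None marks a missing value.
raw = [10, 12, None, 11, 13, 12, None, 11, 10, 12]
clean = drop_missing(raw)      # fewer observations remain for analysis
with_outlier = clean + [95]    # one extreme value added

mean_clean, sd_clean = summarize(clean)
mean_out, sd_out = summarize(with_outlier)

print(f"observed: {len(clean)} of {len(raw)}")
print(f"without outlier: mean={mean_clean:.2f}, sd={sd_clean:.2f}")
print(f"with outlier:    mean={mean_out:.2f}, sd={sd_out:.2f}")
print("flagged:", iqr_outliers(with_outlier))
```

In this toy sample, a single extreme value raises both the mean and the standard deviation, while dropping missing entries shrinks the sample available for analysis. Which deletion or imputation strategy and which outlier rule are appropriate depends on the data, which is the subject of the review itself.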