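Missing data
Imputation (statistics)
Artificial neural network
Computer science
Data mining
Mean squared error
Artificial intelligence
Support vector machine
Naive Bayes classifier
Pattern recognition (psychology)
Machine learning
Statistics
Mathematics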
Authors
Anita Desiani,Novi Rustiana Dewi,Annisa Nur Fauza,Muhammad Naufal Rachmatullah,Muhammad Arhami,Muhammad Nawawi
Identifier
DOI:10.26554/sti.2021.6.4.303-312
Abstract
The University of California Irvine Heart Disease dataset has missing data in several attributes. Missing data can cause the loss of important information in those attributes, but the affected data cannot simply be deleted from the dataset. There are several ways to handle missing data, including deletion, imputation by mean or mode, and imputation with prediction methods. In this study, an attribute was deleted if more than 70% of its data were missing. Attributes with 1% or less missing data were imputed with the mean or the mode. An artificial neural network was used to impute attributes with more than 1% missing data. These techniques were evaluated through the performance of classification methods applied to the dataset after the missing data had been handled. The classification methods used were Artificial Neural Network, Naïve Bayes, Support Vector Machine, and K-Nearest Neighbor. The performance of the classification methods without missing-data handling was compared with their performance after imputation in terms of accuracy, sensitivity, specificity, and ROC. In addition, the Mean Squared Error was compared to see how close the labels predicted by each classifier were to the original labels. The lowest Mean Squared Error was obtained by the Artificial Neural Network, which means that the Artificial Neural Network worked very well on the dataset with missing data handled, compared to the other methods. The accuracy, specificity, and sensitivity of each classification method showed that imputing missing data could increase classification performance, especially for the Artificial Neural Network method.
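The threshold rule described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the use of `None` to mark a missing value, the choice of mean for numeric and mode for categorical attributes, and the "keep" case for attributes with no missing data are all assumptions. The prediction-based step (an artificial neural network in the study) is only labeled here, not implemented.

```python
from statistics import mean, mode

DROP_THRESHOLD = 0.70    # abstract: delete attributes with more than 70% missing
SIMPLE_THRESHOLD = 0.01  # abstract: mean/mode imputation at 1% missing or less


def missing_fraction(column):
    """Share of entries in a column that are None (assumed missing marker)."""
    return sum(v is None for v in column) / len(column)


def choose_strategy(column, is_numeric):
    """Pick a handling strategy for one attribute from its missing fraction."""
    frac = missing_fraction(column)
    if frac > DROP_THRESHOLD:
        return "drop"
    if frac == 0:
        return "keep"  # assumption: nothing to do for complete attributes
    if frac <= SIMPLE_THRESHOLD:
        # Assumption: mean for numeric attributes, mode for categorical ones.
        return "mean" if is_numeric else "mode"
    return "model"  # the study uses an artificial neural network here


def impute_simple(column, strategy):
    """Fill missing entries with the mean or mode of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed) if strategy == "mean" else mode(observed)
    return [fill if v is None else v for v in column]


def mean_squared_error(y_true, y_pred):
    """Average squared difference between original and predicted labels."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

For example, `choose_strategy([1.0] * 99 + [None], is_numeric=True)` selects mean imputation, because exactly 1% of the values are missing, while a column with 20% missing values is routed to the prediction-based step.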