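Oversampling
Computer science
Preprocessor
Data mining
Machine learning
Artificial intelligence
Pattern recognition (psychology)
Computer network
Bandwidth (computing)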
Authors
Min Zeng,Beiji Zou,Faran Wei,Xiyao Liu,Lei Wang
Identifier
DOI:10.1109/icoacs.2016.7563084
Abstract
Diabetes, vertebral column pathologies, and Parkinson's disease are three common, highly prevalent diseases that cause great trouble and pain to billions of patients. Computer-aided diagnosis can support physicians' decision making. However, the imbalanced nature of the data sets hampers the mining of medical resources. In this study, we propose a preprocessing method that combines the Synthetic Minority Oversampling Technique (SMOTE) with the Tomek links technique and apply it to imbalanced medical data sets for the three diseases. Using 8 classifiers, we compare the experimental results with those obtained using SMOTE alone to evaluate the effectiveness of the method. The results show that SMOTE combined with Tomek links clearly outperforms SMOTE alone: 31, 27, and 30 of a total of 32 evaluation metrics improve for diabetes, Parkinson's disease, and vertebral column pathologies, respectively.
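The abstract does not give implementation details, so the following is only a minimal sketch of what combining SMOTE with Tomek links can look like, not the authors' code. It assumes Euclidean distance, features stored as tuples of floats, and that both members of each Tomek link are removed after oversampling; the function names (`smote`, `tomek_links`, `smote_tomek`) and the parameters `n_new`, `k`, and `seed` are hypothetical choices made for illustration.

```python
import math
import random


def smote(X, y, minority, n_new, k=3, seed=0):
    """Append n_new synthetic minority samples.

    Each synthetic point is a random interpolation between a minority
    sample and one of its k nearest minority neighbours.
    Assumes at least two minority samples exist.
    """
    rng = random.Random(seed)
    idx = [i for i, label in enumerate(y) if label == minority]
    X_out, y_out = list(X), list(y)
    for _ in range(n_new):
        i = rng.choice(idx)
        # k nearest neighbours of sample i within the minority class
        neighbours = sorted((j for j in idx if j != i),
                            key=lambda j: math.dist(X[i], X[j]))[:k]
        j = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from X[i] to X[j]
        X_out.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
        y_out.append(minority)
    return X_out, y_out


def tomek_links(X, y):
    """Return pairs (i, j), i < j, that have different labels and are
    each other's nearest neighbour over the whole data set."""
    n = len(X)
    nn = [min((j for j in range(n) if j != i),
              key=lambda j: math.dist(X[i], X[j]))
          for i in range(n)]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


def smote_tomek(X, y, minority, n_new, k=3, seed=0):
    """Oversample with SMOTE, then drop every sample that takes part in
    a Tomek link (assumption: both members of a link are removed)."""
    X_s, y_s = smote(X, y, minority, n_new, k, seed)
    drop = {i for pair in tomek_links(X_s, y_s) for i in pair}
    keep = [i for i in range(len(X_s)) if i not in drop]
    return [X_s[i] for i in keep], [y_s[i] for i in keep]
```

A single cleaning pass is used here; removing links can expose new ones, so a data set cleaned this way is not guaranteed to be free of Tomek links. In practice a maintained library implementation would normally be preferred over a hand-written one.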