An enhanced Predictive heterogeneous ensemble model for breast cancer prediction

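Random forest, Support vector machine, Artificial intelligence, Computer science, Naive Bayes classifier, Machine learning, Ensemble learning, Breast cancer, Logistic regression, Decision tree, AdaBoost, Pattern recognition (psychology), Cancer, Medicine, Internal medicine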
Authors
S. Nanglia,Muneer Ahmad,Fawad Khan,N. Z. Jhanjhi
Source
Journal: Biomedical Signal Processing and Control [Elsevier BV]
Volume: 72: 103279-103279 Citations: 127
Identifier
DOI:10.1016/j.bspc.2021.103279
Abstract

Breast cancer is one of the most prevalent tumors after lung cancer and is common in both women and men. The disease is mostly asymptomatic in the early stages, so detection is difficult, and it becomes complicated and expensive to treat in later stages, resulting in increased fatality rates. Comparatively few studies have investigated breast cancer prediction using ensemble learning rather than single-classifier approaches. This paper presents a heterogeneous ensemble machine learning approach to detect breast cancer in the early stages. The proposed approach follows the CRISP-DM process and uses Stacking to build the ensemble model from three different algorithms: K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Decision Tree (DT). The performance of this meta-classifier is compared with the individual performances of its base classifiers (KNN, SVM, DT), with other single classifiers, namely Logistic Regression (LR), Artificial Neural Network (ANN), Naïve Bayes (NB), and Stochastic Gradient Descent (SGD), and with a homogeneous ensemble model, Random Forest (RF). The top 5 features (Glucose, Resistin, HOMA, Insulin, and BMI) are derived using Chi-Square. Evaluating the model helps in estimating its suitability for early breast cancer prediction using only anthropometric data. Model performance is compared using accuracy, AUC, ROC curve, F1-score, precision, recall, log loss, and specificity under K-fold cross-validation with 2, 3, 5, 10, and 20 folds. The proposed ensemble model achieved the highest accuracy of 78% with the lowest log loss of 0.56 at K = 20, thus rejecting the null hypothesis. The p-value derived from the one-tailed t-test is 0.014, which is below the significance level α = 0.05.
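
Several of the evaluation metrics named in the abstract (accuracy, precision, recall, specificity, F1-score, log loss) and the K-fold split can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the 0.5 decision threshold, the unshuffled contiguous folds, and the toy labels and probabilities are all hypothetical choices made for illustration, and no data from the paper is used.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def _ratio(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(y_true, y_pred):
    """Compute threshold-based metrics from hard 0/1 predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,                      # also called sensitivity
        "specificity": _ratio(tn, tn + fp),    # true negative rate
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


def log_loss(y_true, p_pos, eps=1e-15):
    """Mean binary cross-entropy; p_pos holds predicted P(class = 1)."""
    total = 0.0
    for t, p in zip(y_true, p_pos):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous, near-equal test folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Hypothetical toy labels and predicted probabilities (not from the paper).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
p_pos = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3]
y_pred = [1 if p >= 0.5 else 0 for p in p_pos]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
loss = log_loss(y_true, p_pos)
folds = k_fold_indices(len(y_true), k=2)
```

In a cross-validated comparison such as the one the abstract describes, each fold would serve once as the test set and the per-fold metrics would be averaged; in practice a library implementation with shuffling or stratification would normally replace the simple split shown here.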