Machine learning-based statistical analysis for early stage detection of cervical cancer

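Random forest, Computer science, Artificial intelligence, Feature selection, Cervical cancer, Machine learning, Transformation (genetics), Pattern recognition (psychology), Decision tree, Tree (set theory), Cancer, Mathematics, Medicine, Gene, Internal medicine, Mathematical analysis, Biochemistry, Chemistry
Authors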
Mamun Ali,Kawsar Ahmed,Francis M. Bui,Iraj Sadegh Amiri,Syed Muhammad Ibrahim,Julian M.W. Quinn,Mohammad Ali Moni
Source
Journal: Computers in Biology and Medicine [Elsevier BV]
Volume/Issue: 139: 104985-104985 Citations: 35
Identifier
DOI:10.1016/j.compbiomed.2021.104985
Abstract

Cervical cancer (CC) is one of the most common types of cancer in women and remains a significant cause of mortality, particularly in less developed countries, although it can be effectively treated if detected at an early stage. This study aimed to find efficient machine-learning-based classification models to detect early stage CC using clinical data. We obtained a CC dataset from the Kaggle data repository that contained four class attributes: biopsy, cytology, Hinselmann, and Schiller. This dataset was split into four datasets based on these class attributes. Three feature transformation methods (logarithmic, sine function, and Z-score) were applied to these datasets. Several supervised machine learning algorithms were assessed for their classification performance. A Random Tree (RT) algorithm provided the best classification accuracy for the biopsy (98.33%) and cytology (98.65%) data, whereas Random Forest (RF) and Instance-Based K-nearest neighbor (IBk) provided the best performance for Hinselmann (99.16%) and Schiller (98.58%), respectively. Among the feature transformation methods, logarithmic gave the best performance for the biopsy dataset, whereas the sine function was superior for cytology. Both the logarithmic and sine functions performed best for the Hinselmann dataset, while Z-score was best for the Schiller dataset. Various feature selection techniques (FST) were applied to the transformed datasets to identify and prioritize important risk factors. The outcomes of this study indicate that, with appropriate system design and tuning, machine learning classification methods are able to detect CC accurately and efficiently in its early stages using clinical data.
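
The three feature transformations named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the toy feature matrix, and the use of log(1 + x) to keep zero-valued entries finite are all assumptions made for illustration, since the abstract does not state how the logarithm was applied.

```python
import numpy as np

def log_transform(x):
    # Assumption: log(1 + x) so that zero-valued features stay finite;
    # the abstract only says "log", not which variant was used.
    return np.log1p(x)

def sine_transform(x):
    # Element-wise sine of each feature value.
    return np.sin(x)

def zscore_transform(x):
    # Standardize each column to zero mean and unit variance.
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    # Guard against constant columns, which would otherwise divide by zero.
    std = np.where(std == 0, 1.0, std)
    return (x - mean) / std

# Hypothetical toy matrix (rows = patients, columns = features); not real data.
features = np.array([
    [18.0, 0.0, 1.0],
    [35.0, 2.0, 1.0],
    [52.0, 4.0, 1.0],
])

transformed = {
    "log": log_transform(features),
    "sine": sine_transform(features),
    "zscore": zscore_transform(features),
}
```

Each transformed matrix keeps the shape of the input, so the same classifier can be trained on each variant and the resulting accuracies compared, which is the kind of comparison the abstract reports.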
