Boosting(机器学习)
分类器(UML)
概化理论
人工智能
计算机科学
机器学习
训练集
梯度升压
宫颈癌
模式识别(心理学)
医学
数学
统计
癌症
随机森林
内科学
作者
Youyi Song,Jing Zou,Kup‐Sze Choi,Baiying Lei,Jing Qin
标识
DOI:10.1016/j.media.2023.103014
摘要
Cell classification underpins intelligent cervical cancer screening, a cytology examination that effectively decreases both the morbidity and mortality of cervical cancer. This task, however, is rather challenging, mainly due to the difficulty of collecting a training dataset representative sufficiently of the unseen test data, as there are wide variations of cells' appearance and shape at different cancerous statuses. This difficulty makes the classifier, though trained properly, often classify wrongly for cells that are underrepresented by the training dataset, eventually leading to a wrong screening result. To address it, we propose a new learning algorithm, called worse-case boosting, for classifiers effectively learning from under-representative datasets in cervical cell classification. The key idea is to learn more from worse-case data for which the classifier has a larger gradient norm compared to other training data, so these data are more likely to correspond to underrepresented data, by dynamically assigning them more training iterations and larger loss weights for boosting the generalizability of the classifier on underrepresented data. We achieve this idea by sampling worse-case data per the gradient norm information and then enhancing their loss values to update the classifier. We demonstrate the effectiveness of this new learning algorithm on two publicly available cervical cell classification datasets (the two largest ones to the best of our knowledge), and positive results (4% accuracy improvement) yield in the extensive experiments. The source codes are available at: https://github.com/YouyiSong/Worse-Case-Boosting.
科研通智能强力驱动
Strongly Powered by AbleSci AI