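Random forest
Artificial intelligence
Machine learning
Algorithm
Urine
Classifier (UML)
Gradient boosting
Biomarker
Boosting (machine learning)
Mass spectrometry
Computer science
Chemistry
Chromatography
Biochemistry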
Authors
Ting Zeng, Yanshan Liang, Qingyuan Dai, Jinglin Tian, Jinyao Chen, Bo Lei, Zhu Yang, Zongwei Cai
Identifier
DOI:10.1016/j.cclet.2022.03.020
Abstract
Exposure to environmental cadmium increases health risks for residents. Early metabolic profiling of urine using high-resolution mass spectrometry and machine learning algorithms could help predict adverse health effects. Here, we applied machine learning approaches to screen for potential biomarkers of cadmium exposure in 403 urine samples. In positive and negative ionization modes, 4207 and 3558 features were extracted, respectively. We compared seven machine learning algorithms and found that the extreme gradient boosting (XGBoost) and random forest (RF) classifiers showed better accuracy and predictive performance than the others. Following 5-fold cross-validation, the area under the curve (AUC) of the XGBoost classifier was 0.93 in both positive and negative ionization modes. For the RF classifier, the AUCs were 0.80 and 0.84 in positive and negative ionization modes, respectively. We then identified a biomarker panel based on the XGBoost and RF classifiers. Incorporating machine learning models into urine analysis by high-resolution mass spectrometry could allow convenient assessment of cadmium exposure.
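The abstract reports AUC values obtained after 5-fold cross-validation. As a rough illustration of how such a figure can be computed, the following is a minimal Python sketch using only the standard library. It is not the authors' pipeline: the nearest-centroid scorer is a deliberately simple stand-in for the XGBoost and RF classifiers, the data are synthetic rather than urine metabolite features, and all function names (`roc_auc`, `stratified_folds`, `centroid_scorer`, `cross_validated_auc`) are hypothetical.

```python
import random


def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds, keeping class proportions similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def centroid_scorer(X_train, y_train):
    """Stand-in model (assumption): score a sample by how much closer it is
    to the exposed-class centroid than to the control-class centroid."""
    def centroid(cls):
        rows = [x for x, y in zip(X_train, y_train) if y == cls]
        return [sum(col) / len(rows) for col in zip(*rows)]

    c_pos, c_neg = centroid(1), centroid(0)

    def score(x):
        d_pos = sum((a - b) ** 2 for a, b in zip(x, c_pos))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, c_neg))
        return d_neg - d_pos  # larger means more likely class 1

    return score


def cross_validated_auc(X, y, fit, k=5, seed=0):
    """Fit on k-1 folds, compute AUC on the held-out fold, and average."""
    aucs = []
    for fold in stratified_folds(y, k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        score = fit([X[i] for i in train], [y[i] for i in train])
        aucs.append(roc_auc([y[i] for i in fold], [score(X[i]) for i in fold]))
    return sum(aucs) / k


# Synthetic data (assumption): 200 samples, 5 features, class 1 shifted by +1.
rng = random.Random(42)
y = [i % 2 for i in range(200)]
X = [[rng.gauss(1.0 if label else 0.0, 1.0) for _ in range(5)] for label in y]

mean_auc = cross_validated_auc(X, y, centroid_scorer)
print(f"mean 5-fold AUC: {mean_auc:.2f}")
```

Any model that returns a per-sample score could be passed as `fit` in place of the stand-in; keeping the held-out fold out of training is what makes the averaged AUC an estimate of predictive performance rather than of fit to the training data.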