Applications of Machine Learning to In Silico Quantification of Chemicals without Analytical Standards

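Bioinformatics, Chemistry, Machine learning, Biochemical engineering, Artificial intelligence, Computer science, Engineering, Genes, Biochemistry
Authors
Dimitri Abrahamsson, June-Soo Park, Randolph R. Singh, Marina Sirota, Tracey J. Woodruff
Source
Journal: Journal of Chemical Information and Modeling [American Chemical Society]
Volume (issue): 60 (6): 2718-2727 Cited by: 42
Identifier
DOI: 10.1021/acs.jcim.9b01096
Abstract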

Non-targeted analysis provides a comprehensive approach to analyzing environmental and biological samples for nearly all chemicals present. One of the main shortcomings of current analytical methods and workflows is that they are unable to provide any quantitative information, which is an important obstacle to understanding environmental fate and human exposure. Herein, we present an in silico quantification method using machine learning for chemicals analyzed using electrospray ionization (ESI). We considered three data sets from different instrumental setups: (i) capillary electrophoresis electrospray ionization-mass spectrometry (CE-MS) in positive ionization mode (ESI+), (ii) liquid chromatography quadrupole time-of-flight mass spectrometry (LC-QTOF/MS) in ESI+, and (iii) LC-QTOF/MS in negative ionization mode (ESI−). We developed and applied two different machine-learning algorithms, a random forest (RF) and an artificial neural network (ANN), to predict the relative response factors (RRFs) of different chemicals based on their physicochemical properties. Chemical concentrations can then be calculated by dividing the measured abundance of a chemical, as peak area or peak height, by its corresponding RRF. We evaluated our models and tested their predictive power using 5-fold cross-validation (CV) and y-randomization. Both the RF and the ANN models showed great promise in predicting RRFs. However, the accuracy of the predictions was dependent on the data set composition and the experimental setup. For the CE-MS ESI+ data set, the best model predicted measured RRFs with a mean absolute error (MAE) of 0.19 log units and a cross-validation coefficient of determination (Q²) of 0.84 for the testing set. For the LC-QTOF/MS ESI+ data set, the best model predicted measured RRFs with an MAE of 0.32 and a Q² of 0.40. For the LC-QTOF/MS ESI− data set, the best model predicted measured RRFs with an MAE of 0.50 and a Q² of 0.20. Our findings suggest that machine-learning algorithms can be used for predicting concentrations of non-targeted chemicals with reasonable uncertainties, especially in ESI+, while the application to ESI− remains a more challenging problem.
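
The quantification step and the reported metrics can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, the logarithm is assumed to be base 10, and Q² is assumed to follow the common definition 1 − PRESS/SS_tot computed on cross-validated predictions; the abstract does not state either detail. The units of the resulting concentration depend on how the RRF was calibrated, which the abstract also leaves open.

```python
import math


def concentration_from_rrf(abundance: float, rrf: float) -> float:
    """Estimate a concentration as measured abundance (peak area or height) / RRF."""
    if rrf <= 0:
        raise ValueError("RRF must be positive")
    return abundance / rrf


def mae_log10(measured_rrf, predicted_rrf) -> float:
    """Mean absolute error between measured and predicted RRFs in log10 units.

    Assumption: base-10 logarithm; the abstract only says "log units".
    """
    errors = [
        abs(math.log10(m) - math.log10(p))
        for m, p in zip(measured_rrf, predicted_rrf)
    ]
    return sum(errors) / len(errors)


def fold_factor(log10_error: float) -> float:
    """Convert an error in log10 units into a multiplicative (fold) factor."""
    return 10.0 ** log10_error


def q2(observed, cv_predicted) -> float:
    """Q² = 1 - PRESS / SS_tot, with predictions taken from held-out CV folds.

    Assumption: this is the conventional definition, not necessarily the paper's.
    """
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, cv_predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / ss_tot


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(concentration_from_rrf(abundance=5000.0, rrf=2.5))
    print(round(fold_factor(0.19), 2))
```

Under the base-10 assumption, an MAE of 0.19 log units corresponds to a typical error factor of about 1.5 in the predicted RRF, and therefore in the estimated concentration, while an MAE of 0.50 corresponds to a factor of about 3.2.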