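Task (project management)
Computer science
Toxicity
Artificial intelligence
Machine learning
Natural language processing
Engineering
Chemistry
Systems engineering
Organic chemistry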
Authors
Zhichao Tan,Youcai Zhao,Kunsen Lin,Tao Zhou
Identifier
DOI:10.1016/j.jhazmat.2024.135265
Abstract
In silico models for screening substances of health and ecological concern are essential for effective chemical management. However, current data-driven toxicity prediction models face formidable challenges in expressive capacity, data scarcity, and reliability. This study therefore introduces TOX-BERT, a SMILES-based pretrained model for screening health and ecological toxicity. Results show that masked atom recovery pretraining and multi-task learning offer promising solutions to enhance model capacity and address data scarcity. Two novel application domain (AD) parameters, termed PCA-AD and LDS, were proposed to improve the prediction reliability of TOX-BERT, with accuracy surpassing 90% and mean absolute error (MAE) below 0.52. TOX-BERT was applied to 18,905 IECSC chemicals, revealing distinct toxicity relationships that align with experimental studies, such as those between cardiotoxicity and acute ecotoxicity. In addition to previous PBT screening, 156 potential high-risk chemicals for specific endpoints were identified, covering 7 categories. Furthermore, a SMILES-based toxicity site detection approach was developed for structural toxicity analysis. These advancements carry profound implications for addressing the challenges faced by current data-driven toxicity prediction models. TOX-BERT emerges as a valuable tool for more comprehensive, reliable, and applicable predictions of health and ecological toxicity in chemical risk assessment and management.