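Toxicity
Bioinformatics
Chemical toxicity
Computer science
Machine learning
Artificial intelligence
Drug discovery
Binary classification
Training set
Computational biology
Chemistry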
Biology
Support vector machine
Biochemistry
Organic chemistry
Gene
Authors
Zhiyuan Wang, Piaopiao Zhao, Xiaoxiao Zhang, Xuan Xu, Weihua Li, Guixia Liu, Yun Tang
Identifier
DOI: 10.1016/j.comtox.2021.100155
Abstract
Chemical respiratory toxicity can cause serious harm to the human body, so it is necessary to identify drugs or compounds with potential respiratory toxicity at an early stage of drug discovery. In this study, we collected 2,529 compounds from public databases and the literature, and used six machine learning methods together with nine types of molecular fingerprints to construct a series of binary classification models for predicting chemical respiratory toxicity. The best-performing model achieved an accuracy of 0.869 on the test set and 0.933 on the external validation set. We also defined the applicability domain of the models based on molecular similarity. In addition, we identified structural alerts for chemical respiratory toxicity through information gain and substructure frequency analysis, which could be used to elucidate toxicity mechanisms and to optimize structures toward lower toxicity. Our study should be helpful for predicting chemical respiratory toxicity in early-stage drug discovery and in environmental risk assessment.
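The two analysis ideas named in the abstract, a similarity-based applicability domain and information gain for ranking substructures, can be illustrated with a minimal Python sketch. This is not the authors' implementation: fingerprints are represented here as plain sets of on-bit indices, the nearest-neighbour rule and the 0.5 threshold in `in_domain` are assumptions chosen for illustration, and all function names are hypothetical.

```python
from math import log2


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def in_domain(query, training, threshold=0.5):
    """Assumed rule: a query is inside the applicability domain when its most
    similar training compound reaches the similarity threshold."""
    return max(tanimoto(query, fp) for fp in training) >= threshold


def entropy(labels):
    """Shannon entropy (in bits) of a list of binary labels (1 = toxic)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return -sum(q * log2(q) for q in (p, 1 - p) if q > 0)


def information_gain(has_substructure, labels):
    """Reduction in label entropy from splitting on substructure presence."""
    n = len(labels)
    present = [y for h, y in zip(has_substructure, labels) if h]
    absent = [y for h, y in zip(has_substructure, labels) if not h]
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional
```

A substructure that appears only in toxic compounds yields the maximum gain for a balanced data set, while one spread evenly across both classes yields none, which is why information gain is a natural first filter before inspecting substructure frequencies.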