Prediction of Drug-Induced Liver Injury: From Molecular Physicochemical Properties and Scaffold Architectures to Machine Learning Approaches

Scaffold; Liver injury; Drug; Computer science; Nanotechnology; Materials science; Pharmacology; Medicine; Programming language

Authors

Yulong Zhao, Zhoudong Zhang, Kai Wang, Jie Jia, Yaxuan Wang, Huanqiu Li, Xiaotian Kong, Sheng Tian

Source

Journal: Research Square; Date: 2024-04-18

Links

researchsquare.com, doi.org

Identifier

DOI: 10.21203/rs.3.rs-4268191/v1

Abstract

Developing new drugs is widely acknowledged to be time-intensive and costly. Despite ongoing efforts to reduce the time and expense of drug development, ensuring medication safety remains an urgent problem. One of the major safety problems in drug development is hepatotoxicity, specifically drug-induced liver injury (DILI). DILI often poses a significant barrier during development and frequently leads to the recall of drugs after launch. In silico methods have many advantages over traditional in vivo and in vitro assays. Establishing a more precise and reliable prediction model requires an extensive, high-quality database of drug molecule properties and structural patterns, together with carefully selected molecular descriptors that accurately depict compound characteristics. The aim of this study was to conduct a comprehensive investigation into the prediction of DILI. First, we compared the physicochemical properties of an extensive, well-prepared set of DILI-positive and DILI-negative compounds. Then, we used classic substructure dissection methods to identify differences in structural patterns between the two classes of molecules. These analyses indicate that it is not feasible to establish property-based or substructure-based rules for distinguishing DILI-positive from DILI-negative compounds. Finally, we developed quantitative classification models for predicting DILI using the naïve Bayes classifier (NBC) and recursive partitioning (RP) machine learning techniques. The optimal DILI prediction model was obtained using NBC, combining 21 physicochemical properties, the VolSurf descriptors, and the LCFP_10 fingerprint set. This model achieved a global accuracy (GA) of 0.855 and an area under the curve (AUC) of 0.704 for the training set; the corresponding values for the test set were 0.619 and 0.674, respectively. Moreover, indicative substructural fragments favorable or unfavorable for DILI were identified from the best naïve Bayesian classification model. These findings may help prioritize lead compounds in the early stages of drug development pipelines.
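
To make the modeling step in the abstract more concrete, the following is a minimal sketch of a Bernoulli naïve Bayes classifier over binary fingerprint bits, together with the two reported metrics (global accuracy and ROC AUC). It is not the authors' implementation: the 21 physicochemical properties, VolSurf descriptors, and LCFP_10 fingerprints are not reproduced here, and the four-bit vectors, labels, and all function names are hypothetical, chosen only for illustration.

```python
import math


def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on binary feature vectors."""
    n_features = len(X[0])
    model = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        # Laplace-smoothed probability that each bit is set within this class.
        p_bit = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[label] = (math.log(len(rows) / len(X)), p_bit)
    return model


def positive_score(model, x):
    """Log-odds that x belongs to class 1 rather than class 0."""
    def log_joint(label):
        log_prior, p_bit = model[label]
        return log_prior + sum(
            math.log(p if bit else 1.0 - p) for bit, p in zip(x, p_bit)
        )
    return log_joint(1) - log_joint(0)


def global_accuracy(y_true, scores, threshold=0.0):
    """Fraction of compounds whose thresholded score matches the label."""
    preds = [1 if s > threshold else 0 for s in scores]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: each row is a 4-bit "fingerprint"; label 1 stands in
# for DILI-positive and 0 for DILI-negative. These are NOT data from the study.
X_toy = [[1, 1, 0, 0], [1, 0, 1, 0], [1, 1, 1, 0],
         [0, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1]]
y_toy = [1, 1, 1, 0, 0, 0]

model = train_bernoulli_nb(X_toy, y_toy)
scores = [positive_score(model, x) for x in X_toy]
print("GA:", global_accuracy(y_toy, scores))
print("AUC:", roc_auc(y_toy, scores))
```

Because the toy set is deliberately separable, the metrics here say nothing about real-world performance. The gap between training and test values reported in the abstract is the kind of effect that only appears when a model is scored on held-out compounds.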
