Prediction of Adverse Drug Reaction Linked to Protein Targets Using Network-Based Information and Machine Learning

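Machine learning; Artificial intelligence; Computer science; Support vector machine; Random forest; Betweenness centrality; Drug database; Cluster analysis; Artificial neural network; Classifier (UML); Data mining; Centrality; Drug; Medicine; Combinatorics; Psychiatry; Mathematics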

Authors

Cristiano Galletti, Joaquim Aguirre‐Plans, Baldo Oliva, Narcís Fernández‐Fuentes

Source

Journal: Frontiers in Bioinformatics [Frontiers Media SA]
Date: 2022-07-14 Volume/Issue: 2 Citations: 7

Links

frontiersin.org doaj.org nih.gov doi.org

Identifier

DOI: 10.3389/fbinf.2022.906644

Abstract

Drug discovery attrition rates, particularly at advanced clinical trial stages, are high because of unexpected adverse drug reactions (ADRs) elicited by novel drug candidates. Predicting undesirable ADRs produced by the modulation of certain protein targets would contribute to developing safer drugs, thereby reducing economic losses associated with high attrition rates. As opposed to the more traditional drug-centric approach, we propose a target-centric approach to predict associations between protein targets and ADRs. The implementation of the predictor is based on a machine learning classifier that integrates a set of eight independent network-based features. These include a network diffusion-based score, identification of protein modules based on network clustering algorithms, functional similarity among proteins, network distance to proteins that are part of safety panels used in preclinical drug development, a set of network descriptors in the form of degree and betweenness centrality measurements, and conservation. This diverse set of descriptors was used to generate predictors based on different machine learning classifiers, ranging from specific models for individual ADRs to higher levels of abstraction as per the MedDRA hierarchy, such as system organ class. The results obtained from the different machine learning classifiers, namely, support vector machine, random forest, and neural network, were further analyzed as a meta-predictor exploiting three different voting systems, namely, jury vote, consensus vote, and red flag, obtaining different models for each of the ADRs under analysis. The level of accuracy of the predictors justifies the identification of problematic protein targets both at the level of individual ADRs and at the level of sets of related ADRs grouped in common system organ classes. As an example, the prediction of ventricular tachycardia achieved an accuracy and precision of 0.83 and 0.90, respectively, and a Matthews correlation coefficient of 0.70.
We believe that this approach is a good complement to the existing methodologies devised to foresee potential liabilities in preclinical drug discovery. The method is available through the DocTOR utility at GitHub (https://github.com/cristian931/DocTOR).
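
The abstract names three voting systems and reports a Matthews correlation coefficient but does not define them here. The following is a minimal illustrative sketch, not the DocTOR implementation. It assumes that jury vote means a strict majority of the three classifiers, consensus vote means unanimous agreement, and red flag means at least one positive call. All function names are hypothetical.

```python
from math import sqrt
from typing import Sequence


def jury_vote(votes: Sequence[bool]) -> bool:
    """Assumed rule: positive when a strict majority of classifiers say positive."""
    return sum(votes) > len(votes) / 2


def consensus_vote(votes: Sequence[bool]) -> bool:
    """Assumed rule: positive only when every classifier says positive."""
    return all(votes)


def red_flag(votes: Sequence[bool]) -> bool:
    """Assumed rule: positive as soon as any single classifier says positive."""
    return any(votes)


def matthews_corrcoef(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Standard binary MCC computed from confusion-matrix counts (labels 0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical calls from SVM, random forest, and neural network for one target.
votes = [True, False, True]
decisions = {
    "jury": jury_vote(votes),
    "consensus": consensus_vote(votes),
    "red_flag": red_flag(votes),
}
```

Under these assumed rules, red flag is the most sensitive scheme and consensus vote the most conservative, with jury vote in between.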
