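Computer science
Protein ligand
Affinity
Ligand (biochemistry)
Source code
Set (abstract data type)
Mutual information
Data mining
Coding (set theory)
Machine learning
Artificial intelligence
Chemistry
Mathematics
Programming language
Statistics
Stereochemistry
Receptor
Biochemistry
Organic chemistry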
Authors
Norberto Sánchez‐Cruz, José L. Medina‐Franco, Jordi Mestres, Xavier Barril
Source
Journal: Bioinformatics
[Oxford University Press]
Date: 2020-11-11
Volume (issue): 37 (10): 1376-1382
Citations: 88
Identifier
DOI: 10.1093/bioinformatics/btaa982
Abstract
Motivation: Machine-learning scoring functions (SFs) have been found to outperform standard SFs for binding affinity prediction of protein–ligand complexes. A plethora of reports focus on the implementation of increasingly complex algorithms, while the chemical description of the system has not been fully exploited.
Results: Herein, we introduce Extended Connectivity Interaction Features (ECIF) to describe protein–ligand complexes and build machine-learning SFs with improved predictions of binding affinity. ECIF are a set of protein–ligand atom-type pair counts in which each atom is described by its connectivity, and these atom descriptions in turn define the pair types. ECIF were used to build different machine-learning models to predict protein–ligand affinities (pKd/pKi). The models were evaluated in terms of 'scoring power' on the Comparative Assessment of Scoring Functions 2016 (CASF-2016) benchmark. The best models achieved Pearson correlation coefficients of 0.857 when built on ECIF alone and 0.866 when ECIF were combined with ligand descriptors, demonstrating the descriptive power of ECIF.
Availability and implementation: Data and code to reproduce all the results are freely available at https://github.com/DIFACQUIM/ECIF.
Supplementary information: Supplementary data are available at Bioinformatics online.
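The idea of atom-type pair counts can be illustrated with a minimal Python sketch. This is not the authors' implementation (that is in the repository linked above); it only shows the counting step. The atom-type labels such as `C:deg2`, the coordinates, the function name `pair_counts`, and the 6.0 Å distance cutoff are all assumptions made for illustration and are not taken from the abstract.

```python
from collections import Counter
from math import dist


def pair_counts(protein_atoms, ligand_atoms, cutoff=6.0):
    """Count protein-ligand atom-type pairs within a distance cutoff.

    Each atom is a (type_label, (x, y, z)) tuple. The type label is
    assumed to already encode the atom's connectivity; the cutoff (in
    angstroms) is an illustrative assumption.
    """
    counts = Counter()
    for p_type, p_xyz in protein_atoms:
        for l_type, l_xyz in ligand_atoms:
            if dist(p_xyz, l_xyz) <= cutoff:
                counts[(p_type, l_type)] += 1
    return counts


# Hypothetical toy complex: labels and coordinates are made up.
protein = [
    ("C:deg2", (0.0, 0.0, 0.0)),
    ("C:deg2", (3.0, 0.0, 0.0)),
    ("O:deg1", (20.0, 0.0, 0.0)),  # too far from the ligand to be counted
]
ligand = [
    ("N:deg3", (0.0, 4.0, 0.0)),
    ("C:deg1", (0.0, 0.0, 5.0)),
]

features = pair_counts(protein, ligand)
for pair, n in sorted(features.items()):
    print(pair, n)
```

In a sketch like this, the resulting counts would be laid out as a fixed-length vector over all possible pair types and passed to a regression model that predicts pKd/pKi.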