虚拟筛选
化学空间
计算机科学
生物信息学
杠杆(统计)
人工智能
对接(动物)
计算生物学
机器学习
药物发现
化学
生物信息学
生物
医学
生物化学
基因
护理部
作者
Yang Ying,Kun Yao,Matthew P. Repasky,Karl Leswing,Robert Abel,Brian K. Shoichet,Steven V. Jerome
标识
DOI:10.1021/acs.jctc.1c00810
摘要
With the advent of make-on-demand commercial libraries, the number of purchasable compounds available for virtual screening and assay has grown explosively in recent years, with several libraries eclipsing one billion compounds. Today's screening libraries are larger and more diverse, enabling the discovery of more-potent hit compounds and unlocking new areas of chemical space, represented by new core scaffolds. Applying physics-based in silico screening methods in an exhaustive manner, where every molecule in the library must be enumerated and evaluated independently, is increasingly cost-prohibitive. Here, we introduce a protocol for machine learning-enhanced molecular docking based on active learning to dramatically increase throughput over traditional docking. We leverage a novel selection protocol that strikes a balance between two objectives: (1) identifying the best scoring compounds and (2) exploring a large region of chemical space, demonstrating superior performance compared to a purely greedy approach. Together with automated redocking of the top compounds, this method captures almost all the high scoring scaffolds in the library found by exhaustive docking. This protocol is applied to our recent virtual screening campaigns against the D4 and AMPC targets that produced dozens of highly potent, novel inhibitors, and a blind test against the MT1 target. Our protocol recovers more than 80% of the experimentally confirmed hits with a 14-fold reduction in compute cost, and more than 90% of the hit scaffolds in the top 5% of model predictions, preserving the diversity of the experimentally confirmed hit compounds.
科研通智能强力驱动
Strongly Powered by AbleSci AI