过度拟合
计算机科学
时间轴
协议(科学)
工作流程
机器学习
数学
数据库
人工神经网络
医学
统计
病理
替代医学
作者
César de Oliveira,Karl Leswing,Shulu Feng,R. P. F. Kanters,Robert Abel,Sathesh Bhat
标识
DOI:10.26434/chemrxiv-2023-vv5cq
摘要
Significant improvements have been made in the past decade to methods that rapidly and accurately predict binding affinity through free energy perturbation (FEP) calculations. This has been driven by recent advances in small molecule force fields and sampling algorithms combined with the availability of low-cost parallel computing. Predictive accuracies of ~1 kcal mol-1 have been regularly achieved, which are sufficient to drive potency optimization in modern drug discovery campaigns. Despite the robustness of these FEP approaches across multiple target classes, there are invariably target systems that do not display expected performance with default FEP settings. Traditionally, these systems required labor-intensive manual protocol development to arrive at parameter settings that produce a predictive FEP model. Due to the a) relatively large parameter space to be explored, b) significant compute requirements, and c) limited understanding of how combinations of parameters can affect FEP performance, manual FEP protocol optimization can take weeks to months to complete, and often does not involve rigorous train-test set splits, resulting in potential overfitting. These manual FEP protocol development timelines do not coincide with tight drug discovery project timelines, essentially preventing the use of FEP calculations for these target systems. Here, we describe an automated workflow termed FEP Protocol Builder (FEP-PB) to rapidly generate accurate FEP protocols for systems that do not perform well with default settings. FEP-PB uses active learning to iteratively search the protocol parameter space to develop accurate FEP protocols. To validate this approach, we applied it to pharmaceutically relevant systems where default FEP settings could not produce predictive models. We demonstrate that FEP-PB can rapidly generate accurate FEP protocols for the previously challenging MCL1 system with limited human intervention. We also apply FEP-PB in a real-world drug discovery setting to generate an accurate FEP protocol for the p97 system. FEP-PB is able to generate a more accurate protocol than the expert user, rapidly validating p97 as amenable to free energy calculations. Additionally, through the active learning process, we are able to gain insight into which parameters are most important for a given system. These results suggest that FEP-PB is a robust tool that can aid in rapidly developing accurate FEP protocols and increasing the number of targets that are amenable to the technology.
科研通智能强力驱动
Strongly Powered by AbleSci AI