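Computer science
Robustness (evolution)
Protein Data Bank
Molecular dynamics
Biological system
Algorithm
Matching (statistics)
Protein structure
Chemistry
Computational chemistry
Mathematics
Statistics
Biology
Biochemistry
Gene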
Authors
Michael S. Jones, Smayan Khanna, Andrew L. Ferguson
Identifier
DOI:10.1021/acs.jcim.4c02046
Abstract
Coarse-grained models have become ubiquitous in biomolecular modeling tasks aimed at studying slow dynamical processes such as protein folding and DNA hybridization. These models can considerably accelerate sampling, but it remains challenging to accurately and efficiently restore all-atom detail to the coarse-grained trajectory, which can be vital for detailed understanding of molecular mechanisms and calculation of observables contingent on all-atom coordinates. In this work, we introduce FlowBack as a deep generative model employing a flow-matching objective to map samples from a coarse-grained prior distribution to an all-atom data distribution. We construct our prior distribution to be agnostic to the coarse-grained map and molecular type. A protein-specific model trained on ∼65k structures from the Protein Data Bank achieves state-of-the-art performance on structural metrics compared to previous generative and rules-based approaches in applications to static PDB structures, all-atom simulations of fast-folding proteins, and coarse-grained trajectories generated by a machine-learned force field. A DNA–protein model trained on ∼1.5k DNA–protein complexes achieves excellent reconstruction and generative capabilities on static DNA–protein complexes from the Protein Data Bank as well as on out-of-distribution coarse-grained dynamical simulations of DNA–protein complexation. FlowBack offers an accurate, efficient, and easy-to-use tool to recover all-atom structures from coarse-grained molecular simulations with higher robustness and fewer steric clashes than previous approaches. We make FlowBack freely available to the community as an open source Python package.
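The flow-matching idea named in the abstract can be illustrated with a minimal NumPy sketch. This is not FlowBack's implementation or API. It assumes a generic linear-interpolant (conditional flow-matching) objective, and a hypothetical prior that places each atom at its parent coarse-grained bead plus Gaussian noise. All function names, the `atom_to_bead` index array, and the `sigma` parameter are illustrative assumptions that do not come from the abstract.

```python
import numpy as np


def sample_prior(bead_xyz, atom_to_bead, sigma, rng):
    """Hypothetical prior (an assumption, not taken from the abstract):
    each atom starts at its parent bead position plus Gaussian noise."""
    centers = bead_xyz[atom_to_bead]  # shape (n_atoms, 3)
    return centers + sigma * rng.standard_normal(centers.shape)


def flow_matching_pair(x0, x1, t):
    """Point on the straight path from prior sample x0 to data sample x1,
    together with the constant target velocity x1 - x0 along that path."""
    xt = (1.0 - t) * x0 + t * x1
    return xt, x1 - x0


def flow_matching_loss(predict_velocity, x0, x1, t):
    """Mean squared error between a predicted and the target velocity.
    `predict_velocity(x, t)` stands in for a learned network."""
    xt, target = flow_matching_pair(x0, x1, t)
    return float(np.mean((predict_velocity(xt, t) - target) ** 2))


def euler_integrate(predict_velocity, x0, n_steps=10):
    """Generate a structure by integrating dx/dt = v(x, t) from t=0 to t=1
    with fixed-step forward Euler."""
    x = x0.copy()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * predict_velocity(x, k * dt)
    return x
```

In this sketch, training would minimize `flow_matching_loss` over random times `t` in [0, 1] and over pairs of prior and all-atom samples, and backmapping would call `euler_integrate` on a fresh prior sample drawn around the coarse-grained beads. A velocity model that returned exactly `x1 - x0` would give zero loss and would carry `x0` onto `x1`.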