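Computational biology
Evolvability
Computer science
Protein structure
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)
Sequence (biology)
Antibody
Protein sequencing
Protein design
Coronavirus disease 2019 (COVID-19)
Protein engineering
Function (biology)
Artificial intelligence
Peptide sequence
Biology
Gene
Medicine
Immunology
Genetics
Biochemistry
Disease
Pathology
Infectious disease (medical specialty)
Enzyme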
Authors
Varun R. Shanker, Theodora U. J. Bruun, Brian Hie, Peter S. Kim
Source
Journal: Science
[American Association for the Advancement of Science (AAAS)]
Date: 2024-07-04
Volume (issue): pages: 385 (6704): 46-53
Citations: 7
Identifier
DOI: 10.1126/science.adk8946
Abstract
Large language models trained on sequence information alone can learn high-level principles of protein design. However, beyond sequence, the three-dimensional structures of proteins determine their specific function, activity, and evolvability. Here, we show that a general protein language model augmented with protein structure backbone coordinates can guide evolution for diverse proteins without the need to model individual functional tasks. We also demonstrate that ESM-IF1, which was only trained on single-chain structures, can be extended to engineer protein complexes. Using this approach, we screened about 30 variants of two therapeutic clinical antibodies used to treat severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. We achieved up to 25-fold improvement in neutralization and 37-fold improvement in affinity against antibody-escaped viral variants of concern BQ.1.1 and XBB.1.5, respectively. These findings highlight the advantage of integrating structural information to identify efficient protein evolution trajectories without requiring any task-specific training data.
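The abstract describes using a structure-informed language model to pick a small set of variants for screening. The sketch below illustrates one common way such a ranking can be set up, not the authors' actual pipeline: each single substitution is scored by the log-likelihood ratio between the mutant and wild-type residue, and the top candidates are kept. The function name `rank_substitutions`, the `log_probs` input format, and the toy numbers are all hypothetical; in practice the per-position log-probabilities would come from a model conditioned on backbone coordinates.

```python
from typing import Dict, List, Tuple


def rank_substitutions(
    wild_type: str,
    log_probs: List[Dict[str, float]],
    top_k: int = 30,
) -> List[Tuple[str, float]]:
    """Rank single substitutions by log p(mutant) - log p(wild type).

    `log_probs[i]` is a hypothetical mapping from amino acid to the
    log-probability a model assigns at position i (assumed to be
    conditioned on the protein backbone). This is an illustrative
    scoring scheme, not the published method.
    """
    if len(wild_type) != len(log_probs):
        raise ValueError("wild_type and log_probs must have the same length")

    scored: List[Tuple[str, float]] = []
    for pos, (wt_aa, dist) in enumerate(zip(wild_type, log_probs)):
        wt_lp = dist[wt_aa]
        for aa, lp in dist.items():
            if aa == wt_aa:
                continue
            # Label mutations in the usual 1-indexed form, e.g. "A1V".
            scored.append((f"{wt_aa}{pos + 1}{aa}", lp - wt_lp))

    # Higher ratio means the model prefers the mutant over the wild type.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy two-residue example with made-up log-probabilities.
toy_log_probs = [
    {"A": -1.0, "V": -0.5, "G": -3.0},
    {"G": -0.2, "S": -2.0},
]
ranked = rank_substitutions("AG", toy_log_probs, top_k=2)
for name, score in ranked:
    print(name, round(score, 3))
```

A `top_k` of about 30 mirrors the number of variants the abstract reports screening; the cutoff itself is an assumption here, and no task-specific labels enter the score.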