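Protein design
Computer science
Sequence (biology)
Leverage (statistics)
Markov chain
Protein sequencing
Artificial intelligence
Protein structure prediction
Machine learning
Protein structure
Peptide sequence
Biology
Biochemistry
Genetics
Gene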
Authors
Mingrong Ren,Chungong Yu,Dongbo Bu,Haicang Zhang
Identifier
DOI:10.1101/2023.08.07.552204
Abstract
Protein sequence design, the inverse problem of protein structure prediction, plays a crucial role in protein engineering. Although recent deep learning-based methods have shown promising advances, accurate and robust protein sequence design remains a challenge. Here, we present CarbonDesign, a new approach that draws on the successful ingredients of AlphaFold for protein structure prediction and adds novel developments tailored specifically to protein sequence design. At its core, CarbonDesign combines Inverseformer, a novel network architecture adapted from AlphaFold’s Evoformer that learns representations from backbone structures, with an amortized Markov random field (MRF) model for sequence decoding. We also incorporate other essential AlphaFold concepts into CarbonDesign: an end-to-end network recycling technique that leverages evolutionary constraints in protein language models, and a multi-task learning technique that generates side chain structures corresponding to the designed sequences. Through rigorous evaluations on independent test sets, including the CAMEO and recent CASP15 data sets as well as structures predicted by AlphaFold, we show that CarbonDesign outperforms other published methods, achieving high accuracy in sequence generation. It also exhibits superior performance on de novo backbone structures obtained from recent diffusion generative models such as RFdiffusion and FrameDiff, highlighting its potential for enhancing de novo protein design. Notably, CarbonDesign supports zero-shot prediction of the functional effects of sequence variants, indicating its potential application in directed evolution-based design. In summary, our results illustrate CarbonDesign’s accurate and robust performance in protein sequence design, making it a promising tool for applications in bioengineering.
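To make the idea of decoding a sequence from a Markov random field more concrete, the following is a minimal sketch, not CarbonDesign's actual decoder. It assumes a pairwise MRF whose per-position and per-pair log-potentials are already given; in the paper these would be predicted by the network from the backbone, whereas here the hypothetical helper `toy_potentials` fills them with random numbers. The decoding routine is plain iterated conditional modes (greedy coordinate ascent), swapped in for illustration only, and all function names are assumptions.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
K = len(AMINO_ACIDS)


def mrf_score(seq, site, pair):
    """Unnormalized log-probability of `seq` (a list of residue indices)."""
    total = sum(site[i][a] for i, a in enumerate(seq))
    for (i, j), table in pair.items():
        total += table[seq[i]][seq[j]]
    return total


def local_score(i, a, seq, site, pair):
    """Part of the score that depends on placing residue `a` at position `i`."""
    s = site[i][a]
    for (p, q), table in pair.items():
        if p == i:
            s += table[a][seq[q]]
        elif q == i:
            s += table[seq[p]][a]
    return s


def icm_decode(site, pair, max_sweeps=50):
    """Greedy coordinate ascent: start from the per-site argmax, then
    repeatedly re-pick each residue given its current neighbours."""
    seq = [max(range(K), key=lambda a, i=i: site[i][a]) for i in range(len(site))]
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(seq)):
            best = max(range(K), key=lambda a: local_score(i, a, seq, site, pair))
            # Accept only strict improvements so the total score never drops.
            if local_score(i, best, seq, site, pair) > local_score(i, seq[i], seq, site, pair):
                seq[i] = best
                changed = True
        if not changed:
            break
    return seq


def toy_potentials(n, seed=0):
    """Hypothetical stand-in for network-predicted potentials (random values).
    Only neighbouring positions (i, i+1) are coupled in this toy model."""
    rng = random.Random(seed)
    site = [[rng.gauss(0, 1) for _ in range(K)] for _ in range(n)]
    pair = {
        (i, i + 1): [[rng.gauss(0, 0.5) for _ in range(K)] for _ in range(K)]
        for i in range(n - 1)
    }
    return site, pair


site, pair = toy_potentials(n=8)
decoded = icm_decode(site, pair)
designed = "".join(AMINO_ACIDS[a] for a in decoded)
print(designed, round(mrf_score(decoded, site, pair), 3))
```

Coordinate ascent only reaches a local optimum of the MRF score; sampling-based or exact decoders are alternatives, and which one CarbonDesign uses is not stated in the abstract.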