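Protein engineering
Computer science
Scalability
Bottleneck
Sequence space
Synthetic biology
Software
Mutation
Protein design
Directed evolution
Computational biology
Artificial intelligence
Biology
Protein structure
Gene
Genetics
Programming language
Mathematics
Database
Mutant
Biochemistry
Pure mathematics
Embedded system
Banach space
Enzyme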
Authors
Jason Yang, Julie Ducharme, Kadina E. Johnston, Francesca-Zhoufan Li, Yisong Yue, Frances H. Arnold
Identifier
DOI:10.1101/2023.05.11.540424
Abstract
With advances in machine learning (ML)-assisted protein engineering, models based on data, biophysics, and natural evolution are being used to propose informed libraries of protein variants to explore. Synthesizing these libraries for experimental screens is a major bottleneck, as the cost of obtaining large numbers of exact gene sequences is often prohibitive. Degenerate codon (DC) libraries are a cost-effective alternative for generating combinatorial mutagenesis libraries where mutations are targeted to a handful of amino acid sites. However, existing computational methods to optimize DC libraries to include desired protein variants are not well suited to design libraries for ML-assisted protein engineering. To address these drawbacks, we present DEgenerate Codon Optimization for Informed Libraries (DeCOIL), a generalized method which directly optimizes DC libraries to be useful for protein engineering: to sample protein variants that are likely to have both high fitness and high diversity in the sequence search space. Using computational simulations and wet-lab experiments, we demonstrate that DeCOIL is effective across two specific case studies, with potential to be applied to many other use cases. DeCOIL offers several advantages over existing methods, as it is direct, easy-to-use, generalizable, and scalable. With accompanying software (https://github.com/jsunn-y/DeCOIL), DeCOIL can be readily implemented to generate desired informed libraries.
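To make the notion of a degenerate codon concrete, the following minimal Python sketch expands a degenerate codon into the concrete codons and amino acids it encodes, using the standard IUPAC nucleotide ambiguity codes and the standard genetic code. It is an illustration only and is not taken from the DeCOIL software; the function names `expand_codon`, `amino_acid_counts`, and `dna_library_size`, and the choice of NNK as the example, are assumptions made here.

```python
from collections import Counter
from itertools import product
from math import prod

# IUPAC nucleotide ambiguity codes: each symbol stands for a set of bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def expand_codon(degenerate_codon):
    """Return every concrete codon covered by a three-letter degenerate codon."""
    choices = [IUPAC[base] for base in degenerate_codon.upper()]
    return ["".join(codon) for codon in product(*choices)]


def amino_acid_counts(degenerate_codon):
    """Count how many concrete codons encode each amino acid ('*' is a stop)."""
    return Counter(CODON_TABLE[c] for c in expand_codon(degenerate_codon))


def dna_library_size(degenerate_codons):
    """Number of distinct DNA sequences in a library with one DC per site."""
    return prod(len(expand_codon(dc)) for dc in degenerate_codons)


if __name__ == "__main__":
    counts = amino_acid_counts("NNK")
    print(len(expand_codon("NNK")), "codons")
    print(dict(sorted(counts.items())))
    print(dna_library_size(["NNK"] * 4), "DNA sequences across four NNK sites")
```

Because several codons map to the same amino acid, a degenerate codon samples amino acids unevenly and may include stop codons. This is part of why the choice of degenerate codon at each site affects which protein variants a library actually covers.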