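Directed evolution
Directed molecular evolution
Sequence space
Workflow
Computer science
Bioinformatics
Sequence (biology)
Artificial intelligence
Computational biology
Machine learning
Protein engineering
Protein sequencing
Chemical space
Biology
Peptide sequence
Enzyme
Genetics
Mathematics
Drug discovery
Biochemistry
Gene
Banach space
Mutant
Database
Pure mathematics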
Authors
Zachary Wu, S. B. Jennifer Kan, Russell D. Lewis, Bruce J. Wittmann, Frances H. Arnold
Identifier
DOI:10.1073/pnas.1901979116
Abstract
To reduce experimental effort associated with directed protein evolution and to explore the sequence space encoded by mutating multiple positions simultaneously, we incorporate machine learning into the directed evolution workflow. Combinatorial sequence space can be quite expensive to sample experimentally, but machine-learning models trained on tested variants provide a fast method for testing sequence space computationally. We validate this approach on a large published empirical fitness landscape for human GB1 binding protein, demonstrating that machine learning-guided directed evolution finds variants with higher fitness than those found by other directed evolution approaches. We then provide an example application in evolving an enzyme to produce each of the two possible product enantiomers (i.e., stereodivergence) of a new-to-nature carbene Si–H insertion reaction. The approach predicted libraries enriched in functional enzymes and fixed seven mutations in two rounds of evolution to identify variants for selective catalysis with 93% and 79% ee (enantiomeric excess). By greatly increasing throughput with in silico modeling, machine learning enhances the quality and diversity of sequence solutions for a protein engineering problem.
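The loop the abstract describes (test a sample of a combinatorial library, train a model on those measurements, then rank the untested variants computationally) can be illustrated with a toy sketch. This is not the authors' implementation: the 4-letter alphabet, the three mutated positions, the `measure_fitness` stand-in for a wet-lab assay, and the plain ridge regression fitted by gradient descent are all assumptions chosen to keep the example small and dependency-free.

```python
import itertools
import random

ALPHABET = "ACDE"  # assumption: toy 4-letter alphabet (real proteins use 20)
N_POSITIONS = 3    # assumption: three positions mutated simultaneously

# Hypothetical per-residue effects used only to simulate measurements.
EFFECT = {"A": 0.0, "C": 0.5, "D": 1.0, "E": 0.2}


def measure_fitness(variant):
    """Stand-in for an experimental assay (hypothetical toy landscape)."""
    score = 1.0 + sum(EFFECT[r] * (i + 1) for i, r in enumerate(variant))
    if variant[0] == "D" and variant[1] == "C":  # one epistatic interaction
        score += 1.0
    return score


def one_hot(variant):
    """Encode a variant such as 'ACD' as a flat 0/1 feature vector."""
    vec = []
    for residue in variant:
        vec.extend(1.0 if residue == letter else 0.0 for letter in ALPHABET)
    return vec


def predict(weights, bias, variant):
    return bias + sum(w * x for w, x in zip(weights, one_hot(variant)))


def mean_squared_error(weights, bias, data):
    return sum((predict(weights, bias, v) - y) ** 2 for v, y in data) / len(data)


def fit_ridge(data, l2=1e-3, lr=0.1, steps=500):
    """Fit a linear model by gradient descent on 0.5*MSE + 0.5*l2*|w|^2."""
    n_features = len(ALPHABET) * N_POSITIONS
    weights, bias = [0.0] * n_features, 0.0
    encoded = [(one_hot(v), y) for v, y in data]
    n = len(encoded)
    for _ in range(steps):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in encoded:
            err = bias + sum(w * xi for w, xi in zip(weights, x)) - y
            grad_b += err
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
        weights = [w - lr * (g / n + l2 * w) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def propose_library(tested, size):
    """Rank all untested variants by predicted fitness; return the top ones."""
    weights, bias = fit_ridge(tested)
    seen = {v for v, _ in tested}
    untested = [
        "".join(p)
        for p in itertools.product(ALPHABET, repeat=N_POSITIONS)
        if "".join(p) not in seen
    ]
    untested.sort(key=lambda v: predict(weights, bias, v), reverse=True)
    return untested[:size]


# Enumerate the full combinatorial space, "test" a random sample of it,
# then let the model choose which untested variants to screen next.
rng = random.Random(0)
space = ["".join(p) for p in itertools.product(ALPHABET, repeat=N_POSITIONS)]
tested = [(v, measure_fitness(v)) for v in rng.sample(space, 16)]
library = propose_library(tested, size=8)
print(library)
```

A linear model on one-hot features cannot capture the epistatic term in the toy landscape, which is one reason a real workflow would likely compare several model classes and keep those that predict held-out variants best. The sketch also covers a single round only; it does not fix mutations and re-sample across rounds as the abstract reports.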