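Protein engineering
Directed evolution
Directed molecular evolution
Computer science
Fitness landscape
Artificial intelligence
Diversity (politics)
Sequence (biology)
Fitness function
Computational biology
Machine learning
Biology
Enzyme
Genetic algorithm
Genetics
Mutant
Biochemistry
Population
Demography
Sociology
Anthropology
Gene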
Authors
Kerr Ding, M. A. Chin, Yunlong Zhao, Wei Huang, Binh Khanh, Huanan Wang, Peng Liu, Yang Yang, Yunan Luo
Identifier
DOI:10.1038/s41467-024-50698-y
Abstract
The effective design of combinatorial libraries to balance fitness and diversity facilitates the engineering of useful enzyme functions, particularly those that are poorly characterized or unknown in biology. We introduce MODIFY, a machine learning (ML) algorithm that learns from natural protein sequences to infer evolutionarily plausible mutations and predict enzyme fitness. MODIFY co-optimizes predicted fitness and sequence diversity of starting libraries, prioritizing high-fitness variants while ensuring broad sequence coverage. In silico evaluation shows that MODIFY outperforms state-of-the-art unsupervised methods in zero-shot fitness prediction and enables ML-guided directed evolution with enhanced efficiency. Using MODIFY, we engineer generalist biocatalysts derived from a thermostable cytochrome c to achieve enantioselective C-B and C-Si bond formation via a new-to-nature carbene transfer mechanism, leading to biocatalysts six mutations away from previously developed enzymes while exhibiting superior or comparable activities. These results demonstrate MODIFY’s potential in solving challenging enzyme engineering problems beyond the reach of classic directed evolution.
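The fitness-versus-diversity trade-off described in the abstract can be illustrated with a small sketch. This is not the MODIFY implementation: it assumes a hypothetical additive model in which each residue at each library site has its own predicted score, and it measures diversity as the Shannon entropy of the per-site residue distribution. Under those assumptions, the distribution that maximizes expected score plus `lam` times entropy is a softmax of the scores with temperature `lam`. The names `site_scores`, `lam`, and all numeric values are made up for illustration.

```python
import math

def library_distribution(site_scores, lam):
    """For each site, return the residue distribution q that maximizes
    E_q[score] + lam * entropy(q); the maximizer is softmax(score / lam)."""
    dists = []
    for scores in site_scores:
        top = max(scores.values())  # subtract the max for numerical stability
        weights = {aa: math.exp((s - top) / lam) for aa, s in scores.items()}
        total = sum(weights.values())
        dists.append({aa: w / total for aa, w in weights.items()})
    return dists

def expected_fitness(site_scores, dists):
    """Expected additive score of a variant drawn from the library."""
    return sum(q[aa] * scores[aa]
               for scores, q in zip(site_scores, dists)
               for aa in scores)

def entropy(dists):
    """Total Shannon entropy (in nats) summed over all sites."""
    return -sum(p * math.log(p) for q in dists for p in q.values() if p > 0)

# Hypothetical per-site predicted scores (assumption, not data from the paper).
site_scores = [
    {"A": 1.0, "G": 0.5, "V": -0.2},
    {"L": 0.3, "M": 0.9, "F": 0.0},
]

# Small lam favors the top-scoring residues; large lam favors broad coverage.
greedy = library_distribution(site_scores, lam=0.05)
diverse = library_distribution(site_scores, lam=5.0)

for name, dists in [("greedy", greedy), ("diverse", diverse)]:
    print(name,
          "expected score:", round(expected_fitness(site_scores, dists), 3),
          "entropy:", round(entropy(dists), 3))
```

Sweeping `lam` between these two extremes traces out a set of libraries that trade expected score against sequence coverage, which is the kind of balance the abstract refers to.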