Conditional language models enable the efficient design of proficient enzymes

Computer Science
Authors
Mahmoud E. S. Soliman, R. Illanes, Silvia Fruncillo, Ioanna T Nakou, Sebastian Lindner, Gavin Ayres, Lesley S. Sheehan, Steven J. Moss, Ulrich Eckhard, Philipp Lorenz, Noelia Ferruz
Identifier
DOI:10.1101/2024.05.03.592223
Abstract

The design of functional enzymes holds promise for transformative solutions across various domains but presents significant challenges. Inspired by the success of language models in generating nature-like proteins, we explored the potential of an enzyme-specific language model in designing catalytically active artificial enzymes. Here, we introduce ZymCTRL ('enzyme control'), a conditional language model trained on the enzyme sequence space, capable of generating enzymes based on user-defined specifications. Experimental validation across diverse data regimes and for different enzyme families demonstrated ZymCTRL's ability to generate active enzymes across various sequence identity ranges. Specifically, we describe the zero-shot design of carbonic anhydrases and lactate dehydrogenases, without further training of the model, with designs showing activity at sequence identities below 40% to natural proteins. Biophysical analysis confirmed that the generated sequences are globular and well folded. Furthermore, fine-tuning the model enabled the generation of lactate dehydrogenases more likely to pass in silico filters and with activity comparable to their natural counterparts. Two of the artificial lactate dehydrogenases were scaled up and successfully lyophilised, maintaining activity and demonstrating preliminary conversion in one-pot enzymatic cascades under extreme conditions. Our findings open a new door towards the rapid and cost-effective design of proficient artificial enzymes. The model and training data are freely available to the community.