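Protein engineering
Computer science
Protein design
Rational design
Theory (learning stability)
Function (biology)
Directed evolution
Artificial intelligence
Protein function
Informatics
Deep learning
Sequence space
Protein function prediction
Machine learning
Protein sequencing
Task (project management)
Computational biology
Representation (politics)
Sequence (biology)
Biology
Protein structure
Peptide sequence
Mutant
Mathematics
Engineering
Genetics
Systems engineering
Biochemistry
Gene
Enzyme
Law
Pure mathematics
Political science
Banach space
Electrical engineering
Politics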
Authors
Ethan C. Alley, Grigory Khimulya, Surojit Biswas, Mohammed AlQuraishi, George M. Church
Source
Journal: Nature Methods
[Springer Nature]
Date: 2019-10-21
Volume (issue), pages: 16 (12): 1315-1322
Citations: 828
Identifier
DOI: 10.1038/s41592-019-0598-1
Abstract
Rational protein engineering requires a holistic understanding of protein function. Here, we apply deep learning to unlabeled amino-acid sequences to distill the fundamental features of a protein into a statistical representation that is semantically rich and structurally, evolutionarily and biophysically grounded. We show that the simplest models built on top of this unified representation (UniRep) are broadly applicable and generalize to unseen regions of sequence space. Our data-driven approach predicts the stability of natural and de novo designed proteins, and the quantitative function of molecularly diverse mutants, competitively with the state-of-the-art methods. UniRep further enables two orders of magnitude efficiency improvement in a protein engineering task. UniRep is a versatile summary of fundamental protein features that can be applied across protein engineering informatics. UniRep learns fundamental protein features from millions of amino-acid sequences using a recurrent neural network. This summary of features can then be used for protein engineering.
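The abstract describes a two-step pattern: a recurrent neural network turns an amino-acid sequence into a fixed-length representation, and a simple model is then fitted on top of that representation. The sketch below illustrates only that general pattern and is not the authors' implementation. The recurrent weights are random and untrained, and the function names (`make_rnn`, `embed`, `fit_linear`, `predict`), the hidden size, the averaging of hidden states, the example sequences and the stability-like labels are all assumptions made up for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def make_rnn(hidden_size, seed=0):
    """Return random (untrained) weights for a plain tanh RNN.

    Assumption: a stand-in for a network trained on unlabeled sequences.
    """
    rng = random.Random(seed)
    w_in = {aa: [rng.gauss(0, 0.5) for _ in range(hidden_size)]
            for aa in AMINO_ACIDS}
    scale = 1 / math.sqrt(hidden_size)
    w_rec = [[rng.gauss(0, scale) for _ in range(hidden_size)]
             for _ in range(hidden_size)]
    return w_in, w_rec


def embed(sequence, rnn):
    """Run the RNN over the sequence and average the hidden states.

    The result has the same length for any sequence length.
    Residues outside AMINO_ACIDS raise KeyError.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    w_in, w_rec = rnn
    n = len(w_rec)
    h = [0.0] * n
    total = [0.0] * n
    for aa in sequence:
        x = w_in[aa]
        h = [math.tanh(x[i] + sum(w_rec[i][j] * h[j] for j in range(n)))
             for i in range(n)]
        total = [t + v for t, v in zip(total, h)]
    return [t / len(sequence) for t in total]


def fit_linear(features, targets, l2=0.1, lr=0.05, epochs=500):
    """Fit an L2-regularised linear model by batch gradient descent."""
    n = len(features[0])
    m = len(features)
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):
        grad_w = [l2 * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(features, targets):
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            for i in range(n):
                grad_w[i] += err * x[i] / m
            grad_b += err / m
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(model, x):
    """Apply the fitted linear model to one representation vector."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b


if __name__ == "__main__":
    rnn = make_rnn(hidden_size=8)
    # Made-up sequences and labels, for illustration only.
    sequences = ["MKTAYIAKQR", "GSSGSSG", "MKVLAAGIVG"]
    labels = [1.0, 0.2, 0.7]
    features = [embed(s, rnn) for s in sequences]
    model = fit_linear(features, labels)
    print([round(predict(model, f), 3) for f in features])
```

With trained rather than random weights, the same interface would let one representation feed several downstream predictors, which is the sense in which the abstract calls the representation "unified".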