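Perplexity
Autoregressive model
Computer science
Sequence (biology)
Protein design
Generalization
Algorithm
Artificial intelligence
Deep learning
Machine learning
Protein structure
Mathematics
Language model
Econometrics
Physics
Mathematical analysis
Biology
Genetics
Nuclear magnetic resonance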
Source
Journal: Protein Engineering Design & Selection
[Oxford University Press]
Date: 2023-12-29
Volume/Issue: 37
Identifier
DOI:10.1093/protein/gzad024
Abstract
Deep learning methods for protein sequence design focus on modeling and sampling the many-dimensional distribution of amino acid sequences conditioned on the backbone structure. To produce physically foldable sequences, inter-residue couplings need to be considered properly. These couplings are treated explicitly in iterative methods or autoregressive methods. Non-autoregressive models treating these couplings implicitly are computationally more efficient, but still await tests by wet experiment. Currently, sequence design methods are evaluated mainly using native sequence recovery rate and native sequence perplexity. These metrics can be complemented by sequence-structure compatibility metrics obtained from energy calculation or structure prediction. However, existing computational metrics have important limitations that may render the generalization of computational test results to performance in real applications unwarranted. Validation of design methods by wet experiments should be encouraged.
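The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions: recovery rate as the fraction of positions where the designed residue matches the native one, and perplexity as the exponential of the mean negative log-probability a model assigns to the native residues. The function names and the per-position probability format (one dict per residue position) are hypothetical choices for illustration, and the paper itself may define or average these quantities differently.

```python
import math


def recovery_rate(native: str, designed: str) -> float:
    """Fraction of positions where the designed residue equals the native one."""
    if not native or len(native) != len(designed):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


def perplexity(native: str, probs: list[dict[str, float]]) -> float:
    """exp(mean negative log-probability) of the native residues.

    probs[i] is an assumed (hypothetical) format: a dict mapping each amino
    acid letter to the probability the model predicts at position i.
    """
    if not native or len(native) != len(probs):
        raise ValueError("need one probability dict per native residue")
    nll = -sum(math.log(p[aa]) for aa, p in zip(native, probs))
    return math.exp(nll / len(native))
```

Under these definitions, a model that spreads its probability uniformly over the 20 standard amino acids has a perplexity of 20, so lower values indicate that the model concentrates probability on the native residues.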