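Codon usage bias
Computer science
Translation (biology)
Computational biology
Start codon
Translation efficiency
Messenger RNA
Bioinformatics
Amino acid
Machine translation
Biology
Artificial intelligence
Genetics
Genome
Gene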
Authors
Zilin Ren, Lili Jiang, Yaxin Di, Dingxi Zhang, Jianli Gong, Jianting Gong, Qiwei Jiang, Zhiguo Fu, Pingping Sun, Boxiong Yang, Ming Ni
Source
Journal: Bioinformatics
[Oxford University Press]
Date: 2024-05-24
Volume (issue): 40 (7)
Citations: 1
Identifier
DOI: 10.1093/bioinformatics/btae330
Abstract
Motivation: Due to the varying delivery methods of mRNA vaccines, codon optimization plays a critical role in vaccine design to improve the stability and expression of proteins in specific tissues. Considering the many-to-one relationship between synonymous codons and amino acids, the number of mRNA sequences encoding the same amino acid sequence could be enormous. Finding stable and highly expressed mRNA sequences in this vast sequence space using in silico methods can generally be viewed as a path-search problem or a machine translation problem. However, current deep learning-based methods inspired by machine translation may have limitations; recurrent neural networks, for example, have a weak ability to capture the long-term dependencies of codon preferences.
Results: We develop a BERT-based architecture that uses the cross-attention mechanism for codon optimization. In CodonBERT, the codon sequence is randomly masked, with each codon serving as a key and a value, while the amino acid sequence is used as the query. CodonBERT was trained on high-expression transcripts from the Human Protein Atlas mixed with different proportions of codon sequences with a high codon adaptation index. The results showed that CodonBERT can effectively capture the long-term dependencies between codons and amino acids, suggesting that it can be used as a customized training framework for specific optimization targets.
Availability and implementation: CodonBERT is freely available at https://github.com/FPPGroup/CodonBERT.
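To make the Motivation's point about the size of the sequence space concrete, the following is a minimal Python sketch that counts how many distinct codon sequences encode a given amino acid sequence. It assumes the standard genetic code and ignores stop codons; the names `DEGENERACY` and `count_synonymous_mrnas` are illustrative and are not taken from the paper or from the CodonBERT repository.

```python
import math

# Number of synonymous codons per amino acid in the standard genetic code
# (one-letter codes; 61 sense codons in total, stop codons excluded).
DEGENERACY = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2,
    "K": 2, "D": 2, "E": 2, "C": 2,
    "M": 1, "W": 1,
}


def count_synonymous_mrnas(protein: str) -> int:
    """Return the number of distinct codon sequences that encode `protein`.

    Each residue can be written with any of its synonymous codons, so the
    total is the product of the per-residue degeneracies.
    """
    return math.prod(DEGENERACY[aa] for aa in protein.upper())


if __name__ == "__main__":
    # Hypothetical 5-residue peptide: M(1) * K(2) * L(6) * S(6) * R(6)
    print(count_synonymous_mrnas("MKLSR"))  # 432
```

Because the count is a product over residues, it grows exponentially with sequence length, which is why exhaustive enumeration is impractical for full-length proteins and the task is framed as a search or translation problem.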