非编码RNA
计算生物学
计算机科学
瓶颈
核糖核酸
编码(社会科学)
源代码
生物
基因
遗传学
统计
数学
嵌入式系统
操作系统
作者
Lei Deng,Jiang Ying,Xiaowen Hu,Rongtao Zheng,Zhijian Huang,Jingpu Zhang
标识
DOI:10.1021/acs.jcim.3c00366
摘要
With the continuous development of ribosome profiling, sequencing technology, and proteomics, evidence is mounting that noncoding RNA (ncRNA) may be a novel source of peptides or proteins. These peptides and proteins play crucial roles in inhibiting tumor progression and interfering with cancer metabolism and other essential physiological processes. Therefore, identifying ncRNAs with coding potential is vital to ncRNA functional research. However, existing studies perform well in classifying ncRNAs and mRNAs, and no research has been explicitly raised to distinguish whether ncRNA transcripts have coding potential. For this reason, we propose an attention mechanism-based bidirectional LSTM network called ABLNCPP to assess the coding possibility of ncRNA sequences. Considering the sequential information loss in previous methods, we introduce a novel nonoverlapping trinucleotide embedding (NOLTE) method for ncRNAs to obtain embeddings containing sequential features. The extensive evaluations show that ABLNCPP outperforms other state-of-the-art models. In general, ABLNCPP overcomes the bottleneck of ncRNA coding potential prediction and is expected to provide valuable contributions to cancer discovery and treatment in the future. The source code and data sets are freely available at https://github.com/YinggggJ/ABLNCPP.
科研通智能强力驱动
Strongly Powered by AbleSci AI