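Encoder
Artificial intelligence
Computer science
Machine learning
Oversampling
Generative grammar
Representation (politics)
Deep learning
Residual
Transformer
Set (abstract data type)
Test set
Feature learning
Pattern recognition (psychology)
Biology
Algorithm
Engineering
Computer network
Bandwidth (computing)
Programming language
Operating system
Electrical engineering
Voltage
Politics
Political science
Law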
Authors
Zutan Li,Bingbing Jin,Jingya Fang
Source
Journal: Genomics
[Elsevier]
Date: 2024-01-01
Volume (issue): 116 (1): 110749-110749
Identifier
DOI:10.1016/j.ygeno.2023.110749
Abstract
N4-acetylcytidine (ac4C) is a highly conserved RNA modification that plays a crucial role in various biological processes. Accurately identifying ac4C sites is of paramount importance for gaining a deeper understanding of their regulatory mechanisms. Nevertheless, the existing experimental techniques for ac4C site identification are characterized by limitations in terms of cost-effectiveness, while the performance of current computational methods in accurately identifying ac4C sites requires further enhancement. In this paper, we present MetaAc4C, an advanced deep learning model that leverages pre-trained bidirectional encoder representations from transformers (BERT). The model is based on a bi-directional long short-term memory network (BLSTM) architecture, incorporating attention mechanism and residual connection. To address the issue of data imbalance, we adapt generative adversarial networks to generate synthetic feature samples. On the independent test set, MetaAc4C surpasses the current state-of-the-art ac4C prediction model, exhibiting improvements in terms of ACC, MCC, and AUROC by 2.36%, 4.76%, and 3.11%, respectively, on the unbalanced dataset. When evaluated on the balanced dataset, MetaAc4C achieves improvements in ACC, MCC, and AUROC by 2.6%, 5.11%, and 1.01%, respectively. Notably, our approach of utilizing WGAN-GP augmented training RNA samples demonstrates even superior performance compared to the SMOTE oversampling method.
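The abstract reports both ACC and MCC on an unbalanced dataset. As a minimal sketch of why the two can diverge, the snippet below computes both from confusion-matrix counts using their standard definitions. The function names and the example counts are hypothetical illustrations and are not taken from the paper or from MetaAc4C's code.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for an unbalanced test set (50 positives, 950 negatives).
tp, tn, fp, fn = 40, 900, 50, 10
print(f"ACC = {accuracy(tp, tn, fp, fn):.3f}")
print(f"MCC = {mcc(tp, tn, fp, fn):.3f}")
```

With counts like these, accuracy is dominated by the large negative class, while MCC also penalizes the false positives, which is one reason both metrics are commonly reported together for imbalanced site-prediction tasks.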