DGA-5mC: A 5-methylcytosine site prediction model based on an improved DenseNet and bidirectional GRU method

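Computer science  Deep learning  Artificial intelligence  Machine learning  Coding (social sciences)  Identification (biology)  Algorithm  Mathematics  Biology  Statistics  Botany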
Authors
Jianhua Jia, Lulu Qin, Rufeng Lei
Source
Journal: Mathematical Biosciences and Engineering [American Institute of Mathematical Sciences]
Volume (issue): 20 (6): 9759-9780  Cited by: 3
Identifier
DOI:10.3934/mbe.2023428
Abstract

5-Methylcytosine (5mC) in promoter regions plays a significant role in biological processes and diseases. Researchers commonly detect 5mC modification sites with high-throughput sequencing technologies or traditional machine learning algorithms. However, high-throughput identification is laborious, time-consuming and expensive, and the existing machine learning algorithms are comparatively limited. A more efficient computational approach is therefore urgently needed. Because deep learning algorithms offer powerful computational advantages, we constructed a novel prediction model, DGA-5mC, that identifies 5mC modification sites in promoter regions by using an improved densely connected convolutional network (DenseNet) together with a bidirectional GRU. We also added a self-attention module to evaluate the importance of various 5mC features. The DGA-5mC model automatically handles heavily imbalanced proportions of positive and negative samples, which highlights the model's reliability and superiority. To the authors' knowledge, this is the first time that an improved DenseNet has been combined with a bidirectional GRU to predict 5mC modification sites in promoter regions. Using a combination of one-hot coding, nucleotide chemical property coding and nucleotide density coding, the DGA-5mC model achieved a sensitivity of 90.19%, a specificity of 92.74%, an accuracy of 92.54%, a Matthews correlation coefficient (MCC) of 64.64%, an area under the curve of 96.43% and a Gmean of 91.46% on the independent test dataset. All datasets and source code for the DGA-5mC model are freely accessible at https://github.com/lulukoss/DGA-5mC.
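The three encodings named in the abstract can be illustrated with a short sketch. It uses the definitions commonly found in the sequence-modification prediction literature and is not taken from the authors' repository. The function name `encode_sequence`, the 8-column layout (4 one-hot + 3 chemical property + 1 density) and the all-zero handling of unknown bases are assumptions made for illustration.

```python
# Hypothetical sketch: per-position features for a DNA sequence.
# Not the authors' implementation; layout and names are assumptions.

# One-hot coding: one indicator per nucleotide.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}

# Nucleotide chemical property (NCP) coding, in the commonly used form:
# (ring structure: purine=1, functional group: amino=1, hydrogen bond: weak=1).
NCP = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "T": (0, 0, 1),
}


def encode_sequence(seq):
    """Return one 8-value feature row per position of `seq`."""
    counts = {}
    rows = []
    for position, base in enumerate(seq.upper(), start=1):
        # Nucleotide density: frequency of this base in the prefix
        # ending at the current position.
        counts[base] = counts.get(base, 0) + 1
        density = counts[base] / position
        # Assumption: bases outside A/C/G/T (e.g. N) get all-zero codes.
        one_hot = ONE_HOT.get(base, (0, 0, 0, 0))
        ncp = NCP.get(base, (0, 0, 0))
        rows.append([*one_hot, *ncp, density])
    return rows


if __name__ == "__main__":
    for row in encode_sequence("ACCA"):
        print(row)
```

Each row could then be stacked into a length-by-8 matrix and passed to a convolutional or recurrent network. The window length and any padding scheme depend on the dataset and are not specified here.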