DNA-MP: a generalized DNA modifications predictor for multiple species based on powerful sequence encoding method

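Encoder, Computer science, Benchmark (surveying), Encoding (memory), Classifier (UML), DNA, DNA sequencing, Artificial intelligence, Pattern recognition (psychology), Computational biology, Machine learning, Biology, Genetics, Geodesy, Operating system, Geography
Authors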
Muhammad Asim,Muhammad Ali Ibrahim,Ahtisham Fazeel,Andreas Dengel,Sheraz Ahmed
Source
Journal: Briefings in Bioinformatics [Oxford University Press]
Volume (Issue): 24 (1)  Citations: 4
Identifier
DOI:10.1093/bib/bbac546
Abstract

Accurate prediction of deoxyribonucleic acid (DNA) modifications is essential for exploring and understanding cell differentiation, gene expression and epigenetic regulation. Several computational approaches have been proposed for type-specific DNA modification prediction, and two recent generalized predictors can detect three different types of DNA modifications. However, both type-specific and generalized predictors show limited performance across multiple species, mainly because they rely on ineffective sequence encoding methods. This paper presents DNA-MP, a generalized computational approach that more precisely predicts three different DNA modifications across multiple species. DNA-MP uses a powerful encoding method, "position specific nucleotides occurrence based on modification and non-modification class densities normalized difference" (POCD-ND), to generate statistical representations of DNA sequences, and a deep forest classifier for modification prediction. The POCD-ND encoder generates these representations by extracting position-specific distributional information about the nucleotides in the DNA sequences. We perform a comprehensive intrinsic and extrinsic evaluation of the proposed encoder and compare its performance with the 32 most widely used encoding methods on 17 benchmark DNA modification prediction datasets from 12 different species using 10 different machine learning classifiers. Across all classifiers, the proposed POCD-ND encoder outperforms the 32 existing encoders. Furthermore, taken together over 5-fold cross-validation on the benchmark datasets and the independent test sets, the proposed DNA-MP predictor outperforms state-of-the-art type-specific and generalized modification predictors by an average accuracy of 7% across the 4mC datasets, 1.35% across the 5hmC datasets and 10% across the 6mA datasets.
To support the scientific community, the DNA-MP web application is available at https://sds_genetic_analysis.opendfki.de/DNA_Modifications/.
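
The abstract names the ingredients of POCD-ND (position-specific nucleotide occurrence, class densities, a normalized difference) but does not give the formula. The following is only an illustrative sketch of one plausible reading, not the authors' implementation. It assumes fixed-length sequences over A/C/G/T, takes the "density" of a nucleotide at a position to be its relative frequency within a class, and assumes the normalized difference is (d_mod - d_non) / (d_mod + d_non). The function names `position_densities`, `fit_pocd_nd` and `encode` are hypothetical.

```python
from collections import Counter

ALPHABET = "ACGT"


def position_densities(seqs, length):
    """Relative frequency of each nucleotide at each position of one class."""
    total = len(seqs)
    densities = []
    for pos in range(length):
        counts = Counter(s[pos] for s in seqs)
        densities.append({nt: counts.get(nt, 0) / total for nt in ALPHABET})
    return densities


def fit_pocd_nd(mod_seqs, non_seqs):
    """Build a per-position lookup table of normalized density differences.

    Assumed formula (not taken from the paper):
        score = (d_mod - d_non) / (d_mod + d_non), or 0.0 when both are zero.
    """
    length = len(mod_seqs[0])
    d_mod = position_densities(mod_seqs, length)
    d_non = position_densities(non_seqs, length)
    table = []
    for p, n in zip(d_mod, d_non):
        row = {}
        for nt in ALPHABET:
            denom = p[nt] + n[nt]
            row[nt] = (p[nt] - n[nt]) / denom if denom else 0.0
        table.append(row)
    return table


def encode(seq, table):
    """Map a sequence to one score per position; unknown symbols map to 0.0."""
    return [table[i].get(nt, 0.0) for i, nt in enumerate(seq)]


if __name__ == "__main__":
    modified = ["ACGT", "ACGA"]      # toy sequences, not real data
    unmodified = ["TCGT", "TGGA"]
    table = fit_pocd_nd(modified, unmodified)
    print(encode("ACGT", table))
```

Under this reading, each score lies in [-1, 1]: positive values mark nucleotides that are more frequent at that position in the modification class, negative values the opposite. In any cross-validation setup the table would need to be fitted on the training folds only, since it uses class labels.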