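Random forest
Classifier (UML)
Artificial intelligence
False positive paradox
Machine learning
Computer science
Bioinformatics
Algorithm
Biology
Biochemistry
Gene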
Authors
Yu Huang, Ningning He, Yu Chen, Zhen Chen, Lei Li
Abstract
N6-methyladenosine (m6A) is a prevalent RNA methylation modification involved in several biological processes. Hundreds or thousands of m6A sites identified from different species using high-throughput experiments provide a rich resource for constructing in silico approaches to identify m6A sites. The existing m6A predictors were developed using conventional machine-learning (ML) algorithms, and most are species-centric. In this paper, we develop a novel cross-species deep-learning classifier based on a bidirectional Gated Recurrent Unit (BGRU) for the prediction of m6A sites. In comparison with conventional ML approaches, BGRU achieves outstanding performance on the Mammalia dataset, which contains over fifty thousand m6A sites, but performs worse on the Saccharomyces cerevisiae dataset, which covers around a thousand positives. The accuracy of BGRU is sensitive to data size, and this sensitivity is compensated for by integrating a random forest classifier with a novel encoding of enhanced nucleic acid content. The integrated approach, dubbed the BGRU-based Ensemble RNA Methylation site Predictor (BERMP), has competitive performance in both the cross-validation test and the independent test. BERMP also outperforms existing m6A predictors for different species. Therefore, BERMP is a novel multi-species tool for identifying m6A sites with high confidence. This classifier is freely available at http://www.bioinfogo.org/bermp.
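The abstract does not spell out how the nucleic acid content encoding is computed. As a rough illustration of the kind of fixed-length numeric input a random forest could consume, the sketch below computes nucleotide frequencies inside a window that slides along an RNA sequence. It is an assumption about what such an encoding might look like, not the authors' implementation: the function name `windowed_nucleotide_content`, the default window size of 5, and the A/C/G/U column order are all hypothetical choices made for illustration.

```python
from typing import List

# Hypothetical column order for the four RNA bases (an assumption, not from the paper).
NUCLEOTIDES = "ACGU"


def windowed_nucleotide_content(seq: str, window: int = 5) -> List[float]:
    """Return per-window base frequencies for an RNA sequence.

    For each window position (sliding one base at a time from 5' to 3'),
    four values are emitted: the fraction of A, C, G and U in that window.
    DNA-style input is accepted by mapping T to U.
    """
    seq = seq.upper().replace("T", "U")
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")

    features: List[float] = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        for base in NUCLEOTIDES:
            features.append(chunk.count(base) / window)
    return features


if __name__ == "__main__":
    # One window covering the whole sequence: 2 A, 1 C, 1 G, 1 U out of 5.
    print(windowed_nucleotide_content("AACGU", window=5))  # [0.4, 0.2, 0.2, 0.2]
```

For a sequence of length L and window size w, this yields 4 × (L − w + 1) values, so sequences of equal length around candidate sites map to feature vectors of equal size.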