i2OM: Toward a better prediction of 2′-O-methylation in human RNA

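Keywords: Overfitting; Discriminant; Feature selection; Computer science; Support vector machine; Artificial intelligence; Random forest; Machine learning; Computational biology; Biology; Artificial neural network
Authors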
Yuhe R. Yang,Cai-Yi Ma,Dong Gao,Xiaowei Liu,Shi-Shi Yuan,Hui Ding
Source
Journal: International Journal of Biological Macromolecules [Elsevier]
Volume/issue: 239: 124247-124247 Cited by: 12
Identifier
DOI:10.1016/j.ijbiomac.2023.124247
Abstract

2′-O-methylation (2OM) is a ubiquitous post-transcriptional modification in RNAs. It is important for the regulation of RNA stability, mRNA splicing and translation, as well as innate immunity. With the increase in publicly available 2OM data, several computational tools have been developed for the identification of 2OM sites in human RNA. Unfortunately, these tools suffer from the low discriminative power of redundant features, unreasonable dataset construction, or overfitting. To address these issues, we developed a two-step feature selection model to identify 2OM, based on data for four types of 2OM: 2OM-adenine (A), cytosine (C), guanine (G), and uracil (U). For each type, one-way analysis of variance (ANOVA) combined with mutual information (MI) was used to rank sequence features and obtain the optimal feature subset. Subsequently, four predictors based on eXtreme Gradient Boosting (XGBoost) or support vector machine (SVM) were built to identify the four types of 2OM sites. The proposed model achieved an overall accuracy of 84.3% on the independent set. For the convenience of users, an online tool called i2OM was constructed and is freely accessible at i2om.lin-group.cn. The predictor may provide a reference for the study of 2OM.
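
The two-step ranking described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how ANOVA and MI are combined, so the sequential scheme here (filter by ANOVA F-score, then re-rank the survivors by MI) is an assumption, not the authors' published procedure. The function names (`anova_f`, `mutual_information`, `two_step_select`), the cut-off parameters, and the toy binary features are all hypothetical, and the XGBoost/SVM classification step is omitted.

```python
import math
from collections import Counter


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = sum(values) / n
    means = {y: sum(g) / len(g) for y, g in groups.items()}
    ss_between = sum(len(g) * (means[y] - grand) ** 2 for y, g in groups.items())
    ss_within = sum((v - means[y]) ** 2 for y, g in groups.items() for v in g)
    if ss_within == 0:
        # Constant feature scores 0; a perfectly separating one scores infinity.
        return math.inf if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def mutual_information(values, labels):
    """Mutual information (bits) between a discrete feature and the labels."""
    n = len(values)
    joint = Counter(zip(values, labels))
    px, py = Counter(values), Counter(labels)
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def two_step_select(X, y, keep_anova, keep_mi):
    """Assumed scheme: keep the top columns by F, then re-rank those by MI."""
    n_features = len(X[0])
    cols = [[row[j] for row in X] for j in range(n_features)]
    by_f = sorted(range(n_features), key=lambda j: anova_f(cols[j], y), reverse=True)
    survivors = by_f[:keep_anova]
    by_mi = sorted(survivors, key=lambda j: mutual_information(cols[j], y), reverse=True)
    return by_mi[:keep_mi]


# Toy data (hypothetical): rows are samples, columns are binary sequence features.
X = [
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [0, 1, 0, 1],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = modified site, 0 = unmodified site

selected = two_step_select(X, y, keep_anova=3, keep_mi=2)
print(selected)  # column indices of the retained features: [0, 2]
```

On this toy data the constant column (index 3) is dropped in the ANOVA step, and the weakly informative column (index 1) is dropped in the MI step. In a real pipeline, the retained columns would then be passed to a classifier and the subset size tuned by cross-validation.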