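Pseudouridine
Computer science
Matthews correlation coefficient
Ensemble forecasting
Test set
Artificial intelligence
Ensemble learning
Computational biology
Bioinformatics
Machine learning
Identification (biology)
Data mining
RNA
Biology
Uridine
Genetics
Support vector machine
Gene
Botany
Authors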
Muhammad Taseer Suleman,Yaser Daanial Khan
Identifier
DOI:10.1016/j.ab.2023.115247
Abstract
Pseudouridine (ψ) is reported to occur frequently in all types of RNA. This uridine modification has been shown to be essential for processes such as RNA stability and stress response. It is also linked to a few human diseases, such as prostate cancer and anemia. A few laboratory techniques, such as Pseudo-seq and N3-CMC-enriched pseudouridine sequencing (CeU-Seq), are used to detect ψ sites, but they are laborious and time-consuming. The availability of sequencing data has enabled the development of computational models that improve ψ site identification. This work presents a prediction model that identifies ψ sites using popular ensemble methods: stacking, bagging, and boosting. Features were obtained through a novel feature extraction mechanism that incorporates statistical moments and were then used to train the ensemble models. Cross-validation and an independent set test were used to evaluate the performance of the trained models. The proposed model outperformed preexisting predictors, achieving 87% accuracy, 0.90 specificity, 0.85 sensitivity, and a Matthews correlation coefficient of 0.75. A web server has been built and is publicly available to researchers at https://taseersuleman-y-test-pseu-pred-c2wmtj.streamlit.app/.
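The four evaluation measures named in the abstract (accuracy, specificity, sensitivity, and the Matthews correlation coefficient) all follow from a binary confusion matrix. The sketch below shows the standard definitions only. The function name `binary_metrics` and the counts in the usage example are hypothetical and chosen for illustration; they are not the authors' code or data.

```python
import math


def _safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, tn, fp, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp: ψ sites predicted as ψ sites (true positives)
    tn: non-ψ sites predicted as non-ψ sites (true negatives)
    fp: non-ψ sites predicted as ψ sites (false positives)
    fn: ψ sites predicted as non-ψ sites (false negatives)
    """
    accuracy = _safe_div(tp + tn, tp + tn + fp + fn)
    sensitivity = _safe_div(tp, tp + fn)  # true positive rate
    specificity = _safe_div(tn, tn + fp)  # true negative rate
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    mcc = _safe_div(tp * tn - fp * fn, mcc_denominator)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not taken from the paper).
    metrics = binary_metrics(tp=85, tn=90, fp=10, fn=15)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, so it stays informative when the positive and negative classes are imbalanced.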