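Generalization
Computer science
Artificial intelligence
Ensemble learning
Deep learning
Binding site
Sequence (biology)
Web server
Sequence motif
Computational biology
Machine learning
Theoretical computer science
Biology
Mathematics
Internet
Genetics
World Wide Web
Mathematical analysis
DNA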
Authors
Zhengfeng Wang, Xiujuan Lei
Source
Journal: Methods
[Elsevier]
Date: 2022-07-08
Volume: 205, pages 179-190
Citations: 7
Identifier
DOI:10.1016/j.ymeth.2022.06.014
Abstract
Circular RNA (circRNA) can exert biological functions by interacting with RNA-binding proteins (RBPs), and several deep learning-based methods have been developed to predict RBP binding sites on circRNA. However, most of these methods identify circRNA-RBP binding sites from a single data resource and cannot provide exact binding sites; they return only a probability value for a sequence fragment. To solve these problems, we propose a binding site localization algorithm that fuses binding sites from multiple databases, and we further design a stacked generalization ensemble deep learning model named CirRBP to identify RBP binding sites on circRNA. CirRBP is trained on the combined binding sites from multiple databases and makes predictions through a weighted aggregation of the predictions of its sub-models. The results show that CirRBP outperforms every sub-model as well as existing online prediction models. To make our research results more accessible, we developed an open-source web application called CRWS (CircRNA-RBP Web Server). The back-end learning model of CRWS is CirRBP, a stacked generalization ensemble learning model built on different deep learning frameworks. Given a full-length circRNA or a fragment sequence and a target RBP, CRWS uses the binding site localization algorithm to analyze the sequence, report the exact potential binding sites of the target RBP on it, and visualize them. In addition, CRWS can discover the most widely distributed motif in each RBP dataset. To date, CRWS is the first significant online tool that uses multi-source data to train models and predict exact binding sites. CRWS is publicly and freely available, with no login requirement, at: http://www.bioinformatics.team.