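Benchmark (surveying)
Computer science
Sequence (biology)
Nucleic acid
Computational biology
Feature (linguistics)
Set (abstract data type)
Representation (politics)
Artificial intelligence
Ribonucleic acid (RNA)
Web server
Data mining
Machine learning
Data set
Theoretical computer science
Biology
Genetics
Gene
Internet
Programming language
Law
Geography
Philosophy
World Wide Web
Politics
Linguistics
Political science
Geodesy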
Authors
Jia-Wei Feng, Ning Wang, Jun Zhang, Bin Liu
Identifier
DOI: 10.1016/j.compbiomed.2022.105940
Abstract
Proteins interact with nucleic acids to regulate the life activities of organisms, so identifying nucleic acid-binding proteins (NABPs) accurately and efficiently is particularly important. Previous studies have proposed several sequence-based computational methods for identifying DNA- and RNA-binding proteins. However, the benchmark datasets used by these methods ignore the real-world proportion of NABPs, and some integration methods combine only traditional machine learning algorithms, which limits their prediction performance. In this study, we propose a sequence-based method called iDRBP-ECHF to predict DNA-binding proteins (DBPs) and RNA-binding proteins (RBPs). We constructed a benchmark dataset that reflects the real-world proportion of positive and negative samples, and used down-sampling to generate three relatively balanced datasets for training iDRBP-ECHF. In addition, we incorporated deep learning algorithms into the framework to obtain a more compact high-level feature representation of the input data. Results on two independent datasets show that iDRBP-ECHF achieves state-of-the-art performance and outperforms the other existing sequence-based DBP and RBP prediction methods. We also set up a web server for iDRBP-ECHF, which can be accessed at http://bliulab.net/iDRBP-ECHF.
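The down-sampling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the three relatively balanced datasets were drawn, so everything below is an assumption for illustration: the function name `make_balanced_subsets`, the two-class labelling (1 for binding, 0 for non-binding), the `ratio` parameter, and the choice to draw disjoint negative subsets at random without replacement are all hypothetical and may differ from what iDRBP-ECHF actually does.

```python
import random


def make_balanced_subsets(positives, negatives, n_subsets=3, ratio=1.0, seed=0):
    """Down-sample the majority (negative) class into several balanced sets.

    Hypothetical helper, not taken from the paper: each returned subset
    contains all positives plus a disjoint random slice of the negatives.
    """
    rng = random.Random(seed)
    per_subset = int(len(positives) * ratio)  # negatives drawn per subset
    if per_subset * n_subsets > len(negatives):
        raise ValueError("not enough negatives for disjoint subsets")

    # Shuffle a copy so the caller's list is left untouched.
    shuffled = list(negatives)
    rng.shuffle(shuffled)

    subsets = []
    for i in range(n_subsets):
        chunk = shuffled[i * per_subset:(i + 1) * per_subset]
        samples = [(seq, 1) for seq in positives] + [(seq, 0) for seq in chunk]
        rng.shuffle(samples)  # mix positives and negatives
        subsets.append(samples)
    return subsets


if __name__ == "__main__":
    # Toy identifiers standing in for protein sequences (made-up data).
    pos = [f"P{i}" for i in range(10)]
    neg = [f"N{i}" for i in range(100)]
    for k, subset in enumerate(make_balanced_subsets(pos, neg)):
        n_pos = sum(label for _, label in subset)
        print(f"subset {k}: {len(subset)} samples, {n_pos} positive")
```

Keeping the negative slices disjoint means each balanced set sees different majority-class examples, which is one common reason to train several models on down-sampled data and combine them afterwards.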