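Support vector machine
Computer science
Artificial intelligence
Artificial neural network
Pseudo amino acid composition
Machine learning
Training set
Random forest
Dipeptide
Amino acid
Chemistry
Biochemistry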
Authors
Poonam Pandey, Vinal Patel, Nithin V. George, Sairam S. Mallajosyula
Identifier
DOI:10.1021/acs.jproteome.8b00322
Abstract
Cell-penetrating peptides (CPPs) facilitate the transport of pharmacologically active molecules, such as plasmid DNA, short interfering RNA, nanoparticles, and small peptides. The accurate identification of new and unique CPPs is the first step toward gaining insight into CPP activity. Experiments can provide detailed insight into the cell-penetration property of CPPs. However, the synthesis and identification of CPPs through wet-lab experiments is both resource- and time-intensive. Therefore, the development of an efficient prediction tool is essential for identifying unique CPPs prior to experiments. To this end, we developed a kernel extreme learning machine (KELM) based CPP prediction model called KELM-CPPpred. The main data set used in this study consists of 408 CPPs and an equal number of non-CPPs. The input features used to train the proposed prediction model include amino acid composition, dipeptide amino acid composition, pseudo amino acid composition, and motif-based hybrid features. We further used an independent data set to validate the proposed model. In addition, we compared the prediction accuracy of KELM-CPPpred models with that of existing artificial neural network (ANN), random forest (RF), and support vector machine (SVM) approaches on the respective benchmark data sets used in previous studies. Empirical tests showed that KELM-CPPpred outperformed the existing prediction approaches based on SVM, RF, and ANN. We developed a web interface named KELM-CPPpred, which is freely available at http://sairam.people.iitgn.ac.in/KELM-CPPpred.html.
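Two of the input features named in the abstract, amino acid composition and dipeptide composition, can be illustrated with a minimal sketch. It assumes the common textbook definitions (fraction of each of the 20 standard residues, and fraction of each of the 400 ordered residue pairs among adjacent positions). The function names, the alphabet ordering, and the example sequence are illustrative assumptions, not taken from the KELM-CPPpred implementation, and the pseudo amino acid composition and motif-based features are not covered.

```python
from itertools import product

# The 20 standard amino acids in one-letter code; this ordering is an
# arbitrary choice made for the sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return a 20-element list: the fraction of each standard residue in seq."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return a 400-element list: the fraction of each ordered residue pair
    among the len(seq) - 1 overlapping adjacent pairs in seq."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    counts = {}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        counts[pair] = counts.get(pair, 0) + 1
    total = len(seq) - 1
    return [counts.get(a + b, 0) / total
            for a, b in product(AMINO_ACIDS, repeat=2)]


# Hypothetical usage on an arginine-rich example sequence.
example = "GRKKRRQRRRPPQ"
aac = amino_acid_composition(example)
dpc = dipeptide_composition(example)
features = aac + dpc  # 420-dimensional vector that a classifier could consume
```

Concatenating the two vectors gives a fixed-length representation regardless of peptide length, which is what allows sequences of different sizes to be fed to a kernel-based classifier.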