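Antifreeze protein
Margin (machine learning)
Peptide
Random forest
Antifreeze
Set (abstract data type)
Computer science
Amino acid
Psychrophile
Task (project management)
Artificial intelligence
Pattern recognition (psychology)
Computational biology
Biology
Chemistry
Machine learning
Biochemistry
Enzyme
Engineering
Organic chemistry
Programming language
Systems engineering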
Authors
Shujaat Khan, Imran Naseem, Roberto Togneri, Mohammed Bennamoun
Identifier
DOI:10.1109/tcbb.2016.2617337
Abstract
In extreme cold weather, living organisms produce Antifreeze Proteins (AFPs) to counter the otherwise lethal intracellular formation of ice. Structures and sequences of various AFPs exhibit a high degree of heterogeneity; consequently, the prediction of AFPs is considered to be a challenging task. In this research, we propose to handle this arduous manifold learning task using the notion of localized processing. In particular, an AFP sequence is segmented into two sub-segments, each of which is analyzed for amino acid and di-peptide compositions. We propose to use only the most significant features, selected using the concept of information gain (IG), followed by a random forest classification approach. The proposed RAFP-Pred achieved an excellent performance on a number of standard datasets. We report a high Youden's index (sensitivity + specificity - 1) value of 0.75 on the standard independent test data set, outperforming AFP-PseAAC, AFP_PSSM, AFP-Pred, and iAFP by margins of 0.05, 0.06, 0.14, and 0.68, respectively. The verification rate on the UniProtKB dataset is found to be 83.19 percent, which is substantially superior to the 57.18 percent reported for the iAFP method.
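The localized feature extraction and the Youden's index described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say where the sequence is split, so the midpoint split here is an assumption, and the function names (`amino_acid_composition`, `dipeptide_composition`, `segment_features`, `youden_index`) are hypothetical. The information gain selection and random forest steps are omitted.

```python
from itertools import product

# The 20 standard amino acids and all 400 ordered residue pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Fraction of each standard residue in seq (20 values)."""
    n = len(seq)
    if n == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [seq.count(aa) / n for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each overlapping residue pair in seq (400 values)."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    n_pairs = max(len(seq) - 1, 0)
    for i in range(n_pairs):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs with non-standard residues are skipped
            counts[pair] += 1
    if n_pairs == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[dp] / n_pairs for dp in DIPEPTIDES]


def segment_features(seq):
    """Concatenate both compositions for two sub-segments of seq.

    Assumption: the sequence is split at its midpoint; the abstract
    only says that it is segmented into two sub-segments.
    """
    seq = seq.upper()
    mid = len(seq) // 2
    features = []
    for segment in (seq[:mid], seq[mid:]):
        features.extend(amino_acid_composition(segment))
        features.extend(dipeptide_composition(segment))
    return features  # 2 * (20 + 400) = 840 values


def youden_index(tp, fn, tn, fp):
    """Youden's index: sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1
```

Under these assumptions each sequence yields 840 raw features, from which a selection step such as information gain would keep only the most informative before classification. As a purely illustrative check with made-up counts (not taken from the paper), `youden_index(8, 2, 9, 1)` combines a sensitivity of 0.8 and a specificity of 0.9 into a value of about 0.7.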