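Antifreeze protein
Chemistry
Amino acid
Sequence (biology)
Web server
Peptide sequence
Computational biology
Biochemistry
Computer science
World Wide Web
Internet
Gene
Biology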
Authors
Reny Pratiwi,Aijaz Ahmad Malik,Nalini Schaduangrat,Virapong Prachayasittikul,Jarl E. S. Wikberg,Chanin Nantasenamat,Watshara Shoombuatong
Abstract
Antifreeze protein (AFP) is an ice-binding protein that protects organisms from freezing in extremely cold environments. AFPs are found across a diverse range of species and therefore differ significantly in their structures. Because no consensus sequences are available for determining the ice-binding domain of AFPs, predicting and characterizing AFPs from their sequence is a challenging task. This study addresses the issue by predicting AFPs directly from sequence on a large set of 478 AFPs and 9,139 non-AFPs using machine learning (e.g., random forest) as a function of interpretable features (e.g., amino acid composition, dipeptide composition, and physicochemical properties). Furthermore, AFPs were characterized using propensity scores and important physicochemical properties via statistical and principal component analysis. The predictive model achieved an accuracy of 88.28%, and the results revealed that AFPs are likely to be composed of hydrophobic amino acids as well as amino acids with hydroxyl and sulfhydryl side chains. The predictive model is provided as a free, publicly available web server called CryoProtect for classifying a query protein sequence as either AFP or non-AFP. The data set and source code for reproducing the results are provided on GitHub.
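As an illustration of two of the interpretable features named in the abstract, the following is a minimal Python sketch of amino acid composition (20 values) and dipeptide composition (400 values). It is not the authors' code: the function names `amino_acid_composition` and `dipeptide_composition` are hypothetical, and the handling of non-standard residues (they are simply skipped) is an assumption made here for the example.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in the sequence.

    Assumption: residues outside the 20 standard codes are ignored.
    """
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    total = len(residues)
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def dipeptide_composition(sequence):
    """Return the fraction of each of the 400 adjacent residue pairs.

    Assumption: a pair is counted only if both residues are standard.
    """
    seq = sequence.upper()
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    if not pairs:
        raise ValueError("sequence contains no standard dipeptides")
    counts = Counter(pairs)
    total = len(pairs)
    return {
        a + b: counts[a + b] / total
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


if __name__ == "__main__":
    # Toy sequence for demonstration only, not a real AFP.
    toy = "AACTTS"
    aac = amino_acid_composition(toy)
    dpc = dipeptide_composition(toy)
    print(len(aac), len(dpc))
```

Concatenating the two dictionaries' values gives a fixed-length vector per protein, which is the kind of input a random forest classifier could be trained on. How the study itself encodes and combines its features is not stated in the abstract.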