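Intrinsically disordered proteins
Computational biology
Protein secondary structure
Sequence (biology)
Protein sequencing
Protein structure
Function (biology)
Folding (DSP implementation)
Matthews correlation coefficient
Chemistry
Protein folding
Protein structure prediction
Peptide sequence
Biology
Biological system
Computer science
Biochemistry
Genetics
Artificial intelligence
Support vector machine
Electrical engineering
Gene
Engineering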
Authors
Deepak Chaurasiya, Rajkrishna Mondal, Tapobrata Lahiri, Asmita Tripathi, Tejas Ghinmine
Identifier
DOI:10.1080/07391102.2023.2290615
Abstract
The discovery of intrinsically disordered proteins (IDPs), and of hybrid proteins that contain intrinsically disordered protein regions (IDPRs) alongside ordered regions, has changed the sequence–structure–function paradigm of proteins. These proteins, which lack a persistent fixed structure, are found in all organisms and play vital roles in various biological processes. Some are considered potential drug targets because of their overrepresentation in pathophysiological processes. The major bottlenecks in characterizing such proteins are their occasional overexpression, the difficulty of obtaining them in a purified, homogeneous form, and the challenge of investigating them experimentally. Sequence-based prediction of intrinsic disorder therefore remains a useful strategy, especially for large-scale proteomic investigations. However, accuracy remains lowest for short disordered regions of fewer than ten residues, for residues close to order–disorder boundaries, for regions that undergo coupled folding and binding in the presence of a partner, and for fully disordered proteins. Annotation of fully disordered proteins mostly relies on far-UV circular dichroism experiments, which give the overall secondary structure composition without residue-level resolution. Current methods, including those that use secondary structure information, failed to predict half of the target IDPs correctly in the recent Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment. This study used profiles of the random sequential appearance of amino acid physicochemical properties and of order- and disorder-promoting amino acids in a protein, together with the existing CIDER feature, to predict IDPs from sequence input. Our method was found to significantly outperform existing predictors across different datasets.
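The abstract does not spell out how the sequential-appearance profiles are computed, so the following is only an illustrative sketch of one way to summarize where order- and disorder-promoting residues occur along a sequence. It is not the authors' feature definition. The residue grouping (disorder-promoting A, R, G, Q, S, P, E, K; order-promoting W, C, F, I, Y, V, L, N; all others neutral) is an assumption based on a commonly cited classification, and the function names `residue_class`, `run_length_profile`, and `class_fractions` are hypothetical.

```python
from itertools import groupby

# Assumed residue grouping (a commonly cited classification, not taken from the paper).
DISORDER_PROMOTING = set("ARGQSPEK")
ORDER_PROMOTING = set("WCFIYVLN")


def residue_class(aa: str) -> str:
    """Map one amino acid letter to 'D' (disorder), 'O' (order), or 'N' (neutral/other)."""
    if aa in DISORDER_PROMOTING:
        return "D"
    if aa in ORDER_PROMOTING:
        return "O"
    return "N"


def run_length_profile(sequence: str) -> list[tuple[str, int]]:
    """Collapse a sequence into consecutive runs of the same residue class.

    Hypothetical helper: each entry is (class label, run length), in sequence order.
    """
    classes = [residue_class(aa) for aa in sequence.upper()]
    return [(label, len(list(group))) for label, group in groupby(classes)]


def class_fractions(sequence: str) -> dict[str, float]:
    """Fraction of residues in each class; all zeros for an empty sequence."""
    classes = [residue_class(aa) for aa in sequence.upper()]
    total = len(classes)
    if total == 0:
        return {"D": 0.0, "O": 0.0, "N": 0.0}
    return {label: classes.count(label) / total for label in ("D", "O", "N")}


if __name__ == "__main__":
    example = "MKKSPEEAWLIVFH"  # made-up sequence, for illustration only
    print(run_length_profile(example))
    print(class_fractions(example))
```

Run lengths and class fractions like these could serve as inputs to a classifier such as a support vector machine, but the feature set actually used in the study, including the CIDER-derived features, would have to be taken from the paper itself.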