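Pseudo amino acid composition
Subfamily
Amino acid
Amphiphilicity
Jackknife resampling
Sequence (biology)
Composition (language)
Biological system
Chemistry
Computer science
Mathematics
Biology
Biochemistry
Statistics
Organic chemistry
Philosophy
Estimator
Gene
Polymer
Linguistics
Dipeptide
Copolymer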
Source
Journal: Bioinformatics
[Oxford University Press]
Date: 2004-08-12
Volume (issue): 21 (1): 10-19
Citations: 912
Identifier
DOI:10.1093/bioinformatics/bth466
Abstract
Motivation: With protein sequences entering databanks at an explosive pace, the early determination of the family or subfamily class of a newly found enzyme molecule becomes important, because it is directly related to which specific target the enzyme acts on, as well as to its catalytic process and biological function. Unfortunately, it is both time-consuming and costly to do so by experiments alone. In a previous study, the covariant-discriminant algorithm was introduced to identify the 16 subfamily classes of oxidoreductases. Although the results were quite encouraging, the entire prediction process was based on the amino acid composition alone, without any sequence-order information. Therefore, it is worthy of further investigation.

Results: To incorporate sequence-order effects into the predictor, the ‘amphiphilic pseudo amino acid composition’ is introduced to represent the statistical sample of a protein. The novel representation contains 20 + 2λ discrete numbers: the first 20 numbers are the components of the conventional amino acid composition; the next 2λ numbers are a set of correlation factors that reflect different hydrophobicity and hydrophilicity distribution patterns along a protein chain. Based on this concept and formulation scheme, a new predictor is developed. The self-consistency test, jackknife test and independent dataset tests show that the success rates obtained by the new predictor are all significantly higher than those of the previous predictors. The significant enhancement in success rates also implies that the distribution of hydrophobicity and hydrophilicity of the amino acid residues along a protein chain plays a very important role in its structure and function.

Contact: kchou@san.rr.com
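
The 20 + 2λ representation described above can be sketched in a few lines of Python. The abstract does not give the formulas, so the details here are assumptions based on the commonly cited formulation: each scale is standardized over the 20 amino acids, the correlation factor at lag k is the average product of scale values k residues apart, and a weight factor (here `weight`) balances the correlation terms against the composition. The function name `amphiphilic_pseaac` and the two toy scales are hypothetical; the toy numbers are placeholders, not published hydrophobicity or hydrophilicity values.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def standardize(scale):
    """Rescale a per-residue property to zero mean and unit variance."""
    values = [scale[a] for a in AMINO_ACIDS]
    mu, sigma = mean(values), pstdev(values)
    return {a: (scale[a] - mu) / sigma for a in AMINO_ACIDS}


def amphiphilic_pseaac(seq, hydrophobicity, hydrophilicity, lam=3, weight=0.5):
    """Return a list of 20 + 2*lam numbers for a protein sequence.

    Assumed formulation (not taken from the abstract): the first 20 entries
    are normalized residue frequencies, the remaining 2*lam entries are
    weighted lag-k correlation factors, alternating between the two scales.
    """
    seq = seq.upper()
    if any(res not in AMINO_ACIDS for res in seq):
        raise ValueError("sequence contains non-standard residues")
    n = len(seq)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")

    h1 = standardize(hydrophobicity)
    h2 = standardize(hydrophilicity)

    # Conventional amino acid composition (20 frequencies).
    freqs = [seq.count(a) / n for a in AMINO_ACIDS]

    # Correlation factors: for each lag k, one value per scale.
    taus = []
    for k in range(1, lam + 1):
        for h in (h1, h2):
            products = (h[seq[i]] * h[seq[i + k]] for i in range(n - k))
            taus.append(sum(products) / (n - k))

    # Joint normalization so that all 20 + 2*lam components sum to 1.
    denom = sum(freqs) + weight * sum(taus)
    return [f / denom for f in freqs] + [weight * t / denom for t in taus]


# Placeholder scales for illustration only (NOT real physicochemical data).
TOY_HYDROPHOBICITY = {a: float(i) for i, a in enumerate(AMINO_ACIDS)}
TOY_HYDROPHILICITY = {a: -float(i % 5) for i, a in enumerate(AMINO_ACIDS)}

vector = amphiphilic_pseaac(
    "ACDEFGHIKLMNPQRSTVWYAC", TOY_HYDROPHOBICITY, TOY_HYDROPHILICITY, lam=3
)
print(len(vector))  # 20 + 2 * 3 components
```

In practice the two dictionaries would be replaced by published hydrophobicity and hydrophilicity scales, and `lam` and `weight` would be tuned on the training data. Because the correlation factors can be negative, individual components may be negative even though the vector sums to 1.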