Sequence (biology)
Computational biology
Chemistry
Type (biology)
Computer science
Biochemistry
Biology
Ecology
Authors
Jiesi Luo, Wenling Li, Zhongyu Liu, Yanzhi Guo, Xuemei Pu, Menglong Li
Source
Journal: Analyst
[The Royal Society of Chemistry]
Date: 2015-01-01
Volume (issue): 140 (9): 3048-3056
Citations: 14
Abstract
Many Gram-negative bacteria use the type I secretion system (T1SS) to translocate a wide range of substrates (type I secreted RTX proteins, T1SRPs) from the cytoplasm across the inner and outer membrane in one step to the extracellular space. Since T1SRPs play an important role in pathogen-host interactions, identifying them is crucial for a full understanding of the pathogenic mechanism of T1SS. However, experimental identification is often time-consuming and expensive. In the post-genomic era, it becomes imperative to predict new T1SRPs using information from the amino acid sequence alone when new proteins are being identified in a high-throughput mode. In this study, we report a two-level method for the first attempt to identify T1SRPs using sequence-derived features and the random forest (RF) algorithm. At the full-length sequence level, the results show that the unique feature of T1SRPs is the presence of variable numbers of the calcium-binding RTX repeats. These RTX repeats have a strong predictive power and so T1SRPs can be well distinguished from non-T1SRPs. At another level, different from that of the secretion signal, we find that a sequence segment located at the last 20-30 C-terminal amino acids may contain important signal information for T1SRP secretion because obvious differences were shown between the corresponding positions of T1SRPs and non-T1SRPs in terms of amino acid and secondary structure compositions. Using five-fold cross-validation, overall accuracies of 97% at the full-length sequence level and 89% at the secretion signal level were achieved through feature evaluation and optimization. Benchmarking on an independent dataset, our method could correctly predict 63 and 66 of 74 T1SRPs at the full-length sequence and secretion signal levels, respectively. We believe that this study will be useful in elucidating the secretion mechanism of T1SS and facilitating hypothesis-driven experimental design and validation.
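The secretion-signal level described in the abstract compares amino acid composition over the last 20-30 C-terminal residues. The following is a minimal sketch of how such a feature vector might be computed, not the authors' implementation. The function names `c_terminal_composition` and `recall`, the 30-residue default window, and the toy sequence are all assumptions made for illustration. The last two lines only restate the independent-dataset counts from the abstract (63 and 66 of 74) as fractions.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every feature vector
# has the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def c_terminal_composition(sequence, window=30):
    """Return amino acid fractions over the last `window` residues.

    Hypothetical helper: the window size is an assumption based on the
    "last 20-30 C-terminal amino acids" mentioned in the abstract.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    segment = sequence.upper()[-window:]
    counts = Counter(segment)
    # Only standard residues contribute; ambiguous codes such as X are ignored.
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def recall(true_positives, positives):
    """Fraction of known positives that were predicted correctly."""
    return true_positives / positives


# Toy sequence, made up for illustration only (not a real T1SRP).
toy = "MKT" + "GGAGNDTLY" * 4 + "DDLLSAVA"
features = c_terminal_composition(toy, window=30)
print(len(features), round(sum(features), 6))

# Independent-dataset counts reported in the abstract, as fractions of 74.
print(f"{recall(63, 74):.1%}", f"{recall(66, 74):.1%}")
```

Vectors like these, one per protein, could then be passed to any random forest implementation (for example scikit-learn's `RandomForestClassifier`) and evaluated with five-fold cross-validation. The abstract does not specify the exact feature set or software used, so that pairing is also an assumption.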