Deep learning
Computer science
Artificial intelligence
Base (topology)
Machine learning
Mathematics
Mathematical analysis
Authors
Rong Huang, Hejian Zhang, Min Wu, Zhiyue Men, Huanyu Chu, Jie Bai, Hong Chang, Jian Cheng, Xiaoping Liao, Yuwan Liu, Yajian Song, Huifeng Jiang
Source
Journal: PubMed
Date: 2024-12-25
Volume (issue): 40 (12): 4670-4681
Identifier
DOI: 10.13345/j.cjb.240255
Abstract
The structures and activities of enzymes are influenced by the pH of their environment. Understanding and distinguishing the mechanisms by which enzymes adapt to extreme pH values is of great significance for elucidating their molecular mechanisms and promoting their industrial application. In this study, the ESM-2 protein language model was used to encode secreted microbial proteins that perform optimally above pH 9 or below pH 5, a dataset comprising 47,725 high-pH protein sequences and 66,079 low-pH protein sequences, respectively. A deep learning model was then constructed to identify protein acid-base tolerance from amino acid sequences. The model achieved significantly higher accuracy than other methods, with an overall accuracy of 94.8%, a precision of 91.8%, and a recall of 93.4% on the test set. Furthermore, we built a website (https://enzymepred.biodesign.ac.cn) that lets users predict acid-base tolerance by submitting the protein sequences of enzymes. This study should help accelerate the application of enzymes in fields including biotechnology, pharmaceuticals, and chemicals, and it provides a powerful tool for the rapid screening and optimization of industrial enzymes.
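As a rough illustration of the labeling thresholds and the evaluation metrics named in the abstract, the following minimal Python sketch assigns a class from an optimal pH value and computes accuracy, precision, and recall for a binary classifier. The function names, the choice of the high-pH class as the positive class, and the toy labels are all assumptions made for illustration; they are not taken from the paper and do not reproduce its reported figures.

```python
from typing import Dict, Optional, Sequence


def label_by_optimal_ph(optimal_ph: float) -> Optional[int]:
    """Hypothetical labeling rule based on the thresholds in the abstract.

    Returns 1 for high-pH proteins (optimum above pH 9), 0 for low-pH
    proteins (optimum below pH 5), and None for anything in between.
    Treating high-pH as the positive class is an assumption.
    """
    if optimal_ph > 9:
        return 1
    if optimal_ph < 5:
        return 0
    return None


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, and recall for 0/1 labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a class is never predicted/present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels, invented purely for illustration (not data from the study).
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
toy_metrics = binary_metrics(toy_true, toy_pred)
print(toy_metrics)
```

Precision and recall depend on which class is treated as positive, so the same predictions would give different values if the low-pH class were chosen instead.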