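Artificial intelligence
Deep learning
Function (biology)
Extraction (chemistry)
Computer science
Machine learning
Biology
Chemistry
Chromatography
Evolutionary biology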
Authors
Youqing Wang, Yifei Zhang, Yue Feng, Haoqian Wang, Xiao‐Fan Lin, Xin Ma, Yifei Zhang
Identifier
DOI:10.1101/2024.10.16.618767
Abstract
Accurately predicting the functions of peptides and proteins from their amino acid sequences is essential for understanding life processes and advancing biomolecule engineering. Because experimental procedures are time-consuming and resource-intensive, computational approaches, especially those based on machine learning frameworks, have garnered significant interest. However, many existing machine learning tools are limited to specific tasks and lack adaptability across different predictions. Here we propose BBATProt, a versatile framework for predicting a range of protein and peptide functions. BBATProt employs transfer learning with a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model to capture high-dimensional features from amino acid sequences. The custom-designed network, which integrates Bidirectional Long Short-Term Memory (Bi-LSTM) and Temporal Convolutional Networks (TCN), can align with the spatial characteristics of proteins. It combines local and global feature extraction through attention mechanisms for precise functional prediction. This approach ensures that key features are adaptively extracted and balanced across diverse tasks. Comprehensive evaluations show that BBATProt outperforms state-of-the-art models in predicting functions such as hydrolytic catalysis, peptide activity, and post-translational modification sites. Visualizations of feature evolution and refinement via attention mechanisms validate the framework's interpretability, providing transparency into the evolutional process and offering deeper insights into function prediction.
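The abstract mentions combining per-residue features through attention mechanisms. As a rough illustration only, and not the BBATProt implementation, the sketch below shows generic scaled dot-product attention pooling over a sequence of feature vectors in plain Python. The function names, the `query` vector (which stands in for a learned parameter), and the toy feature values are all assumptions made for this example; the abstract does not specify the attention variant actually used.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Pool per-residue feature vectors into a single vector.

    features: list of equal-length vectors, one per sequence position
              (hypothetical stand-ins for encoder outputs).
    query:    hypothetical learned vector of the same length.
    Returns (pooled_vector, attention_weights).
    """
    dim = len(query)
    # Scaled dot-product score for each position.
    scores = [
        sum(f[i] * query[i] for i in range(dim)) / math.sqrt(dim)
        for f in features
    ]
    weights = softmax(scores)
    # Weighted sum of the feature vectors.
    pooled = [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(dim)
    ]
    return pooled, weights


# Toy input: three positions with two-dimensional features (made-up values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
pooled, weights = attention_pool(features, query)
```

The returned weights sum to 1 and indicate how strongly each position contributes to the pooled vector, which is the kind of quantity that attention visualizations typically display.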