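Discriminative
Artificial intelligence
Transformer
Benchmark (surveying)
Computer science
Generalization
Source code
Machine learning
End-to-end principle
Chemistry
Mathematical analysis
Physics
Mathematics
Geodesy
Quantum mechanics
Geography
Operating system
Voltage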
Authors
Hao Cheng, B. Dharma Rao, Lei Lü, Lizhen Cui, Guobao Xiao, Ran Su, Leyi Wei
Identifier
DOI:10.1021/acs.analchem.1c00354
Abstract
Peptide detectability is fundamentally important in shotgun proteomics experiments. Many computational methods currently predict peptide detectability from sequence composition or physicochemical properties, but each has shortcomings. Here, we present PepFormer, a novel end-to-end Siamese network coupled with a hybrid architecture of a Transformer and gated recurrent units, which predicts peptide detectability from peptide sequences alone. Specifically, for the first time, we use contrastive learning and construct a new loss function for model training, greatly improving the generalization ability of our predictive model. Comparative results demonstrate that our model performs significantly better than state-of-the-art methods on benchmark data sets for two species (Homo sapiens and Mus musculus). To make the model more interpretable, we further investigate the embedded representations of peptide sequences learned automatically by our model; the visualization results indicate that the model efficiently captures highly latent discriminative information, which improves predictive performance. In addition, our model shows a strong capacity for cross-species transfer learning and adaptation, demonstrating its potential for robust prediction of peptide detectability across different species. The source code of our proposed method is available at https://github.com/WLYLab/PepFormer.
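The abstract does not spell out the new loss function, so the following is only a minimal sketch of the general idea behind contrastive training of a Siamese network. It uses the standard margin-based contrastive loss as a stand-in, not the authors' loss. The names `encode`, `contrastive_loss`, `max_len`, and `margin` are hypothetical and chosen for illustration; the embeddings are plain lists standing in for the outputs of the two weight-sharing branches.

```python
import math

# The 20 standard amino acids; index 0 is reserved for padding (an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode(peptide, max_len=30):
    """Map a peptide string to integer tokens, right-padded with 0.

    Residues outside the 20 standard amino acids are mapped to 0 as well.
    """
    tokens = [_INDEX.get(aa, 0) for aa in peptide[:max_len]]
    return tokens + [0] * (max_len - len(tokens))


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(emb_a, emb_b, same_label, margin=2.0):
    """Standard margin-based contrastive loss for one pair of embeddings.

    Pairs with the same label are pulled together (loss grows with distance);
    pairs with different labels are pushed apart until they are at least
    `margin` away from each other.
    """
    d = euclidean(emb_a, emb_b)
    if same_label:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


if __name__ == "__main__":
    print(encode("ACD", max_len=5))
    print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_label=True))
    print(contrastive_loss([0.0, 0.0], [1.0, 0.0], same_label=False))
```

In a Siamese setup, both peptides of a pair would pass through the same encoder, and a loss of this kind would be computed on the resulting embeddings, typically alongside a classification loss. How PepFormer combines these terms is defined in the paper and the linked repository, not here.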