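Computer science
Identification (biology)
Artificial intelligence
Test data
Matching (statistics)
Sample (material)
Experimental data
Identifier
Pattern recognition (psychology)
Computational biology
Machine learning
Chemistry
Statistics
Mathematics
Biology
Chromatography
Botany
Programming language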
Authors
Jianbai Ye, Xiangnan He, Shujuan Wang, Meng-Qiu Dong, Feng Wu, Shan Lu, Fuli Feng
Identifier
DOI:10.1021/acs.jproteome.3c00229
Abstract
In bottom-up proteomics, peptide-spectrum matching is critical for peptide and protein identification. Recently, deep learning models have been used to predict tandem mass spectra of peptides, enabling the calculation of similarity scores between the predicted and experimental spectra for peptide-spectrum matching. These models follow the supervised learning paradigm, which trains a general model using paired peptides and spectra from standard data sets and then applies the model directly to experimental data. However, this approach can lead to inaccurate predictions because the training data and the experimental data can differ in sample type, enzyme specificity, and instrument calibration. To tackle this problem, we developed PepT3, a test-time training paradigm that adapts the pretrained model to generate models specific to the experimental data. PepT3 yields a 10–40% increase in peptide identification, depending on the variability between the training and experimental data. Intriguingly, when applied to a patient-derived immunopeptidomic sample, PepT3 increases the identification of tumor-specific immunopeptide candidates by 60%. Two-thirds of the newly identified candidates are predicted to bind to the patient's human leukocyte antigen isoforms. To facilitate access to the model and all the results, we have archived all the intermediate files on Zenodo.org with identifier 8231084.
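To make the scoring step concrete, the following is a minimal sketch of how a similarity score between a predicted and an experimental spectrum might be used to rank candidate peptides. It is not PepT3's implementation: the abstract does not name the similarity measure, so cosine similarity is assumed here, and the function names, the peptide strings, and the four-bin intensity vectors are all hypothetical. Both spectra are assumed to be already binned onto the same m/z grid.

```python
import math


def cosine_similarity(predicted, experimental):
    """Cosine similarity between two intensity vectors on a shared m/z grid.

    Assumption: cosine similarity stands in for the (unspecified) score.
    """
    if len(predicted) != len(experimental):
        raise ValueError("spectra must be binned onto the same m/z grid")
    dot = sum(p * e for p, e in zip(predicted, experimental))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_e = math.sqrt(sum(e * e for e in experimental))
    if norm_p == 0.0 or norm_e == 0.0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_p * norm_e)


def best_match(experimental, candidates):
    """Return (peptide, score) for the candidate whose predicted spectrum
    is most similar to the experimental spectrum."""
    scored = {
        peptide: cosine_similarity(predicted, experimental)
        for peptide, predicted in candidates.items()
    }
    best = max(scored, key=scored.get)
    return best, scored[best]


# Hypothetical toy data: one experimental spectrum, two candidate peptides
# with model-predicted intensities on the same four bins.
experimental = [0.0, 0.9, 0.1, 0.5]
candidates = {
    "PEPTIDEK": [0.0, 1.0, 0.0, 0.5],
    "PEPTLDEK": [0.8, 0.1, 0.6, 0.0],
}
peptide, score = best_match(experimental, candidates)
print(peptide, round(score, 3))
```

In this picture, a mismatch between training and experimental conditions would show up as systematically lower scores for correct peptides, which is the gap the test-time adaptation described above is meant to narrow.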