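Computer science
Leverage (statistics)
Transformer
Tandem
Fragmentation (computing)
Collision
Mass spectrum
Graph
Tandem mass spectrometry
Artificial intelligence
Algorithm
Data mining
Machine learning
Pattern recognition (psychology)
Theoretical computer science
Mass spectrometry
Chemistry
Physics
Voltage
Materials science
Computer security
Chromatography
Quantum mechanics
Operating system
Composite material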
Authors
Adamo Young, Bo Wang, Hannes Röst
Source
Journal: Cornell University - arXiv
Date: 2021-01-01
Citations: 5
Identifier
DOI:10.48550/arxiv.2111.04824
Abstract
Tandem mass spectra capture fragmentation patterns that provide key structural information about a molecule. Although mass spectrometry is applied in many areas, the vast majority of small molecules lack experimental reference spectra. For over seventy years, spectrum prediction has remained a key challenge in the field. Existing deep learning methods do not leverage global structure in the molecule, potentially resulting in difficulties when generalizing to new data. In this work we propose a new model, MassFormer, for accurately predicting tandem mass spectra. MassFormer uses a graph transformer architecture to model long-distance relationships between atoms in the molecule. The transformer module is initialized with parameters obtained through a chemical pre-training task, then fine-tuned on spectral data. MassFormer outperforms competing approaches for spectrum prediction on multiple datasets, and is able to recover prior knowledge about the effect of collision energy on the spectrum. By employing gradient-based attribution methods, we demonstrate that the model can identify relationships between fragment peaks. To further highlight MassFormer's utility, we show that it can match or exceed existing prediction-based methods on two spectrum identification tasks. We provide open-source implementations of our model and baseline approaches, with the goal of encouraging future research in this area.