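Chemistry
Metabolomics
Set (abstract data type)
Computational model
Variation (astronomy)
Data set
Support vector machine
Computer science
Data mining
Artificial intelligence
Chromatography
Biological system
Astrophysics
Biology
Physics
Programming language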
Authors
Zixuan Zhang,Huaxu Yu,Ethan Wong-Ma,Pouneh Dokouhaki,Ahmed Mostafa,Jay Shavadia,Fang Wu,Tao Huan
Identifier
DOI:10.1021/acs.analchem.3c04046
Abstract
Processing liquid chromatography–mass spectrometry-based metabolomics data using computational programs often introduces additional quantitative uncertainty, termed computational variation in a previous work. This work develops a computational solution to automatically recognize metabolic features with computational variation in a metabolomics data set. This tool, AVIR (short for "Accurate eValuation of alIgnment and integRation"), is a support vector machine-based machine learning strategy (https://github.com/HuanLab/AVIR). The rationale is that metabolic features with computational variation have a poor correlation between chromatographic peak area and peak height-based quantifications across the samples in a study. AVIR was trained on a set of 696 manually curated metabolic features and achieved an accuracy of 94% in a 10-fold cross-validation. When tested on various external data sets from public metabolomics repositories, AVIR demonstrated an accuracy range of 84%–97%. Finally, tested on a large-scale metabolomics study, AVIR clearly indicated features with computational variation and thus guided us to manually correct them. Our results show that 75.3% of the samples with computational variation had a relative intensity difference of over 20% after correction. This demonstrates the critical role of AVIR in reducing computational variation to improve quantitative certainty in untargeted metabolomics analysis.
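The rationale stated in the abstract, that features with computational variation show a poor correlation between peak area and peak height across samples, can be illustrated with a minimal sketch. This is not AVIR's implementation: AVIR is a trained support vector machine, whereas the code below only applies a plain Pearson correlation with a cutoff. The function names, the `min_r` threshold of 0.9, and the example intensities are all hypothetical and chosen purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant series carries no correlation information.
        return 0.0
    return cov / sqrt(var_x * var_y)


def flag_features(features, min_r=0.9):
    """Return names of features whose area/height correlation is below min_r.

    `features` maps a feature name to (peak_areas, peak_heights), each
    measured across the same samples in the same order.
    `min_r` is a hypothetical cutoff, not a value taken from the paper.
    """
    return [
        name
        for name, (areas, heights) in features.items()
        if pearson(areas, heights) < min_r
    ]


# Made-up intensities for two features across four samples.
features = {
    # Area and height agree: both quantifications tell the same story.
    "feature_A": ([100.0, 200.0, 300.0, 400.0], [10.0, 20.0, 30.0, 40.0]),
    # Area and height disagree: a candidate for manual inspection.
    "feature_B": ([100.0, 200.0, 300.0, 400.0], [40.0, 10.0, 35.0, 12.0]),
}

flagged = flag_features(features)
print(flagged)
```

A fixed correlation cutoff is only a stand-in for the idea. According to the abstract, the actual tool learns its decision from 696 manually curated features rather than relying on a single threshold.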