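Artificial intelligence
Computer science
Machine learning
Feature selection
Overfitting
Metabolomics
Naive Bayes classifier
Context (archaeology)
Interpretability
Support vector machine
Biomarker discovery
Decision tree
Pattern recognition (psychology)
Artificial neural network
Bioinformatics
Proteomics
Biology
Chemistry
Gene
Biochemistry
Paleontology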
Authors
Julien Boccard, Alexandros Kalousis, Mélanie Hilario, Pierre Lantéri, Mohamed Hanafi, Gérard Mazerolles, Jean‐Luc Wolfender, Pierre‐Alain Carrupt, Serge Rudaz
Identifier
DOI:10.1016/j.chemolab.2010.03.003
Abstract
Metabolomics experiments involve the simultaneous detection of a large number of metabolites, leading to large multivariate datasets, and computer-based applications are required to extract relevant biological information. A high-throughput metabolic fingerprinting approach based on ultra performance liquid chromatography (UPLC) and high-resolution time-of-flight (TOF) mass spectrometry (MS) was developed for the detection of wound biomarkers in the model plant Arabidopsis thaliana. High-dimensional data were generated and analysed with chemometric methods. Machine learning classification algorithms are also promising tools for deciphering complex metabolic phenotypes, but their application remains scarcely reported in this research field. The present work proposes a comparative evaluation of a diverse set of machine learning schemes on metabolomic data, with respect to their ability to provide deeper insight into the metabolite network involved in the wound response. Standalone classifiers, i.e. J48 (decision tree), kNN (instance-based learner), SMO (support vector machine), multilayer perceptron and RBF network (neural networks) and Naive Bayes (probabilistic method), as well as combinations of classification and feature selection algorithms, such as Information Gain, RELIEF-F, Correlation-based Feature Selection and SVM-based methods, are assessed side by side, and cross-validation resampling procedures are used to avoid overfitting. This study demonstrates that machine learning methods are valuable tools for the analysis of UPLC-TOF/MS metabolomic data. Remarkable performance was achieved, and the stability of the models indicated their robustness and potential for interpretation. The results drew attention to both temporal and spatial metabolic patterns in the context of stress signalling and highlighted relevant biomarkers not revealed by standard data treatment.
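The abstract's pairing of feature selection with a classifier under cross-validation can be illustrated with a small sketch. This is not the authors' pipeline: the data are synthetic, the information-gain score uses a simple median split, the classifier is a plain kNN, and every function name here (`make_toy_data`, `select_features`, `cross_validate`, and so on) is hypothetical. The point it tries to show is that features are re-selected inside each training fold, so held-out samples never influence which features are kept.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one numeric feature after a median split (a simplification)."""
    threshold = sorted(values)[len(values) // 2]
    left = [y for v, y in zip(values, labels) if v < threshold]
    right = [y for v, y in zip(values, labels) if v >= threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))


def select_features(X, y, k):
    """Indices of the k features with the highest information gain."""
    scores = [information_gain([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    nearest = sorted(zip(X_train, y_train), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_validate(X, y, n_folds=5, n_features=5, seed=0):
    """Accuracy from k-fold cross-validation with per-fold feature selection."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        # Selection uses training samples only, so the test fold stays unseen.
        keep = select_features(X_tr, y_tr, n_features)
        X_tr_sel = [[row[j] for j in keep] for row in X_tr]
        for i in fold:
            x = [X[i][j] for j in keep]
            correct += int(knn_predict(X_tr_sel, y_tr, x) == y[i])
    return correct / len(X)


def make_toy_data(n_samples=60, n_features=50, n_informative=3, seed=1):
    """Synthetic two-class data: the first n_informative features carry the signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += 4.0 * label  # assumed effect size, chosen for illustration
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print("Top features on all data:", select_features(X, y, 3))
    print("Cross-validated accuracy:", cross_validate(X, y))
```

Selecting features once on the full dataset and then cross-validating only the classifier would tend to give optimistic accuracy estimates, which is the kind of overfitting the resampling procedure is meant to guard against.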