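Chemistry
Convolutional neural network
Artificial intelligence
High-performance liquid chromatography
Data set
Mean squared prediction error
Generalization
Deep learning
Retention time
Chromatography
Artificial neural network
Pattern recognition (psychology)
Machine learning
Computer science
Mathematics
Mathematical analysis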
Authors
E. S. Fedorova, Dmitriy D. Matyushin, I. V. Plyushchenko, А. Н. Ставрианиди, А. К. Buryak
Identifier
DOI:10.1016/j.chroma.2021.462792
Abstract
Retention time prediction in high-performance liquid chromatography (HPLC) is the subject of many studies because it can improve the identification of unknown molecules in untargeted profiling using HPLC coupled with high-resolution mass spectrometry. Many approaches to retention time prediction in liquid chromatography have been developed; they differ in the number of molecules covered, the molecular properties considered, and the machine learning algorithms used. The recently built large retention time data set of standard compounds from the Metabolite and Chemical Entity Database (METLIN) allows researchers to create a model for predicting the retention times of small molecules with a wide variety of structures and physicochemical properties. The ability to predict retention times using this data set, the largest of those considered, was studied for different deep learning architectures trained on molecular fingerprints and on SMILES (a string representation of a molecule) encoded as one-hot matrices. The best result was achieved with a one-dimensional convolutional neural network (1D CNN) that takes SMILES as input. The proposed model reached a mean absolute error of 34.7 s and a median absolute error of 18.7 s, outperforming the results previously obtained for this data set. The 1D CNN pre-trained on the METLIN SMRT data set was transferred to five other data sets to evaluate its generalization ability.
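The abstract mentions two ideas that a short sketch may make more concrete: encoding a SMILES string as a one-hot matrix, and summarizing prediction quality with the mean and median absolute error. The Python below is only an illustration under stated assumptions, not the authors' code. It assumes character-level tokenization (so a two-letter atom such as Cl is split into two symbols), zero padding up to a fixed length, and a vocabulary built from the example strings; the function names `build_vocab`, `one_hot_smiles`, and `absolute_error_summary` are hypothetical, and the retention times in the usage lines are made-up numbers.

```python
from statistics import mean, median


def build_vocab(smiles_list):
    """Map every character seen in the SMILES strings to a column index.

    Assumption: character-level tokens; the abstract does not say how
    SMILES were tokenized.
    """
    chars = sorted({ch for s in smiles_list for ch in s})
    return {ch: i for i, ch in enumerate(chars)}


def one_hot_smiles(smiles, vocab, max_len):
    """Return a max_len x len(vocab) matrix with one 1 per SMILES character.

    Strings longer than max_len are truncated; shorter ones are padded
    with all-zero rows. A character missing from vocab raises KeyError.
    """
    matrix = [[0] * len(vocab) for _ in range(max_len)]
    for pos, ch in enumerate(smiles[:max_len]):
        matrix[pos][vocab[ch]] = 1
    return matrix


def absolute_error_summary(y_true, y_pred):
    """Return (mean absolute error, median absolute error)."""
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return mean(errors), median(errors)


# Illustrative usage with toy inputs (not data from the study).
vocab = build_vocab(["CCO", "C=O"])
encoded = one_hot_smiles("CCO", vocab, max_len=4)
mae, medae = absolute_error_summary([100, 200, 300], [110, 190, 330])
```

A matrix of this shape (sequence position by symbol) is the kind of input a 1D convolution can slide over along the position axis. Reporting the median alongside the mean is useful because a few badly predicted compounds inflate the mean much more than the median.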