ESM4AO: A Confident Learning and Protein Language Model Based Predictor for Antioxidative Peptides Screening
Chemistry
Computer Science
Natural Language Processing
Authors
Ruihao Zhang, Yonghui Li, Yang Li, Hui Zhang
Identifier
DOI:10.2139/ssrn.4825353
Abstract
Antioxidative peptides can mitigate the detrimental effects of free radicals on human health and prevent food oxidation, thereby prolonging shelf life. This study develops an antioxidative peptide prediction model, ESM4AO, that combines confident learning with evolutionary scale modeling (ESM-2) embeddings. The ESM4AO model achieves a balanced accuracy of 0.948±0.018 and a Matthews correlation coefficient (MCC) of 0.892±0.033, which are 17.91% and 46.71% higher, respectively, than those of the state-of-the-art (SOTA) model. The study also tests six traditional peptide embedding methods; compared with these, the combination of confident learning and ESM-2 embeddings yields superior predictive performance. The MCC of ESM4AO is 82.79% higher than that of the best model trained without confident learning. The model developed in this study therefore provides a useful reference for the precise and batch screening of antioxidative peptides.
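
For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how balanced accuracy and the Matthews correlation coefficient are conventionally computed from a binary confusion matrix. The function names and the example counts are hypothetical and chosen only for illustration; they are not taken from the study, and this is not the authors' evaluation code.

```python
from math import sqrt


def balanced_accuracy(tp, tn, fp, fn):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts (NOT from the paper): 100 antioxidative peptides and
# 100 non-antioxidative peptides in a held-out test set.
tp, tn, fp, fn = 90, 95, 5, 10
ba = balanced_accuracy(tp, tn, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(f"balanced accuracy = {ba:.3f}, MCC = {mcc:.3f}")
```

Balanced accuracy ranges from 0 to 1, while the MCC ranges from -1 to 1, which is why the same model can show a larger relative improvement in MCC than in balanced accuracy.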