Robustness investigation of variable selection in NIR models of critical quality attributes coupled with different simulated noises, assessed by prediction capability and reproducibility
Variable selection is critical in multivariate calibration: it identifies the characteristic variables of critical quality attributes, which improves model performance and makes the selected variables interpretable. However, classical variable selection methods were developed and optimized with respect to prediction error, and their robustness has rarely been evaluated. In this study, the robustness of four variable selection methods was investigated by adding different types of simulated noise either to the validation set only or to both the calibration and validation sets. Reproducibility and the root mean squared error of prediction (RMSEP) were used together as a common measure of robustness. The four methods were evaluated on two near-infrared (NIR) datasets: the open-source corn dataset and a Chinese herbal medicine (CHM) dataset. The results showed that variable importance in projection (VIP) was substantially more robust to additive noise, with a smaller RMSEP and high reproducibility. This provides a novel strategy for evaluating the reliability of variable selection methods in NIR models of critical quality attributes.
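The two robustness measures named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`add_additive_noise`, `rmsep`, `selection_overlap`), the Gaussian noise model, and the noise level are assumptions made for illustration. In particular, the abstract does not define how reproducibility is computed, so the Jaccard overlap of selected variable indices used here is only a hypothetical stand-in.

```python
import numpy as np


def add_additive_noise(X, sigma, rng):
    """Return a copy of the spectra X with zero-mean Gaussian noise added.

    Assumption: additive noise is modeled as i.i.d. Gaussian with standard
    deviation sigma; the study may use other noise types.
    """
    X = np.asarray(X, dtype=float)
    return X + rng.normal(0.0, sigma, size=X.shape)


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction over a validation set."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def selection_overlap(selected_a, selected_b):
    """Jaccard overlap between two sets of selected variable indices.

    Hypothetical stand-in for reproducibility: 1.0 means the same variables
    were selected in both runs, 0.0 means no variables in common.
    """
    a, b = set(selected_a), set(selected_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Synthetic "spectra": 60 samples, 20 variables, 3 of them informative.
    X = rng.normal(size=(60, 20))
    informative = [2, 7, 11]
    y = X[:, informative] @ np.array([1.0, -0.5, 2.0])

    X_cal, y_cal = X[:40], y[:40]
    X_val, y_val = X[40:], y[40:]

    # Ordinary least squares on the selected variables (a simple placeholder
    # for the calibration model; the study itself uses NIR calibration models).
    coef, *_ = np.linalg.lstsq(X_cal[:, informative], y_cal, rcond=None)

    # Compare prediction error on clean and noise-corrupted validation spectra.
    X_val_noisy = add_additive_noise(X_val, sigma=0.1, rng=rng)
    print("RMSEP clean:", rmsep(y_val, X_val[:, informative] @ coef))
    print("RMSEP noisy:", rmsep(y_val, X_val_noisy[:, informative] @ coef))

    # Compare the variables selected in two hypothetical runs.
    print("Overlap:", selection_overlap(informative, [2, 7, 15]))
```

Under these assumptions, a robust selection method would show a small increase in RMSEP when noise is added and a high overlap between the variables selected with and without noise.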