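Chemometrics
Partial least squares regression
Calibration
Set (abstract data type)
Data set
Design of experiments
Computer science
Principal component analysis
Multivariate statistics
Nitric acid
Linear regression
Biological system
Chemistry
Artificial intelligence
Statistics
Machine learning
Mathematics
Biology
Inorganic chemistry
Programming language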
Authors
Luke R. Sadergaski, Gretchen Toney, Lætitia H. Delmau, Kristian Myhre
Identifier
DOI:10.1177/0003702820987281
Abstract
Implementing remote, real-time spectroscopic monitoring of radiochemical processing streams in hot cell environments requires efficiency and simplicity. The success of optical spectroscopy for the quantification of species in chemical systems depends heavily on representative training sets and suitable validation sets. Selecting a training set (i.e., calibration standards) to build multivariate regression models is both time- and resource-consuming using standard one-factor-at-a-time approaches. This study describes the use of experimental design to generate spectral training sets and a validation set for the quantification of sodium nitrate (0–1 M) and nitric acid (0.1–10 M) using the near-infrared water band centered at 1440 nm. Partial least squares regression models were built from training sets generated by both D- and I-optimal experimental designs and a one-factor-at-a-time approach. The prediction performance of each model was evaluated by comparing the bias and standard error of prediction for statistical significance. D- and I-optimal designs reduced the number of samples required to build regression models compared with one-factor-at-a-time while also improving performance. Models must be confirmed against a validation sample set when minimizing the number of samples in the training set. The D-optimal design performed the best when considering both performance and efficiency, improving predictive capability and reducing the number of samples in the training set by 64% compared with the one-factor-at-a-time approach. The experimental design approach objectively selects calibration and validation spectral data sets based on statistical criteria to optimize performance and minimize resources.
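The abstract compares models by their bias and standard error of prediction (SEP) but does not give the formulas. The sketch below assumes the definitions commonly used in chemometrics: bias is the mean of the prediction residuals on the validation set, and SEP is the bias-corrected standard deviation of those residuals. The function name `bias_and_sep` and the concentration values are hypothetical, chosen only for illustration; they are not taken from the study.

```python
import numpy as np


def bias_and_sep(y_ref, y_pred):
    """Return (bias, SEP) for validation-set predictions.

    Assumed definitions (not stated in the abstract):
      bias = mean(y_pred - y_ref)
      SEP  = sqrt(sum((residual - bias)^2) / (n - 1))
    """
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_ref.shape != y_pred.shape or y_ref.size < 2:
        raise ValueError("need two equal-length arrays with at least 2 samples")

    residuals = y_pred - y_ref
    bias = residuals.mean()
    # Bias-corrected spread of the residuals, with n - 1 degrees of freedom.
    sep = np.sqrt(np.sum((residuals - bias) ** 2) / (residuals.size - 1))
    return bias, sep


# Hypothetical nitric acid concentrations (M): reference vs. model prediction.
reference = [0.10, 2.00, 5.00, 8.00, 10.00]
predicted = [0.15, 1.95, 5.10, 7.90, 10.05]

bias, sep = bias_and_sep(reference, predicted)
print(f"bias = {bias:.4f} M, SEP = {sep:.4f} M")
```

Computing both values on the same validation set for each candidate model (D-optimal, I-optimal, one-factor-at-a-time) gives the quantities that the abstract says were compared for statistical significance; the significance test itself is not shown here.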