计算机科学
校准
不确定度量化
Python(编程语言)
机器学习
实验数据
数据挖掘
过程(计算)
实验设计
单变量
工作流程
贝叶斯概率
人工智能
统计
数学
多元统计
数据库
操作系统
作者
Laura M. Helleckes,Michael Osthege,Wolfgang Wiechert,Eric von Lieres,Marco Oldiges
标识
DOI:10.1371/journal.pcbi.1009223
摘要
High-throughput experimentation has revolutionized data-driven experimental sciences and opened the door to the application of machine learning techniques. Nevertheless, the quality of any data analysis strongly depends on the quality of the data and specifically the degree to which random effects in the experimental data-generating process are quantified and accounted for. Accordingly calibration, i.e. the quantitative association between observed quantities and measurement responses, is a core element of many workflows in experimental sciences. Particularly in life sciences, univariate calibration, often involving non-linear saturation effects, must be performed to extract quantitative information from measured data. At the same time, the estimation of uncertainty is inseparably connected to quantitative experimentation. Adequate calibration models that describe not only the input/output relationship in a measurement system but also its inherent measurement noise are required. Due to its mathematical nature, statistically robust calibration modeling remains a challenge for many practitioners, at the same time being extremely beneficial for machine learning applications. In this work, we present a bottom-up conceptual and computational approach that solves many problems of understanding and implementing non-linear, empirical calibration modeling for quantification of analytes and process modeling. The methodology is first applied to the optical measurement of biomass concentrations in a high-throughput cultivation system, then to the quantification of glucose by an automated enzymatic assay. We implemented the conceptual framework in two Python packages, calibr8 and murefi , with which we demonstrate how to make uncertainty quantification for various calibration tasks more accessible. Our software packages enable more reproducible and automatable data analysis routines compared to commonly observed workflows in life sciences. Subsequently, we combine the previously established calibration models with a hierarchical Monod-like ordinary differential equation model of microbial growth to describe multiple replicates of Corynebacterium glutamicum batch cultures. Key process model parameters are learned by both maximum likelihood estimation and Bayesian inference, highlighting the flexibility of the statistical and computational framework.
科研通智能强力驱动
Strongly Powered by AbleSci AI