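Feature selection
Computer science
Critical quality attributes
Feature (linguistics)
Ranking (information retrieval)
Quality (philosophy)
Biochemical engineering
Data mining
Artificial intelligence
Machine learning
Chemistry
Engineering
Philosophy
Physical chemistry
Epistemology
Particle size
Linguistics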
Authors
Neelesh Gangwar,Keerthiveena Balraj,Anurag S. Rathore
Identifier
DOI:10.1007/s00253-024-13147-w
Abstract
Cell culture media play a critical role in cell growth and propagation by providing a substrate; media components can also modulate the critical quality attributes (CQAs). However, the inherent complexity of the cell culture media makes unraveling the impact of the various media components on cell growth and CQAs non-trivial. In this study, we demonstrate an end-to-end machine learning framework for media component selection and prediction of CQAs. The preliminary dataset for feature selection was generated by performing CHO-GS (-/-) cell culture in media formulations with varying metal ion concentrations. Acidic and basic charge variant composition of the innovator product (24.97 ± 0.54% acidic and 11.41 ± 1.44% basic) was chosen as the target variable to evaluate the media formulations. Pearson’s correlation coefficient and random forest-based techniques were used for feature ranking and feature selection for the prediction of acidic and basic charge variants. Furthermore, a global interpretation analysis using SHapley Additive exPlanations was utilized to select optimal features by evaluating the contributions of each feature in the extracted vectors. Finally, the medium combinations were predicted by employing fifteen different regression models and utilizing a grid search and random search cross-validation for hyperparameter optimization. Experimental results demonstrate that Fe and Zn significantly impact the charge variant profile. This study aims to offer insights that are pertinent to both innovators seeking to establish a complete pipeline for media development and optimization and biosimilar-based manufacturers who strive to demonstrate the analytical and functional biosimilarity of their products to the innovator.

Key points
• Developed a framework for optimizing media components and prediction of CQA.
• SHAP enhances global interpretability, aiding informed decision-making.
• Fifteen regression models were employed to predict medium combinations.
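The Pearson-based feature ranking mentioned in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the four ion names, the three concentration levels, the full-factorial layout, and the linear response are hypothetical and are not taken from the study's dataset. The response is constructed so that Fe and Zn dominate, which means the ranking only shows how the technique behaves and does not reproduce the paper's results. The random forest, SHAP, and hyperparameter search steps are not shown.

```python
import itertools
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, target):
    """Return (name, r) pairs sorted by |r|, strongest first."""
    scores = [(name, pearson(values, target)) for name, values in features.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical design: every combination of three normalized levels
# for four metal ions (3**4 = 81 simulated runs). Not real data.
ions = ("Fe", "Zn", "Cu", "Mn")
levels = (0.0, 0.5, 1.0)
runs = list(itertools.product(levels, repeat=len(ions)))
features = {ion: [run[i] for run in runs] for i, ion in enumerate(ions)}

# Hypothetical response (acidic variant, %): depends on Fe and Zn
# by construction, so those two should rank first.
acidic_pct = [20.0 + 6.0 * fe - 4.0 * zn
              for fe, zn in zip(features["Fe"], features["Zn"])]

ranking = rank_features(features, acidic_pct)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

Ranking by the absolute value of r treats strong negative and strong positive associations alike. Pearson correlation only captures linear, single-feature effects, which is one reason a study might pair it with tree-based importances and SHAP values as the abstract describes.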