Niaz Bahar Chowdhury, Mark Kathol, Nabia Shahreen, Rajib Saha
Identifier
DOI:10.1101/2025.02.21.639544
Abstract
Rhodopseudomonas palustris, a versatile bacterium with diverse biotechnological applications, can effectively break down lignin, a complex and abundant polymer in plant biomass. This study investigates the metabolic response of R. palustris when catabolizing various lignin breakdown products (LBPs), including the monolignols p-coumaryl alcohol, coniferyl alcohol, and sinapyl alcohol, as well as p-coumarate, sodium ferulate, and kraft lignin. Transcriptomics and proteomics data were generated for each of these LBP growth conditions and used as features to train machine learning models, with growth rates as the target. Three models, namely Artificial Neural Networks (ANN), Random Forest (RF), and Support Vector Machine (SVM), were compared, with ANN achieving the highest predictive accuracy for both the transcriptomics (94%) and proteomics (96%) datasets. Permutation feature importance analysis of the ANN models identified the top twenty genes and proteins influencing growth rates. Combining results from both transcriptomics and proteomics, eight key transport proteins were found to significantly influence the growth of R. palustris on LBPs. Re-training the ANN using only these eight transport proteins achieved predictive accuracies of 86% and 76% for proteomics and transcriptomics, respectively. This work highlights the potential of ANN-based models to predict growth-associated genes and proteins, shedding light on the metabolic behavior of R. palustris in lignin degradation under aerobic and anaerobic conditions.
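The workflow summarized above (train a neural network on omics features, then rank features by permutation importance) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, growth-rate prediction is treated as a regression task, and scikit-learn's `MLPRegressor` stands in for whatever ANN architecture and framework the authors actually used.

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for an expression matrix (samples x genes/proteins).
# Not the study's data: only features 0, 1 and 2 drive the "growth rate".
rng = np.random.default_rng(0)
n_samples, n_features = 300, 20
X = rng.normal(size=(n_samples, n_features))
y = 2.0 * X[:, 0] - 2.0 * X[:, 1] + 2.0 * X[:, 2] + 0.1 * rng.normal(size=n_samples)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Small feed-forward network; the layer size is an arbitrary illustrative choice.
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)

# Permutation importance: shuffle one feature at a time on held-out data
# and record how much the model's score (R^2 here) drops.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)

# Rank features by mean importance, highest first, and keep the top k.
top_k = 3
top_features = np.argsort(result.importances_mean)[::-1][:top_k]
print("Top features by permutation importance:", top_features.tolist())
```

In a setting like the one described, the columns of `X` would be transcript or protein abundances and `top_k` would be twenty; re-training on only a selected subset of columns corresponds to fitting the same pipeline on `X[:, selected]`, where `selected` is a hypothetical index list.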