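Synthon
Machine learning
Artificial intelligence
Alcohol dehydrogenase
Partial least squares regression
Support vector machine
Computer science
Substrate
Alcohol
Principal component analysis
Chemistry
Algorithm
Combinatorial chemistry
Stereochemistry
Biology
Biochemistry
Ecology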
Authors
Arindam Ghatak, Anirudh P. Shanbhag, Santanu Datta
Identifier
DOI:10.1016/j.bbrc.2023.149298
Abstract
Alcohol dehydrogenases (ADHs) are popular catalysts for synthesizing chiral synthons, a vital step in active pharmaceutical intermediate (API) production. They are grouped into three superfamilies: medium-chain dehydrogenases/reductases (MDRs), short-chain dehydrogenases/reductases (SDRs), and iron-containing alcohol dehydrogenases. The first two are used extensively for producing various chiral synthons. Many studies screen multiple enzymes or engineer a specific enzyme to catalyze the conversion of a substrate of interest. These processes are resource-intensive and intricate. The current study investigates whether machine learning algorithms can match different ADHs with their ideal substrates. We examine the catalysis of 284 antibacterial ketone intermediates by MDRs and SDRs and find a distinct pattern of activity. To facilitate machine learning, we curated a dataset comprising 33 features, encompassing 4 descriptors for each compound. Subsequently, an ensemble of machine learning techniques was applied: Partial Least Squares (PLS) regression, k-Nearest Neighbors (kNN) regression, and Support Vector Machine (SVM) regression. Moreover, incorporating Principal Component Analysis (PCA) improved precision and accuracy and helped delineate diverse compound classes. This classification is therefore useful for identifying substrates amenable to different alcohol dehydrogenases, reducing the reliance on high-throughput screening or enzyme engineering to find the optimal enzyme for a specific substrate.
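To make the regression step in the abstract more concrete, the following is a minimal sketch of one of the three named techniques, kNN regression, written with only the Python standard library. It is not the authors' code or dataset: the feature values and the activity target are synthetic, the function names (`fit_scaler`, `apply_scaler`, `knn_predict`), the train/test split, and the choice of `k` are all assumptions made for illustration. Only the dataset shape (284 compounds, 33 features) is taken from the abstract.

```python
import math
import random


def fit_scaler(rows):
    """Compute per-feature mean and standard deviation from training rows."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds


def apply_scaler(rows, means, stds):
    """Standardize rows using previously fitted means and standard deviations."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def knn_predict(train_x, train_y, query, k=3):
    """Predict a value as the mean target of the k nearest training points."""
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Synthetic stand-in data (assumption): same shape as the dataset in the abstract.
random.seed(0)
N_COMPOUNDS, N_FEATURES = 284, 33
features = [[random.gauss(0, 1) for _ in range(N_FEATURES)]
            for _ in range(N_COMPOUNDS)]
# Hypothetical activity that depends on two features plus noise.
activity = [row[0] - 0.5 * row[1] + random.gauss(0, 0.1) for row in features]

# Hold out the last 54 compounds; fit the scaler on the training part only.
train_raw, test_raw = features[:230], features[230:]
train_y, test_y = activity[:230], activity[230:]
means, stds = fit_scaler(train_raw)
train_x = apply_scaler(train_raw, means, stds)
test_x = apply_scaler(test_raw, means, stds)

preds = [knn_predict(train_x, train_y, q, k=5) for q in test_x]
rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, test_y)) / len(test_y))
print(f"kNN test RMSE on synthetic data: {rmse:.3f}")
```

Because kNN relies on Euclidean distance across all 33 features, uninformative features dilute the signal. That is one plausible reason a dimensionality-reduction step such as PCA could help, although the abstract does not state the mechanism.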