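Cheminformatics
Bioinformatics
Random forest
Molecular descriptor
Chemistry
Blood–brain barrier
Chromatography
Quadrupole time-of-flight
Mass spectrometry
Machine learning
Central nervous system
Quantitative structure–activity relationship
Computer science
Medicine
Biochemistry
Tandem mass spectrometry
Stereochemistry
Computational chemistry
Endocrinology
Gene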
Authors
Raihana Edros, T.W. Feng, Ruihai Dong
Identifier
DOI:10.1080/1062936x.2023.2230868
Abstract
Current in silico modelling techniques, such as molecular dynamics, typically focus on compounds with the highest concentration from chromatographic analyses for bioactivity screening. Consequently, they reduce the need for labour-intensive in vitro studies but limit the utilization of extensive chromatographic data and molecular diversity for compound classification. Compound permeability across the blood–brain barrier (BBB) is a key concern in central nervous system (CNS) drug development, and this limitation can be addressed by applying cheminformatics with codeless machine learning (ML). Of the four models developed in this study, the Random Forest (RF) algorithm showed the most robust performance in both internal and external validation and was selected for model construction, with accuracies (ACC) of 87.5% and 86.9% and areas under the curve (AUC) of 0.907 and 0.726, respectively. The RF model was deployed to classify 285 compounds detected using liquid chromatography quadrupole time-of-flight mass spectrometry (LCQTOF-MS) in Kelulut honey; of these, 140 compounds were screened with 94 descriptors. Seventeen compounds were predicted to permeate the BBB, revealing their potential as drugs for treating neurodegenerative diseases. Our results highlight the importance of employing ML pattern recognition to identify compounds with neuroprotective potential from the entire pool of chromatographic data.
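The abstract reports two metrics for the classifier, ACC and AUC. As a minimal sketch of how these two quantities are commonly computed for a binary permeable/non-permeable classifier, the Python below uses only the standard library. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration; they are not taken from the study, which used a codeless ML tool, and they do not reproduce its data or results.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive receives a
    higher score than a randomly chosen negative (ties count as 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = predicted to permeate the BBB, 0 = not.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.8]  # illustrative class-1 probabilities

# Assumed decision threshold of 0.5 for turning scores into class labels.
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

print(f"ACC = {accuracy(toy_labels, toy_preds):.3f}")
print(f"AUC = {roc_auc(toy_labels, toy_scores):.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC is threshold-free and measures how well the scores rank positives above negatives, so the two metrics can move independently of each other between validation sets.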