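Virtual screening
Computer science
Bioinformatics
Data mining
Applicability domain
Set (abstract data type)
Domain (mathematical analysis)
Dataset
Machine learning
Artificial intelligence
Computational biology
Drug discovery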
Quantitative structure-activity relationship
Mathematics
Chemistry
Biology
Gene
Mathematical analysis
Biochemistry
Programming language
Authors
Jiahao Xu, Zejun Huang, Hao Duan, Weihua Li, Jingyan Zhuang, Le Xiong, Yun Tang, Guixia Liu
Identifier
DOI:10.1002/cmdc.202400298
Abstract
Estrogen-related receptor α (ERRα) is considered a very promising target for treating metabolic diseases such as type 2 diabetes. A prediction model that quickly identifies potential ERRα agonists can significantly reduce the time spent on virtual screening. In this study, 298 ERRα agonists and numerous nonagonists were collected from various sources to build a new dataset of ERRα agonists. A total of 90 models were then built by combining different algorithms, molecular characterization methods, and data sampling techniques. A consensus model based on five selected baseline models showed optimal performance and was validated on the test set (AUC=0.876, BA=0.816) and an external validation set (AUC=0.867, BA=0.777). Furthermore, the model's applicability domain and privileged substructures were examined, and feature importance was analyzed using the SHAP method to help interpret the model. It is hoped that our publicly accessible data, models, code, and analytical techniques will prove valuable for quickly screening and rationally designing more novel and potent ERRα agonists.
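The abstract does not say how the five baseline models are combined or how the metrics are computed. As a minimal sketch only, the following assumes soft voting (averaging predicted probabilities), reads BA as balanced accuracy, and uses a 0.5 decision threshold. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean


def consensus_scores(model_probs):
    """Average per-compound probabilities across models (soft voting, an assumption)."""
    return [mean(column) for column in zip(*model_probs)]


def balanced_accuracy(y_true, y_score, threshold=0.5):
    """Mean of sensitivity and specificity at the given threshold."""
    tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
    tn = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s < threshold)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


def roc_auc(y_true, y_score):
    """AUC as the fraction of (active, inactive) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up): 1 = agonist, 0 = nonagonist; one row of probabilities per model.
y_true = [1, 1, 0, 0, 1, 0]
model_probs = [
    [0.9, 0.6, 0.3, 0.2, 0.3, 0.1],
    [0.8, 0.7, 0.4, 0.3, 0.4, 0.2],
    [0.7, 0.4, 0.5, 0.1, 0.2, 0.3],
]

scores = consensus_scores(model_probs)
print("AUC:", round(roc_auc(y_true, scores), 3))
print("BA:", round(balanced_accuracy(y_true, scores), 3))
```

Balanced accuracy is a natural companion to AUC when nonagonists greatly outnumber agonists, because it weights the two classes equally regardless of their sizes.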