Workflow
Computer science
Data curation
Data science
Artificial intelligence
Database
Source
Journal: Methods in Molecular Biology
Date: 2024-09-23
Pages: 115-130
Identifier
DOI: 10.1007/978-1-0716-4003-6_5
Abstract
Recent advances in machine learning and the new availability of large chemical datasets have made the development of tools and protocols for computational chemistry a topic of high interest. In this chapter, a standard procedure for developing Quantitative Structure-Activity Relationship (QSAR) models is presented and implemented in two freely available, easy-to-use workflows. The first workflow helps the user retrieve chemical data (SMILES) from the web, check their correctness, and curate them to produce consistent, ready-to-use datasets for cheminformatics. The second workflow implements six machine learning methods to develop classification QSAR models, which can additionally be used to predict external chemicals. Calculation and selection of chemical descriptors, tuning of model hyperparameters, and methods for handling data imbalance are also incorporated in the workflow. Both workflows are implemented in KNIME and represent a useful tool for computational scientists, as well as an intuitive and straightforward introduction to QSAR.