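Molar absorptivity
Molecule
Phototoxicity
Absorption (acoustics)
Spectral line
Biological system
Training set
Data set
Test set
Molecular descriptor
Absorption spectroscopy
Organic molecules
Set (abstract data type)
Artificial intelligence
Pattern recognition (psychology)
Chemistry
Computer science
Quantitative structure–activity relationship
Machine learning
Physics
Organic chemistry
Optics
Biology
Biochemistry
Programming language
In vitro
Astronomy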
Authors
Rafael Mamede, Florbela Pereira, João Aires‐de‐Sousa
Identifier
DOI:10.1038/s41598-021-03070-9
Abstract
Machine learning (ML) algorithms were explored for the classification of the UV–Vis absorption spectrum of organic molecules based on molecular descriptors and fingerprints generated from 2D chemical structures. Training and test data (~75k molecules and associated UV–Vis data) were assembled from a database with lists of experimental absorption maxima. Molecules were labeled as positive (related to photoreactive potential) if an absorption maximum is reported in the range between 290 and 700 nm (UV/Vis) with a molar extinction coefficient (MEC) above 1000 L mol⁻¹ cm⁻¹, and as negative if no such peak is in the list. Random forests were selected from among several algorithms. The models were validated with two external test sets comprising 998 organic molecules, obtaining a global accuracy of up to 0.89, a sensitivity of 0.90, and a specificity of 0.88. The ML output (UV–Vis spectrum class) was explored as a predictor of the 3T3 NRU phototoxicity in vitro assay for a set of 43 molecules. Comparable results were observed with the classification directly based on experimental UV–Vis data in the same format.
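The labeling rule described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function name `label_spectrum`, the representation of a spectrum as a list of `(wavelength_nm, mec)` pairs, and the inclusive treatment of the 290 and 700 nm boundaries are all hypothetical choices; only the numeric thresholds come from the abstract.

```python
from typing import Iterable, Tuple

# Thresholds stated in the abstract.
WAVELENGTH_MIN_NM = 290.0
WAVELENGTH_MAX_NM = 700.0
MEC_THRESHOLD = 1000.0  # L mol^-1 cm^-1

def label_spectrum(peaks: Iterable[Tuple[float, float]]) -> int:
    """Return 1 (positive class) if any reported absorption maximum lies in
    290-700 nm with MEC above 1000 L mol^-1 cm^-1, otherwise 0 (negative).

    `peaks` is a hypothetical input format: (wavelength_nm, mec) pairs.
    Treating the wavelength bounds as inclusive is an assumption; the
    abstract only says "between 290 and 700 nm".
    """
    for wavelength_nm, mec in peaks:
        in_range = WAVELENGTH_MIN_NM <= wavelength_nm <= WAVELENGTH_MAX_NM
        if in_range and mec > MEC_THRESHOLD:
            return 1
    return 0
```

For example, a molecule whose only listed maximum is at 254 nm would be labeled negative regardless of its MEC, while adding a maximum at 350 nm with an MEC of 1500 L mol⁻¹ cm⁻¹ would make it positive.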