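Chemical space
Generative grammar
Benchmark (surveying)
Metric (unit)
Computer science
Artificial intelligence
Generalization
Generative model
Machine learning
Deep learning
Set (abstract data type)
Space (punctuation)
Mathematics
Drug discovery
Bioinformatics
Biology
Engineering
Mathematical analysis
Operating system
Operations management
Programming language
Geography
Geodesy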
Authors
Jie Zhang, Rocío Mercado, Ola Engkvist, Hongming Chen
Identifier
DOI:10.1021/acs.jcim.0c01328
Abstract
In recent years, deep molecular generative models have emerged as promising methods for de novo molecular design. Thanks to the rapid advance of deep learning techniques, deep learning architectures such as recurrent neural networks, variational autoencoders, and adversarial networks have been successfully employed for constructing generative models. Recently, quite a few metrics have been proposed to evaluate these deep generative models. However, many of these metrics cannot evaluate the chemical space coverage of sampled molecules. This work presents a novel and complementary metric for evaluating deep molecular generative models. The metric is based on the chemical space coverage of a reference dataset—GDB-13. The performance of seven different molecular generative models was compared by calculating what fraction of the structures, ring systems, and functional groups could be reproduced from the largely unseen reference set when using only a small fraction of GDB-13 for training. The results show that the performance of the generative models studied varies significantly using the benchmark metrics introduced herein, such that the generalization capabilities of the generative models can be clearly differentiated. In addition, the coverages of GDB-13 ring systems and functional groups were compared between the models. Our study provides a useful new metric that can be used for evaluating and comparing generative models.
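The coverage idea in the abstract can be illustrated with a small set computation. The sketch below is not the authors' implementation: the function name `coverage`, the choice to exclude training molecules from the denominator, and the toy SMILES strings are all assumptions made for illustration. It also assumes every molecule is already represented as a canonical string, so that identical structures compare equal.

```python
def coverage(sampled, reference, training=frozenset()):
    """Fraction of held-out reference items that occur among the sampled items.

    sampled   -- iterable of canonical strings produced by a generative model
    reference -- iterable of canonical strings in the reference set
    training  -- reference items used for training (assumption: these are
                 excluded so that only unseen items count toward coverage)
    """
    held_out = set(reference) - set(training)
    if not held_out:
        raise ValueError("no held-out reference items to cover")
    # Duplicates and out-of-reference samples do not increase coverage.
    reproduced = set(sampled) & held_out
    return len(reproduced) / len(held_out)


# Toy data, purely illustrative (not taken from GDB-13).
reference = {"CCO", "CCN", "CCC", "C1CC1"}
training = {"CCO"}
sampled = ["CCO", "CCN", "CCN", "CCCl"]

overall = coverage(sampled, reference)              # 2 of 4 reference items
unseen = coverage(sampled, reference, training)     # 1 of 3 held-out items
```

The same function could be applied to sets of ring systems or functional groups instead of whole structures, provided each is reduced to a canonical string first.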