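Chemical space
Drug discovery
Virtual screening
Chemical database
Computer science
Combinatorial chemistry
Chemistry
Computational biology
Database
Biology
Biochemistry
Organic chemistry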
Authors
Xavier Lucas,Björn Grüning,Stefan Bleher,Stefan Günther
Identifier
DOI:10.1021/acs.jcim.5b00116
Abstract
Screening a reduced yet diverse and synthesizable region of chemical space is a critical step in drug discovery. The ZINC database is now routinely used to freely access and screen millions of commercially available compounds. We collected ∼125 million compounds from chemical catalogs and the ZINC database, yielding more than 68 million unique molecules, including a large portion of described natural products (NPs) and drugs. The data set was filtered using advanced medicinal chemistry rules to remove potentially toxic, promiscuous, metabolically labile, or reactive compounds. We studied the physicochemical properties of this compilation and identified millions of NP-like, fragment-like, drug-like, and i-PPIs-like (resembling inhibitors of protein-protein interactions) compounds. The related focused libraries were subjected to a detailed scaffold diversity analysis and compared to reference NPs and marketed drugs. This study revealed thousands of diverse chemotypes with distinct representations of building block combinations among the data sets. An analysis of the stereogenic and shape complexity properties of the libraries also showed that they present well-defined levels of complexity, following the trend: i-PPIs-like < drug-like < fragment-like < NP-like. As the collected compounds are of great interest for drug discovery, particularly virtual screening and library design, we offer a freely available collection comprising over 37 million molecules at http://pbox.pharmaceutical-bioinformatics.org, together with the filtering rules used to build the focused libraries described herein.
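The abstract outlines a pipeline of merging catalogs, removing duplicates, and sorting compounds into focused libraries by physicochemical properties. The sketch below illustrates that general idea only. It does not reproduce the authors' rules, which the abstract does not state: the thresholds are the widely used Lipinski rule of five (drug-like) and the rule of three (fragment-like), swapped in as stand-ins. The `Compound` record, its `key` field (imagined as a precomputed canonical identifier such as an InChIKey), and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """Hypothetical record with precomputed properties (not from the paper)."""
    key: str          # assumed canonical identifier, e.g. a precomputed InChIKey
    mol_weight: float  # molecular weight in Da
    logp: float        # calculated octanol-water partition coefficient
    h_donors: int      # hydrogen-bond donors
    h_acceptors: int   # hydrogen-bond acceptors


def deduplicate(compounds):
    """Keep the first occurrence of each canonical key, preserving order."""
    seen = set()
    unique = []
    for c in compounds:
        if c.key not in seen:
            seen.add(c.key)
            unique.append(c)
    return unique


def is_drug_like(c):
    """Lipinski rule of five, used here as a stand-in for the paper's rules."""
    return (c.mol_weight <= 500 and c.logp <= 5
            and c.h_donors <= 5 and c.h_acceptors <= 10)


def is_fragment_like(c):
    """Rule of three, used here as a stand-in for the paper's rules."""
    return (c.mol_weight <= 300 and c.logp <= 3
            and c.h_donors <= 3 and c.h_acceptors <= 3)


def build_focused_libraries(compounds):
    """Deduplicate, then assign each unique compound to every library it fits."""
    unique = deduplicate(compounds)
    return {
        "unique": unique,
        "drug_like": [c for c in unique if is_drug_like(c)],
        "fragment_like": [c for c in unique if is_fragment_like(c)],
    }
```

A compound may appear in more than one library under these stand-in rules (every fragment-like compound also satisfies the rule of five). Whether the focused libraries in the paper overlap in the same way is not stated in the abstract.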