Terminology
Confusion
Computer science
Chemistry
Acid dissociation constant
Data science
Data mining
Drug discovery
Philosophy
Linguistics
Psychology
Biochemistry
Physical chemistry
Aqueous solution
Psychoanalysis
Authors
Jonathan W. Zheng, Ivo Leito, William H. Green
Identifier
DOI:10.1021/acs.jcim.4c01420
Abstract
The acid dissociation constant (pKa), which quantifies the propensity for a solute to donate a proton to its solvent, is crucial for drug design and synthesis, environmental fate studies, chemical manufacturing, and many other fields. Unfortunately, the terminology used for describing acid–base phenomena is sometimes inconsistent, causing large potential for misinterpretation. In this work, we examine a systematic confusion underlying the definition of "acidic" and "basic" pKa values for zwitterionic compounds. Due to this confusion, some pKa data are misrepresented in data repositories, including the widely used and highly trusted ChEMBL database. Such datasets are frequently used to supply training data for pKa prediction models, and hence, confusion and errors in the data make the model performance worse. Herein, we discuss the intricacies of this issue. We make suggestions for describing acid–base phenomena, training pKa prediction models, and stewarding pKa datasets, given the high potential for confusion and potentially high impact in downstream applications.