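Covariate
Logistic regression
Decision tree
Econometrics
Selection (genetic algorithm)
Model selection
Statistics
Regression
Logistic model tree
Regression analysis
Computer science
Mathematics
Machine learning
Authors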
Vincent C. Arena, Nancy B. Sussman, Sati Mazumdar, Shuang Yu, Orest T. Macina
Identifier
DOI:10.1080/1062936032000169633
Abstract
Structure–activity relationship (SAR) models can be used to predict the biological activity of potential developmental toxicants whose adverse effects include death, structural abnormalities, altered growth and functional deficiencies in the developing organism. Physico-chemical descriptors of spatial, electronic and lipophilic properties were used to derive SAR models by two modeling approaches, logistic regression and Classification and Regression Tree (CART), using a new developmental database of 293 chemicals (FDA/TERIS). Both single models and ensembles of models (termed bagging) were derived to predict toxicity. Assessment of the empirical distributions of the prediction measures was performed by repeated random partitioning of the data set. Results showed that both the decision tree and logistic regression derived developmental SAR models exhibited modest prediction accuracy. Bagging tended to enhance the prediction accuracy and reduced the variability of prediction measures compared to the single model for CART-based models but not consistently for logistic-based models. Prediction accuracy of single logistic-based models was higher than single CART-based models but bagged CART-based models were more predictive. Descriptor selection in SAR for the understanding of the developmental mechanism was highly dependent on the modeling approach. Although prediction accuracy was similar in the two modeling approaches, there was inconsistency in the model descriptors.
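The evaluation design described in the abstract (single model versus a bagged ensemble, compared over repeated random partitions of the data) can be illustrated with a minimal sketch. This is not the authors' code or data: the data set is synthetic, a one-split decision stump stands in for a full CART tree, logistic regression is omitted, and all function names and parameter values (`make_data`, `fit_stump`, `n_models=11`, a 70/30 split) are assumptions made for illustration only.

```python
import random
from statistics import mean, stdev


def make_data(n, rng):
    """Synthetic stand-in data (assumption): two numeric descriptors, noisy binary label."""
    rows = []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        label = 1 if x[0] + 0.5 * x[1] + rng.gauss(0, 1) > 0 else 0
        rows.append((x, label))
    return rows


def fit_stump(rows):
    """Fit a one-split tree: choose the feature, threshold and orientation
    with the fewest training errors (exhaustive search)."""
    best = None
    for j in range(len(rows[0][0])):
        for x, _ in rows:
            t = x[j]
            for left_label in (0, 1):
                errors = sum(
                    (left_label if xi[j] <= t else 1 - left_label) != yi
                    for xi, yi in rows
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, left_label)
    _, j, t, left_label = best
    return j, t, left_label


def predict_stump(stump, x):
    j, t, left_label = stump
    return left_label if x[j] <= t else 1 - left_label


def fit_bagged(rows, n_models, rng):
    """Bagging: fit one stump per bootstrap sample (drawn with replacement)."""
    return [fit_stump([rng.choice(rows) for _ in rows]) for _ in range(n_models)]


def predict_bagged(models, x):
    """Aggregate the ensemble by majority vote (ties go to class 0)."""
    votes = sum(predict_stump(m, x) for m in models)
    return 1 if 2 * votes > len(models) else 0


def accuracy(predict, rows):
    return sum(predict(x) == y for x, y in rows) / len(rows)


def repeated_partitions(rows, n_repeats, n_models, rng, test_fraction=0.3):
    """Repeat a random train/test split and record test accuracy for the
    single model and the bagged ensemble on each split."""
    single, bagged = [], []
    for _ in range(n_repeats):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * (1 - test_fraction))
        train, test = shuffled[:cut], shuffled[cut:]
        stump = fit_stump(train)
        models = fit_bagged(train, n_models, rng)
        single.append(accuracy(lambda x: predict_stump(stump, x), test))
        bagged.append(accuracy(lambda x: predict_bagged(models, x), test))
    return single, bagged


rng = random.Random(0)
data = make_data(120, rng)
single_acc, bagged_acc = repeated_partitions(data, n_repeats=10, n_models=11, rng=rng)
print(f"single: mean={mean(single_acc):.3f} sd={stdev(single_acc):.3f}")
print(f"bagged: mean={mean(bagged_acc):.3f} sd={stdev(bagged_acc):.3f}")
```

The two lists returned by `repeated_partitions` play the role of the empirical distributions of a prediction measure: their means and standard deviations allow a comparison of accuracy and variability between the single and bagged models. Results on this toy data say nothing about the paper's findings.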