Bayesian probability
Model selection
Bayesian inference
Computer science
Mathematics
Algorithm
Artificial intelligence
Machine learning
Statistics
Identifier
DOI:10.1016/j.jmp.2020.102474
Abstract
Comparison of competing statistical models is an essential part of psychological research. From a Bayesian perspective, various approaches to model comparison and selection have been proposed in the literature. However, the applicability of these approaches depends on the assumptions about the model space M. Also, traditional methods like leave-one-out cross-validation (LOO-CV) estimate the expected log predictive density (ELPD) of a model to investigate how the model generalises out-of-sample, but they quickly become computationally inefficient when the sample size grows large. Here, a tutorial on Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO-CV), a computationally more efficient alternative, is provided. It is shown how Bayesian model selection can be scaled efficiently for big data via PSIS-LOO-CV in combination with approximate posterior inference and probability-proportional-to-size subsampling. First, several model views and the Bayesian model comparison methods available in each are discussed. The Bayesian logistic regression model is then used as a running example to show how to apply the method in practice, and to demonstrate that it provides ELPD estimates similar in accuracy to those of exact LOO-CV or information criteria. Subsequently, the power and exponential law models relating reaction times to practice are used to demonstrate the approach with more complex models. Guidance is provided on how to compare competing models based on the ELPD estimates and how to conduct posterior predictive checks to safeguard against overconfidence in one of the models under consideration. The intended audience is researchers who practice mathematical modelling and comparison, possibly with large datasets, and who are well acquainted with Bayesian statistics.
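As a concrete illustration of the workflow the abstract describes, the following is a minimal sketch of PSIS-LOO-CV model comparison for the Bayesian logistic regression running example. It assumes PyMC and ArviZ as the software stack (the abstract does not prescribe one; model names, priors, and simulated data here are illustrative, not the paper's): two candidate models are fit, `az.loo` computes the PSIS-LOO ELPD estimate with Pareto-k diagnostics, and `az.compare` ranks the models by ELPD.

```python
# Minimal PSIS-LOO-CV sketch, assuming PyMC and ArviZ (illustrative only;
# not the paper's own code or dataset).
import numpy as np
import pymc as pm
import arviz as az

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
# Simulated binary outcomes from a logistic model with intercept 0.5, slope 1.2
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 + 1.2 * x))))

def fit_logistic(include_slope):
    """Fit a Bayesian logistic regression and store pointwise log-likelihoods,
    which PSIS-LOO-CV needs to form the importance-sampling weights."""
    with pm.Model():
        alpha = pm.Normal("alpha", 0.0, 2.5)
        eta = alpha
        if include_slope:
            beta = pm.Normal("beta", 0.0, 2.5)
            eta = alpha + beta * x
        pm.Bernoulli("y", logit_p=eta, observed=y)
        return pm.sample(1000, tune=1000, chains=4, random_seed=1,
                         idata_kwargs={"log_likelihood": True},
                         progressbar=False)

idata_full = fit_logistic(include_slope=True)
idata_null = fit_logistic(include_slope=False)

# az.loo returns the PSIS-LOO ELPD estimate plus Pareto-k diagnostics;
# large k values (> 0.7) flag observations whose importance weights are
# unreliable, so the smoothed estimate should not be trusted there.
print(az.loo(idata_full))

# az.compare ranks models by ELPD and reports the standard error of the
# pairwise ELPD difference, which is the basis for model selection.
print(az.compare({"slope": idata_full, "intercept_only": idata_null}))
```

For the probability-proportional-to-size subsampling variant mentioned in the abstract, the R `loo` package provides `loo_subsample()`, which estimates the ELPD from a subset of observations rather than all n pointwise terms.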