Estimator
Model selection
Smoothing
Mathematics
Generalized linear model
Linear model
Covariate
Cross-validation
Generalized additive model
Additive model
Nonparametric statistics
Mathematical optimization
Metric (data warehouse)
Asymptotically optimal algorithm
Applied mathematics
Computer science
Statistics
Data mining
Authors
Ze Chen, Jun Liao, Wangli Xu, Yuhong Yang
Identifier
DOI:10.1080/10618600.2023.2174127
Abstract
Generalized Additive Partial Linear Models (GAPLMs) are appealing for model interpretation and prediction. However, for GAPLMs, the covariates and the degree of smoothing in the nonparametric parts are often difficult to determine in practice. To address this model selection uncertainty issue, we develop a computationally feasible Model Averaging (MA) procedure. The model weights are data-driven and selected based on multifold Cross-Validation (CV) (instead of leave-one-out) for computational saving. When all the candidate models are misspecified, we show that the proposed MA estimator for GAPLMs is asymptotically optimal in the sense of achieving the lowest possible Kullback-Leibler loss. In the other scenario, where the candidate model set contains at least one quasi-correct model, the weights chosen by the multifold CV are asymptotically concentrated on the quasi-correct models. As a by-product, we propose a variable importance measure based on the MA weights to quantify the importance of the predictors in GAPLMs; this measure is shown to asymptotically identify the variables in the true model. Moreover, when the number of candidate models is very large, a model screening method is provided. Numerical experiments show the superiority of the proposed MA method over some existing model averaging and selection methods. Supplementary materials for this article are available online.
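The weight-selection step described in the abstract can be illustrated with a small sketch: fit each candidate model on the training folds, collect its held-out predictions, and choose simplex weights that minimize the cross-validated loss of the weighted prediction. The Python sketch below is only an illustration of that idea, not the authors' procedure: the candidate models are plain linear regressions on nested covariate subsets (a hypothetical choice standing in for GAPLM fits), the CV criterion is squared error rather than the Kullback-Leibler loss used in the paper, and the weight-based variable importance at the end is a simplified analogue of the measure described above.

# Illustrative sketch of multifold-CV model averaging (not the paper's exact method).
import numpy as np
from scipy.optimize import minimize
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
y = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.5, size=n)

# Candidate models: nested covariate subsets (hypothetical stand-ins for GAPLM candidates).
candidates = [list(range(k + 1)) for k in range(p)]
J = len(candidates)

# Held-out predictions from each candidate model across 5 folds.
cv_pred = np.zeros((n, J))
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    for j, cols in enumerate(candidates):
        fit = LinearRegression().fit(X[np.ix_(train, cols)], y[train])
        cv_pred[test, j] = fit.predict(X[np.ix_(test, cols)])

# Choose weights on the simplex minimizing the CV squared-error loss
# (the paper uses a Kullback-Leibler criterion instead).
def cv_loss(w):
    return np.mean((y - cv_pred @ w) ** 2)

w0 = np.full(J, 1.0 / J)
res = minimize(cv_loss, w0, method="SLSQP",
               bounds=[(0.0, 1.0)] * J,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})
weights = res.x
print("MA weights:", np.round(weights, 3))

# Simplified weight-based variable importance: total weight of candidate
# models that include each covariate.
importance = np.array([sum(weights[j] for j, cols in enumerate(candidates) if v in cols)
                       for v in range(p)])
print("variable importance:", np.round(importance, 3))

In this toy setting the weights tend to concentrate on the smaller candidates that already contain the informative covariates, which is the behavior the paper establishes asymptotically for quasi-correct models.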