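Overfitting
Bias (statistics)
Generalizability theory
Computer science
Construct (Python library)
Empirical research
Econometrics
Machine learning
Artificial intelligence
Psychology
Statistics
Mathematics
Artificial neural network
Programming language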
Authors
Nicholas P. Danks, Soumya Ray, Galit Shmueli
Source
Journal: Management Science
[Institute for Operations Research and the Management Sciences]
Date: 2023-02-22
Volume (Issue): 70 (1): 647-669
Citations: 3
Identifier
DOI: 10.1287/mnsc.2023.4705
Abstract
Construct-based models have become a mainstay of management and information systems research. However, these models are likely overfit to the data samples upon which they are estimated, making them risky to use in explanatory, prescriptive, or predictive ways outside a given sample. Empirical researchers currently lack tools to analyze why and how their models may not generalize out of sample. We propose a composite overfit analysis (COA) framework that applies predictive tools to describe the sources and ramifications of overfit in terms of the focal concepts important to empirical researchers: cases, constructs, and causal paths. The COA framework begins by using a leave-one-out cross-validation procedure to identify cases with unusually high predictive error given their in-sample fit, a difference we describe as predictive deviance. The framework then employs a novel deviance tree method to group deviant cases that have similar predictive deviance and for similar theoretical reasons. We then employ a leave-deviant-group-out method, which sequentially analyzes how each deviant group affects model parameters, thereby identifying potentially unstable paths in the model. We can then infer descriptive reasons for why and how overfit affects a given model and data sample using the grouping criteria of the deviance tree, construct scores of deviant groups, and resulting unstable paths. These insights allow researchers to identify unexpected behavior that could define boundary conditions of their theory or point to new theoretical phenomena. We demonstrate the practical utility of our analytical framework on a technology adoption model in a new context.
This paper was accepted by Dongjun Wu, information systems.
Funding: This work was partially supported by the Ministry of Science and Technology, Taiwan [Grants 109-2811-H-007-503 and 108-2410-H-007-091-MY3] and by the 2021 Arts and Social Sciences Benefaction Fund of Trinity College Dublin, Ireland.
Supplemental Material: The data files and online appendix are available at https://doi.org/10.1287/mnsc.2023.4705.
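The first step described in the abstract, leave-one-out cross-validation used to find cases whose out-of-sample error is unusually high relative to their in-sample fit, can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in an ordinary least squares regression for the construct-based model, and it assumes predictive deviance is measured as the per-case difference between leave-one-out squared error and in-sample squared error. The function names (`fit_ols`, `predict`, `predictive_deviance`) and the simulated data are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Return OLS coefficients (intercept first) via least squares."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    """Predict outcomes for the rows of X using coefficients beta."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ beta

def predictive_deviance(X, y):
    """Per-case leave-one-out squared error minus in-sample squared error.

    Assumption: this difference stands in for 'predictive deviance';
    the paper may define or scale it differently.
    """
    n = len(y)
    in_sample_err = (y - predict(fit_ols(X, y), X)) ** 2
    loo_err = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i              # drop case i from the training data
        beta_i = fit_ols(X[keep], y[keep])    # refit without case i
        loo_err[i] = (y[i] - predict(beta_i, X[i:i + 1])[0]) ** 2
    return loo_err - in_sample_err

# Simulated data (hypothetical): 40 cases, two predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
X[0] = [3.0, 3.0]                             # case 0 sits far from the other cases
y = 1.0 + X @ np.array([0.5, -0.3]) + rng.normal(scale=0.2, size=40)
y[0] += 3.0                                   # and its outcome departs from the model

deviance = predictive_deviance(X, y)
most_deviant = int(np.argmax(deviance))       # index of the case with largest deviance
```

In this sketch the anomalous case is the one that the full-sample fit bends toward, so its in-sample error understates how badly the model predicts it once it is held out. Grouping such cases with a deviance tree and re-estimating the model without each group are later steps of the framework and are not shown here.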