差异项目功能
震级(天文学)
心理学
项目反应理论
统计
差速器(机械装置)
心理测量学
数学
工程类
物理
天文
航空航天工程
作者
Shan Huang,Hidetoki Ishii
标识
DOI:10.1177/00131644241279882
摘要
Despite numerous studies on the magnitude of differential item functioning (DIF), different DIF detection methods often define effect sizes inconsistently and fail to adequately account for testing conditions. To address these limitations, this study introduces the unified M-DIF model, which defines the magnitude of DIF as the difference in item difficulty parameters between reference and focal groups. The M-DIF model can incorporate various DIF detection methods and test conditions to form a quantitative model. The pretrained approach was employed to leverage a sufficiently representative large sample as the training set and ensure the model’s generalizability. Once the pretrained model is constructed, it can be directly applied to new data. Specifically, a training dataset comprising 144 combinations of test conditions and 144,000 potential DIF items, each equipped with 29 statistical metrics, was used. We adopt the XGBoost method for modeling. Results show that, based on root mean square error (RMSE) and BIAS metrics, the M-DIF model outperforms the baseline model in both validation sets: under consistent and inconsistent test conditions. Across all 360 combinations of test conditions (144 consistent and 216 inconsistent with the training set), the M-DIF model demonstrates lower RMSE in 357 cases (99.2%), illustrating its robustness. Finally, we provided an empirical example to showcase the practical feasibility of implementing the M-DIF model.
科研通智能强力驱动
Strongly Powered by AbleSci AI