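Generalizability theory
Rubric
Writing assessment
Reliability (semiconductor)
Inter-rater reliability
Psychology
Task (project management)
Peer assessment
Computer science
Applied psychology
Mathematics education
Engineering
Developmental psychology
Quantum mechanics
Rating scale
Physics
Power (physics)
Systems engineering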
Authors
Jinyan Huang,Danni Zhu,Duquan Xie,Tiantian Shu
Identifier
DOI:10.1016/j.asw.2023.100693
Abstract
Using generalizability (G-) theory and rater think-aloud protocols (TAPs) as research methods, this study examined the effects of person, task, rater, and the interactions among these facets on the variability and reliability of the HSK-6 (i.e., an international standardized Chinese proficiency assessment) writing scores assigned by national HSK writing raters, as well as the raters' scoring decision-making processes. Sixty-four HSK-6 writing samples written by 32 CFL (Chinese as a foreign language) learners from 17 L1 (first language) backgrounds were scored holistically by ten experienced HSK writing raters using the authentic HSK-6 scoring rubric. The raters were then invited to produce a written retrospective TAP of their scoring decision-making processes immediately after they had completed scoring each HSK-6 writing sample, which resulted in 64 protocols per rater. A total of 640 protocols were included in the qualitative data analysis. The G-theory results indicated that the current single-task and two-rater holistic scoring scheme would be unable to yield acceptable generalizability and dependability coefficients. The rater TAP results also revealed considerable variation among raters in their scoring decision-making processes. Important implications for HSK-6 writing assessment policy makers in China are discussed.
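To make the terms "generalizability coefficient" and "dependability coefficient" concrete, the following is a minimal sketch of a decision (D-) study calculation. It assumes a fully crossed, random-effects person x task x rater design, which the abstract does not state explicitly. The class `VarianceComponents`, the function `d_study`, and every numeric value are hypothetical placeholders for illustration only; they are not the study's estimates or its software.

```python
from dataclasses import dataclass


@dataclass
class VarianceComponents:
    """Variance components for an assumed crossed p x t x r random design."""
    p: float      # persons (the object of measurement)
    t: float      # tasks
    r: float      # raters
    pt: float     # person-by-task interaction
    pr: float     # person-by-rater interaction
    tr: float     # task-by-rater interaction
    ptr_e: float  # three-way interaction confounded with residual error


def d_study(vc: VarianceComponents, n_tasks: int, n_raters: int):
    """Return (generalizability, dependability) for a given scoring scheme."""
    n_tr = n_tasks * n_raters
    # Relative error: only interactions involving persons affect rank ordering.
    rel_error = vc.pt / n_tasks + vc.pr / n_raters + vc.ptr_e / n_tr
    # Absolute error: adds the main effects of tasks and raters and their interaction.
    abs_error = rel_error + vc.t / n_tasks + vc.r / n_raters + vc.tr / n_tr
    g_coef = vc.p / (vc.p + rel_error)
    phi_coef = vc.p / (vc.p + abs_error)
    return g_coef, phi_coef


# Hypothetical variance components (NOT taken from the study).
hypothetical = VarianceComponents(
    p=1.0, t=0.1, r=0.2, pt=0.5, pr=0.3, tr=0.05, ptr_e=0.6
)

# Single task scored by two raters, versus two tasks scored by four raters.
g_1t_2r, phi_1t_2r = d_study(hypothetical, n_tasks=1, n_raters=2)
g_2t_4r, phi_2t_4r = d_study(hypothetical, n_tasks=2, n_raters=4)
```

Under these formulas the dependability coefficient can never exceed the generalizability coefficient, and adding tasks or raters shrinks both error terms, which is why a D-study is typically used to compare alternative scoring schemes.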