Details

Comparative Judgment: Building a Shared Consensus Over Rater Variation in Assessing Second Language Writing Performance    

Document type: Journal article

English title: Comparative Judgment: Building a Shared Consensus Over Rater Variation in Assessing Second Language Writing Performance

Author: Wu, Qian[1]

Affiliation: [1]Shaoxing Univ, Sch Foreign Languages, Chengnan Ave 900, Shaoxing 312000, Zhejiang, Peoples R China

Year: 2025

Volume: 15

Issue: 2

Journal: SAGE OPEN

Indexed in: SSCI (accession number: WOS:001516532600001), WOS

Funding: The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the University research grant (No. 13011002002/114) and the project grant (No. 2023SK008) from Shaoxing University.

Language: English

Keywords: comparative judgment; rater variation; L2 writing assessment; pairwise comparisons; reliability; construct validity

Abstract: Rater variation has been a persistent concern for rater-mediated writing assessments. Instead of treating rater variation as an undesired source of measurement error, the method of comparative judgment (CJ) uses pairwise comparisons to elicit relative judgments from raters and statistical estimation to construct a measurement scale to rank object items, offering a viable approach to accommodate rater-associated heterogeneity of judgment making on the one hand and obtain reliable and valid outcomes on the other hand. The current study systematically examined the utility and quality of CJ as an assessment tool in the context of second language writing. A group of 16 raters (8 experienced and 8 novice) performed the CJ assessment on 94 pieces of English writing texts in the absence of rubric criteria. Despite raters' varying expertise and rating experiences, raters were able to deliver judgments consistent with the shared consensus, yielding a CJ rank order of the writing texts with a moderate reliability. The analyses of raters' justifications for judgment making showed that raters varied substantially in terms of evaluation criteria, but the collective expertise derived from the iterative CJ process presented a close alignment with the established scoring rubric. Additionally, inconsistencies were explored when raters and texts significantly deviated from the consensus of judgments, and practical implications were discussed. The results provide empirical evidence for the construct validity of CJ and add a novel perspective to the discussion of rater variation in second language writing assessment.
