Journal title: Journal of Measurement and Evaluation in Education and Psychology
Electronic ISSN: 1309-6575
Publication year: 2020
Volume: 11
Issue: 2
Pages: 147-162
DOI: 10.21031/epod.662964
Language: Turkish
Publisher: EPODDER
Abstract: A rater’s experience can contribute substantially to variation in rating performance in educational scoring. Heterogeneous ratings can negatively affect examinees’ results. The aim of the study is to examine raters’ rating performance, as indicated by rater severity, in assessing oral tests among lower secondary school students using the Multi-facet Rasch Measurement (MFRM) model. The respondents were thirty English Language teachers clustered into two groups based on their rating experience in high-stakes assessment. The respondents listened to ten examinees’ recorded answers to three oral test items and provided their ratings. The instruments comprised the items, examinees’ answers, a scoring rubric, and a scoring sheet used to appraise examinees’ competence in three domains: vocabulary, grammar, and communicative competence. The MFRM analysis showed that raters differed in their severity levels (chi-square χ² = 2.661). Raters’ severity measures ranged from 2.13 to -1.45 logits. An independent t-test indicated a significant difference between the ratings provided by the inexperienced and the experienced raters (t = -0.96, df = 28, p < 0.01). The findings suggest that assessment developers must ensure that raters are well versed, through assessment practice or rater training, before they rate examinees in operational settings. Further research is needed to account for the varying effects of rating experience in other assessment contexts and for the effects of interactions between facets on estimates of examinees’ measures. The present study provides additional evidence on the role of rating experience in helping raters provide accurate ratings.