Abstract: This paper investigates the use of a specific case of the Linear Logistic Test
Model, known as the rating scale rater model, in which the item parameter is
decomposed into an item difficulty parameter plus a rater severity
parameter. Using this model, the paper examines the severity of groups of teachers
who scored sets of 321 pretests and posttests designed to be congruent
with an embedded assessment system. The items were included in a linked design
involving multiple booklets randomly allocated to students. Individual teachers
were found to differ in overall severity, but showed a reasonable amount of
consistency within two of the three district moderation groups. Mean teacher
severity also differed somewhat between districts. The results further suggest
that the model may be too tightly constrained, and exploration using a less
constrained model is indicated.
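
The split of the item parameter into difficulty plus severity can be written out explicitly. What follows is only a sketch in the usual adjacent-category logit form of a rating scale model. The symbols \theta_n, \delta_i, \rho_r, and \tau_k are notation assumed here for illustration and are not taken from the paper.

```latex
% Assumed notation (illustrative, not the paper's own):
%   \theta_n : proficiency of student n
%   \delta_i : difficulty of item i
%   \rho_r   : severity of rater (teacher) r
%   \tau_k   : step parameter for category k, shared across items and raters
\[
  \log \frac{P(X_{nir} = k)}{P(X_{nir} = k-1)}
  = \theta_n - (\delta_i + \rho_r) - \tau_k ,
  \qquad k = 1, \dots, m .
\]
```

Under this form, a more severe rater (larger \rho_r) lowers the log-odds of the higher category by the same amount for every item and every step. This may be one sense in which such a model is tightly constrained.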