Abstract: Background: In low-stakes educational assessments, test takers might show a performance decline (PD) on end-of-test items. PD is a concern in educational assessments, especially when groups of students are to be compared on the proficiency variable, because item responses gathered in the groups could be differently affected by PD. In order to account for PD, mixture item response theory (IRT) models have been proposed in the literature. Methods: In this article, multigroup extensions of three existing mixture models that assess PD are compared. The models were applied to the mathematics test in a large-scale study targeting school track differences in proficiency. Results: Despite the differences in the specification of PD, all three models showed rather similar item parameter estimates that were, however, different from the estimates given by a standard two-parameter IRT model. In addition, all models indicated that the amount of PD differed between tracks, in that school track differences in proficiency were slightly reduced when PD was accounted for. Nevertheless, the models gave different estimates of the proportion of students showing PD, and differed somewhat from each other in the adjustment of proficiency scores for PD. Conclusions: Multigroup mixture models can be used to study how PD interacts with proficiency and other variables to provide a better understanding of the mechanisms behind PD. Differences between the presented models with regard to their assumptions about the relationship between PD and item responses are discussed.