Abstract: Differential item functioning (DIF), or item bias, is a relatively new concept and has been one of the most controversial and most studied subjects in measurement theory. DIF occurs when examinees who have the same ability level but belong to different groups have different probabilities of answering an item correctly. In Item Response Theory (IRT) terms, DIF occurs when the item characteristic curves (ICCs) of two groups are not identical, that is, when the groups do not share the same item parameters after rescaling. DIF may also appear when the latent ability space is misspecified: when the groups have different multidimensional ability distributions and the test items are chosen to discriminate among these abilities, unidimensional scoring may flag items as DIF items. The purpose of this study was to compare four DIF procedures, the Mantel-Haenszel (MH), the Simultaneous Item Bias Test (SIBTEST), the IRT-based AREA procedure, and Logistic Regression (LR), when the underlying ability distribution is erroneously assumed to be homogeneous. To illustrate how the DIF procedures behave when a homogeneous ability distribution is assumed for groups that in fact differ in their underlying multidimensional ability levels, two data sets were generated using the 2PL model: one in which DIF occurs and one in which no DIF occurs. The UNIGEN program was used to generate the data. Each data set contained 1000 examinees and 25 items. Item parameters were chosen so that the items were capable of measuring the two-dimensional ability distribution of the two groups. The MH, SIBTEST, AREA, and LR procedures were applied to both the data with DIF and the data without DIF. The study showed that all four methods identified items as biased when the ability space was misspecified.
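As a rough illustration of the DIF definition above, the following Python sketch computes a Mantel-Haenszel common odds ratio from expected counts under a unidimensional 2PL model. It is a minimal sketch under stated assumptions, not the study's procedure: the function names, the ability strata, and the item parameters are all hypothetical, and the strata are defined directly on ability, whereas an applied MH analysis would match examinees on observed total score. It also does not reproduce the two-dimensional design described in the abstract.

```python
import math


def icc_2pl(theta, a, b):
    """2PL item characteristic curve: probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def expected_table(theta, n_per_group, ref_params, focal_params):
    """Expected 2x2 counts for one ability stratum.

    Returns (A, B, C, D): reference correct, reference incorrect,
    focal correct, focal incorrect.
    """
    p_ref = icc_2pl(theta, *ref_params)
    p_foc = icc_2pl(theta, *focal_params)
    return (n_per_group * p_ref, n_per_group * (1 - p_ref),
            n_per_group * p_foc, n_per_group * (1 - p_foc))


def mh_common_odds_ratio(tables):
    """Mantel-Haenszel common odds ratio pooled over matched strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


# Hypothetical ability strata and item parameters (a, b); not from the study.
strata = [-2.0, -1.0, 0.0, 1.0, 2.0]
same_item = [expected_table(t, 100, (1.0, 0.0), (1.0, 0.0)) for t in strata]
shifted_item = [expected_table(t, 100, (1.0, 0.0), (1.0, 0.5)) for t in strata]

no_dif = mh_common_odds_ratio(same_item)       # identical ICCs in both groups
with_dif = mh_common_odds_ratio(shifted_item)  # item is harder for the focal group

print(f"MH odds ratio, identical ICCs: {no_dif:.3f}")
print(f"MH odds ratio, shifted difficulty: {with_dif:.3f}")
```

When both groups share the same ICC, the pooled odds ratio equals 1. When the focal group's difficulty parameter is shifted, the ratio departs from 1, which is the pattern a DIF procedure is designed to flag.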