Comparing Performances (Type I error and Power) of IRT Likelihood Ratio SIBTEST and Mantel-Haenszel Methods in the Determination of Differential Item Functioning

ATALAY KABASAKAL, KÜBRA; Arsan, Nihan; GÖK, BİLGE; KELECİOĞLU, HÜLYA

doi:10.12738/estp.2014.6.2165

KURAM VE UYGULAMADA EGITIM BILIMLERI, vol.14, no.6, pp.2186-2193, 2014 (SSCI, Scopus, TRDizin)

Publication Type: Article / Full Article
Volume: 14 Issue: 6
Publication Date: 2014
DOI: 10.12738/estp.2014.6.2165
Journal Name: KURAM VE UYGULAMADA EGITIM BILIMLERI
Journal Indexes: Social Sciences Citation Index (SSCI), Scopus, TR DİZİN (ULAKBİM)
Pages: pp.2186-2193
Affiliated with Hacettepe University: Yes

Abstract

This simulation study compared the performance (Type I error and power) of the Mantel-Haenszel (MH), SIBTEST, and item response theory likelihood ratio (IRT-LR) methods under a range of conditions. The manipulated factors were sample size, ability differences between groups, test length, the percentage of differential item functioning (DIF), and the underlying model used to generate the data. The results suggest that SIBTEST had the highest Type I error in the detection of uniform DIF, whereas MH had the highest power under all conditions. In addition, the percentage of DIF and the underlying model appear to have influenced the Type I error rate of IRT-LR. The Type I error rate of SIBTEST was affected by ability differences between groups, test length, the percentage of DIF, the model, and the interactions ability differences*percentage of DIF, ability differences*test length, test length*percentage of DIF, and test length*model. For the MH procedure, the factors that affected the Type I error rate were sample size, test length, the percentage of DIF, ability differences*percentage of DIF, ability differences*model, and ability differences*percentage of DIF*model. No factor affected the power of SIBTEST or MH, but the underlying model had a significant effect on the power of IRT-LR.
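
For readers unfamiliar with the MH procedure named in the abstract, the following is a minimal sketch of how an MH DIF statistic is commonly computed for a single dichotomous item. It is not taken from the study: the function name `mantel_haenszel_dif`, its arguments, the group labels `"R"` (reference) and `"F"` (focal), and the use of the total score as the matching variable are all assumptions made for illustration. It returns the common odds ratio, the continuity-corrected chi-square, and the ETS delta value (the conventional -2.35 times the log odds ratio).

```python
import math
from collections import defaultdict


def mantel_haenszel_dif(item, total, group):
    """Illustrative MH DIF statistics for one dichotomous item.

    item  : iterable of 0/1 responses to the studied item
    total : iterable of matching scores (assumed here: total test score)
    group : iterable of "R" (reference) or "F" (focal) labels
    Returns (alpha_mh, chi2_mh, delta_mh).
    """
    # One 2x2 table per score level: [A, B, C, D] =
    # [ref correct, ref incorrect, focal correct, focal incorrect]
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for u, s, g in zip(item, total, group):
        cell = strata[s]
        if g == "R":
            cell[0 if u == 1 else 1] += 1
        else:
            cell[2 if u == 1 else 3] += 1

    num = den = 0.0                  # parts of the common odds ratio
    sum_a = sum_e = sum_v = 0.0      # observed, expected, variance of A
    for a, b, c, d in strata.values():
        n = a + b + c + d
        n_ref, n_foc = a + b, c + d
        m1, m0 = a + c, b + d
        # Skip score levels that cannot contribute to the comparison.
        if n < 2 or n_ref == 0 or n_foc == 0:
            continue
        num += a * d / n
        den += b * c / n
        sum_a += a
        sum_e += n_ref * m1 / n
        sum_v += n_ref * n_foc * m1 * m0 / (n * n * (n - 1))

    alpha = num / den if den > 0 else float("inf")
    # Continuity-corrected MH chi-square (1 degree of freedom).
    diff = max(abs(sum_a - sum_e) - 0.5, 0.0)
    chi2 = diff * diff / sum_v if sum_v > 0 else 0.0
    # ETS delta scale; negative values indicate the item favors the reference group.
    finite = 0 < alpha < float("inf")
    delta = -2.35 * math.log(alpha) if finite else float("nan")
    return alpha, chi2, delta
```

In a simulation of the kind the abstract describes, a function like this would typically be applied to each item in each replication; the proportion of flagged non-DIF items would estimate the Type I error rate, and the proportion of flagged DIF items would estimate power. The flagging rule itself (for example, a chi-square critical value) is left to the reader and is not specified by the source.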