A Comparison of Kernel Equating and Item Response Theory Equating Methods


Akin-Arikan C., GELBAL S.

EURASIAN JOURNAL OF EDUCATIONAL RESEARCH, no. 93, pp. 179-198, 2021 (ESCI)

  • Publication Type: Article / Full Article
  • Publication Date: 2021
  • DOI: 10.14689/ejer.2021.93.9
  • Journal Name: EURASIAN JOURNAL OF EDUCATIONAL RESEARCH
  • Indexes Covering the Journal: Emerging Sources Citation Index (ESCI), Scopus, EBSCO Education Source, Educational research abstracts (ERA), ERIC (Education Resources Information Center)
  • Pages: pp. 179-198
  • Keywords: Equating, kernel, IRT, error, LINKING, TESTS
  • Hacettepe University Affiliated: Yes

Abstract

Purpose: This study compares the performance of Item Response Theory (IRT) equating and kernel equating (KE) methods in terms of equating error (RMSD) and standard error of equating (SEE) under the anchor-item nonequivalent groups design. Method: Four factors were varied across 24 simulation conditions: ability distribution, type of anchor items (internal/external), the ratio of anchor items, and the spread of anchor item difficulty. Findings: All four factors affected the performance of the equating methods. Kernel chained equating (KE CE) methods were less affected by differences in group mean ability. As the average ability difference between groups increased, KE methods yielded higher standard errors in the high range of the score scale, while IRT equating yielded higher standard errors in the medium-high range. External anchor items led to lower SEE and RMSD than internal anchor items, and both errors decreased as the ratio of anchor items increased. When internal anchor items were used and the groups had similar average ability, mini and midi anchor tests gave similar results. With external anchor items, however, the midi anchor test performed better as the average ability difference between groups increased. At the ends of the score scale, the IRT equating method had lower errors. Implications for Research and Practice: KE methods can be used when IRT assumptions are not met. (C) 2021 Ani Publishing Ltd. All rights reserved.
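
The two error criteria named in the abstract can be illustrated with a minimal sketch. It assumes common simulation-study definitions, which the record itself does not spell out: SEE at a score point is taken as the standard deviation of the equated scores across replications, and RMSD as the root mean squared difference between the equated scores and a criterion equating function. The function name `equating_error_summary` and the array layout are hypothetical, chosen only for illustration; the article's own formulas may differ.

```python
import numpy as np


def equating_error_summary(equated, criterion):
    """Summarize equating error per score point (illustrative definitions).

    equated:   array of shape (R, K), equated scores from R replications
               at K raw-score points (hypothetical layout).
    criterion: array of shape (K,), criterion equated scores.
    Returns (see, rmsd), each of shape (K,).
    """
    equated = np.asarray(equated, dtype=float)
    criterion = np.asarray(criterion, dtype=float)

    # Assumed SEE: sample standard deviation across replications.
    see = equated.std(axis=0, ddof=1)

    # Assumed RMSD: root mean squared deviation from the criterion.
    rmsd = np.sqrt(np.mean((equated - criterion) ** 2, axis=0))
    return see, rmsd


# Toy input: 2 replications, 2 score points.
see, rmsd = equating_error_summary([[1.0, 2.0], [3.0, 2.0]], [2.0, 2.0])
```

Under these assumptions, SEE reflects only the sampling variability of a method, whereas RMSD also absorbs any systematic departure from the criterion, so a method can show a low SEE and still a high RMSD at the same score point.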