JOURNAL OF MEASUREMENT AND EVALUATION IN EDUCATION AND PSYCHOLOGY-EPOD, vol.13, no.2, pp.105-116, 2022 (ESCI)
This study evaluated the effect of two missing data handling methods, zero imputation and multiple imputation, on item response theory (IRT) based test equating methods under different conditions. The data were obtained from the administration of the TIMSS 2019 eighth-grade science test. Data sets were formed by randomly selecting a sample of 1000 students with complete data from booklets 7 and 8. Data were then deleted under a missing completely at random mechanism within the common-item nonequivalent groups (CINEG) design, yielding four data sets with missing data rates of 10% or 20% in the new test or in both tests. The missing data in these data sets were handled with zero imputation and with multiple imputation, which produced 8 different data sets. Scale transformation was then performed with the characteristic curve transformation methods (Haebara, Stocking-Lord). Test equating results were reported in terms of observed scores. The root mean square error (RMSE) was used as the evaluation criterion to determine the error involved in test equating. RMSE values were generally lower in the case of 10% missing data in both tests. Of the two missing data handling methods, multiple imputation produced RMSE values that were both lower and closer to those of the full data set, which was taken as the reference, than zero imputation did. In addition, of the two characteristic curve transformation methods, the Stocking-Lord method produced lower RMSE values than the Haebara method, and these values were closer to those of the full data set.
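The deletion, zero imputation, and RMSE steps described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names (`delete_mcar`, `zero_impute`, `rmse`) are hypothetical, the 0/1 responses are simulated rather than taken from TIMSS, and raw sum scores stand in for the equated observed scores that the study compares. Multiple imputation and the IRT scale transformation are left out.

```python
import math
import random


def delete_mcar(responses, rate, rng):
    """Return a copy in which each response is independently set to None
    with probability `rate` (missing completely at random)."""
    return [[None if rng.random() < rate else x for x in row] for row in responses]


def zero_impute(responses):
    """Score every missing response as incorrect (0)."""
    return [[0 if x is None else x for x in row] for row in responses]


def rmse(estimates, reference):
    """Root mean square error between two equal-length sequences."""
    if len(estimates) != len(reference):
        raise ValueError("sequences must have the same length")
    squared = sum((e - r) ** 2 for e, r in zip(estimates, reference))
    return math.sqrt(squared / len(estimates))


# Simulated stand-in data: 1000 examinees, 20 dichotomous items (assumption).
rng = random.Random(0)
full = [[rng.randint(0, 1) for _ in range(20)] for _ in range(1000)]

# Delete 10% of responses at random, then apply zero imputation.
with_missing = delete_mcar(full, 0.10, rng)
imputed = zero_impute(with_missing)

# Compare sum scores against the full-data reference.
full_scores = [sum(row) for row in full]
imputed_scores = [sum(row) for row in imputed]
missing_rate = sum(x is None for row in with_missing for x in row) / (1000 * 20)
error = rmse(imputed_scores, full_scores)
print(f"missing rate: {missing_rate:.3f}, RMSE vs full data: {error:.3f}")
```

In this sketch zero imputation can only lower a sum score, so the error against the full data is systematic rather than random. That is one plausible reason a method that models the missing responses could stay closer to the full-data reference.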