JOURNAL OF MEASUREMENT AND EVALUATION IN EDUCATION AND PSYCHOLOGY-EPOD, vol. 6, no. 1, pp. 12-24, 2015 (ESCI)
This study examines four approaches to estimating interrater reliability: correlation, comparison of means, percentage of agreement, and generalizability (G) theory. The data consisted of ratings of 43 students on ten items by two raters. The reliability estimates varied across approaches because the approaches produce values on different ranges and rely on different calculation procedures. The highest estimate, 0.90, was obtained with G theory. The correlation between the two raters' scores was positive and high (0.74). The percentage of exact agreement between the two raters was 58.9%. Finally, although there was no statistically significant difference between the raters' overall mean scores, their scores differed significantly on three of the items. Although G theory appears more complex than the other methods illustrated in the study, it yields more information because it handles multiple sources of error at the same time. It is therefore recommended for estimating interrater reliability.
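As a rough illustration of how the four approaches differ in calculation, the following Python sketch computes each one on a small set of toy ratings. The ratings are hypothetical and are not the study's data. The function names are illustrative, and the G coefficient assumes a simplified one-facet persons-by-raters design on total scores, which may differ from the design analyzed in the study (for example, one that also treats items as a facet).

```python
from math import sqrt

def percent_exact_agreement(r1, r2):
    """Percentage of cases where both raters gave the identical score."""
    matches = sum(1 for a, b in zip(r1, r2) if a == b)
    return 100.0 * matches / len(r1)

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def paired_t(x, y):
    """Paired-samples t statistic for the difference in rater means.
    Assumes the differences are not all identical (nonzero variance)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    sd = sqrt(sum((v - mean_d) ** 2 for v in d) / (n - 1))
    return mean_d / (sd / sqrt(n))

def g_coefficient(scores):
    """Relative G coefficient for a crossed persons x raters design.
    scores: one row per person, one column per rater (assumed design)."""
    n_p, n_r = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    p_means = [sum(row) / n_r for row in scores]
    r_means = [sum(row[j] for row in scores) / n_p for j in range(n_r)]
    ss_p = n_r * sum((m - grand) ** 2 for m in p_means)
    ss_r = n_p * sum((m - grand) ** 2 for m in r_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_res = ss_total - ss_p - ss_r
    ms_p = ss_p / (n_p - 1)
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))
    # Variance components from expected mean squares; negative
    # person-variance estimates are truncated at zero.
    var_p = max((ms_p - ms_res) / n_r, 0.0)
    var_res = ms_res
    return var_p / (var_p + var_res / n_r)

# Hypothetical toy ratings for five students (not the study's data).
rater1 = [3, 4, 2, 5, 1]
rater2 = [3, 5, 2, 4, 1]

agreement = percent_exact_agreement(rater1, rater2)
correlation = pearson_r(rater1, rater2)
t_stat = paired_t(rater1, rater2)
g = g_coefficient([list(pair) for pair in zip(rater1, rater2)])
print(agreement, correlation, t_stat, g)
```

Even on toy data the indices land on different scales (a percentage, a correlation, a test statistic, and a variance ratio), which is one reason estimates from the four approaches are not directly comparable.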