In this study, classical test theory and generalizability theory were used to determine the reliability of scores obtained from a measurement tool for mathematics achievement. Twenty-four open-ended mathematics questions from TIMSS 1999 were administered to 203 students in the spring semester of 2007. The internal consistency of the scores was 0.92. To assess interrater consistency, Kendall's coefficient of concordance was calculated as 0.52. The generalizability coefficient for the mathematics scores was 0.92, and the phi coefficient was 0.90. The rater variance component accounted for 2.1% of the total variance. Taken together, these results indicate that the measurement tool yields reliable scores for determining students' mathematics achievement. Although the mean scores of the four raters differed, their scores were consistent with one another.
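To make the reported indices concrete, the sketch below shows one common way to compute Kendall's coefficient of concordance and, from already-estimated variance components, a generalizability coefficient and a phi coefficient. It is an illustration under stated assumptions, not the procedure used in the study: the function names are hypothetical, Kendall's W is computed without a tie correction, and the G and phi formulas are those of a simple one-facet persons-by-raters design, which may be simpler than the design analysed here. All numbers in the usage lines are made up.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def kendalls_w(scores_by_rater):
    """Kendall's W for m raters scoring the same n students (no tie correction)."""
    m = len(scores_by_rater)
    n = len(scores_by_rater[0])
    ranked = [average_ranks(row) for row in scores_by_rater]
    rank_sums = [sum(ranked[j][i] for j in range(m)) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def g_and_phi(var_person, var_rater, var_residual, n_raters):
    """G and phi coefficients, assuming a one-facet persons x raters design."""
    relative_error = var_residual / n_raters                # ignores rater main effect
    absolute_error = (var_rater + var_residual) / n_raters  # includes rater main effect
    g = var_person / (var_person + relative_error)
    phi = var_person / (var_person + absolute_error)
    return g, phi


# Hypothetical data: two raters, three students.
w_agree = kendalls_w([[1, 2, 3], [1, 2, 3]])     # identical rankings
w_disagree = kendalls_w([[1, 2, 3], [3, 2, 1]])  # reversed rankings

# Hypothetical variance components.
g, phi = g_and_phi(var_person=4.0, var_rater=1.0, var_residual=2.0, n_raters=2)
```

In this sketch the rater main effect enters only the absolute error term, so phi can never exceed G. That matches the pattern in the reported values (0.90 versus 0.92) and suggests why differences between rater means can coexist with consistent relative standing of students.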