Investigation of Variables Affecting PISA Reading Comprehension Achievement Levels of Countries with Different Levels of Achievement with CRT and RF Methods

Kasap Y., Doğan N., Kocak C.

Van Yüzüncü Yıl Üniversitesi Eğitim Fakültesi Dergisi, vol.20, no.2, pp.459-483, 2023 (Peer-Reviewed Journal)


This study aims to identify the important variables that predict PISA 2018 reading comprehension achievement in countries with different achievement levels, using 34 independent variables obtained from the student questionnaire administered to the students who participated in PISA 2018. For this purpose, the 79 countries that participated in PISA were ranked by their success percentages and then sorted into lower, middle and upper groups. Three countries were then selected from each group to form the lower, middle and upper group samples. Data mining analyses were carried out on these samples using the Classification and Regression Tree (CRT) and Random Forest (RF) methods. The results show that the number of important variables predicting reading comprehension achievement can be reduced from 34 to between three and eight. Thus, data mining classification models that can predict PISA achievement level were obtained using a small number of variables. The models were found to have an acceptable level of predictive performance in predicting achievement in three categories (low, medium, high). The most important predictor variables obtained from the models are information and communication technology resources, perception of reading difficulty, the student's expected occupational status, perception of the difficulty of the PISA test, reading enjoyment, weekly learning time in the test language, disciplinary climate, and the socio-economic status index.
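As a rough illustration of how a tree-based method can rank predictor variables, the sketch below scores each variable by the largest Gini impurity decrease that a single threshold split on it achieves, which is the splitting criterion commonly used in classification trees. It is a minimal sketch under stated assumptions and not the authors' procedure: the function names (gini, best_split_gain, rank_variables), the variable names and the six toy records are all hypothetical, and full CRT and RF importance measures aggregate such decreases over many splits and trees rather than a single root split.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split_gain(values, labels):
    """Largest Gini impurity decrease over all threshold splits on one variable."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    parent = gini(labels)
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        # Weighted impurity of the two child nodes
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def rank_variables(data, labels):
    """Sort variables by their best single-split impurity decrease, highest first."""
    gains = {name: best_split_gain(vals, labels) for name, vals in data.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy records: achievement category per student and two
# made-up predictors (not PISA data).
labels = ["low", "low", "medium", "medium", "high", "high"]
data = {
    "ict_resources": [1, 2, 3, 4, 5, 6],  # increases with achievement category
    "noise": [3, 1, 3, 1, 3, 1],          # unrelated to achievement category
}

ranking = rank_variables(data, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy setting the variable that separates the categories receives a clearly positive score and the unrelated one scores about zero, which mirrors the idea of keeping only a handful of the 34 questionnaire variables.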