International Journal of Bioinformatics Research and Applications, vol. 16, no. 4, pp. 387-407, 2020 (Scopus)
Machine learning (ML) algorithms are used in various areas.
However, there has been no study analysing health expenditures using ML
methods. This work compares the regression performance of lasso (L),
K-nearest neighbours (KNN), random forest (RF) and support vector machine
(SVM) regression across varying hyperparameter values. The hyperparameters
examined were lambda (λ) for L, number of neighbours (NN) for KNN, number of
trees (NT) for RF and epsilon (ε) for SVM regression. K-fold cross-validation
was performed to evaluate regression performance. Study
results show that KNN (R² > 0.75; RMSE < 0.70; MAE < 0.55) and
L (R² > 0.79; RMSE < 0.20; MAE < 0.15) regression yield better results in
predicting health expenditure per capita and out-of-pocket health expenditure
(%), respectively. Moreover, the performance differences among the L, KNN, RF
and SVM regression methods are statistically significant (p < 0.001). It is hoped that
these results will stimulate further interest in using ML methods to predict
health expenditures.
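
The evaluation procedure summarised above (sweeping one hyperparameter and scoring each value with K-fold cross-validation using R², RMSE and MAE) can be sketched for the KNN case as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the software, the value of K, or the hyperparameter grid, so the 5-fold split, the neighbour counts, the helper names (`k_fold_indices`, `knn_predict`, `regression_metrics`, `cross_validate_knn`) and the synthetic data are all assumptions made for the example.

```python
import math
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_x, train_y, x, n_neighbours):
    """Predict y for x as the mean target of the nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    nearest = order[:n_neighbours]
    return sum(train_y[i] for i in nearest) / len(nearest)


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) for paired true and predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return 1 - ss_res / ss_tot, math.sqrt(ss_res / n), mae


def cross_validate_knn(xs, ys, n_neighbours, k=5):
    """Pool out-of-fold predictions over k folds, then score them once."""
    y_true, y_pred = [], []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        for i in fold:
            y_true.append(ys[i])
            y_pred.append(knn_predict(train_x, train_y, xs[i], n_neighbours))
    return regression_metrics(y_true, y_pred)


# Synthetic data for illustration only (assumption: not the study's data).
rng = random.Random(42)
xs = [(rng.uniform(0, 10),) for _ in range(100)]
ys = [2.0 * x[0] + rng.gauss(0, 1) for x in xs]

# Sweep the number-of-neighbours (NN) hyperparameter; grid is hypothetical.
results = {nn: cross_validate_knn(xs, ys, nn) for nn in (1, 3, 5, 7, 9)}
for nn, (r2, rmse, mae) in results.items():
    print(f"NN={nn}: R2={r2:.3f} RMSE={rmse:.3f} MAE={mae:.3f}")
```

The sketch pools the out-of-fold predictions before scoring; averaging per-fold metrics is an equally common choice, and the abstract does not say which was used. The same loop structure would apply to λ, NT and ε for the other three methods, with the predictor swapped accordingly.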