Comparative regression performances of machine learning methods optimising hyperparameters: application to health expenditures
International Journal of Bioinformatics Research and Applications, vol. 16, no. 4, pp. 387-407, 2020 (Scopus)
- Publication Type: Article / Full Article
- Volume: 16 Issue: 4
- Publication Date: 2020
- Journal Name: International Journal of Bioinformatics Research and Applications
- Indexes Covering the Journal: Scopus, Aerospace Database, Agricultural & Environmental Science Database, Biotechnology Research Abstracts, Communication Abstracts, INSPEC, Metadex, Civil Engineering Abstracts
- Page Numbers: pp. 387-407
- Hacettepe University Affiliated: Yes
Abstract
Machine learning (ML) algorithms are used in various areas.
However, there has been no study analysing health expenditures using ML
methods. This work is a step forward in comparing the regression
performances of lasso (L), K-nearest neighbours (KNN), random forest
(RF) and support vector machine (SVM) regression while changing
hyperparameter values. In this study, lambda (λ), the number of neighbours (NN),
the number of trees (NT) and epsilon (ε) were chosen as the hyperparameters
for L, KNN, RF and SVM regression, respectively. K-fold cross-validation
was performed to examine regression performance. Study
results show that KNN (R² > 0.75; RMSE < 0.70; MAE < 0.55) and
L (R² > 0.79; RMSE < 0.20; MAE < 0.15) regression yield better results in
predicting health expenditure per capita and out-of-pocket health expenditure
(%), respectively. Moreover, the performance differences among the L, KNN, RF
and SVM regression methods are statistically significant (p < 0.001). It is hoped that
these results will stimulate further interest in using ML methods to predict
health expenditures.
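
The procedure summarised in the abstract (vary one hyperparameter, score each value with K-fold cross-validation, compare R², RMSE and MAE) can be illustrated with a minimal sketch for the KNN case. This is not the authors' code: the abstract does not state the data set, the number of folds or the hyperparameter grids, so the synthetic data, the five folds, the grid of neighbour counts and every function name below are assumptions chosen for illustration. Only the Python standard library is used.

```python
import math
import random


def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points closest to `query`."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired observations and predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae


def k_fold_indices(n, n_folds, seed=0):
    """Shuffle 0..n-1 and split the indices into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::n_folds] for f in range(n_folds)]


def cross_validate_knn(xs, ys, k, n_folds=5, seed=0):
    """Collect out-of-fold predictions for every point, then score them once."""
    preds = [None] * len(xs)
    for fold in k_fold_indices(len(xs), n_folds, seed):
        held_out = set(fold)
        train_x = [xs[i] for i in range(len(xs)) if i not in held_out]
        train_y = [ys[i] for i in range(len(xs)) if i not in held_out]
        for i in fold:
            preds[i] = knn_predict(train_x, train_y, xs[i], k)
    return regression_metrics(ys, preds)


# Synthetic stand-in data (assumption): one predictor, noisy linear response.
rng = random.Random(42)
xs = [(rng.uniform(0.0, 10.0),) for _ in range(100)]
ys = [2.0 * x[0] + rng.gauss(0.0, 1.0) for x in xs]

# Hypothetical grid for the number of neighbours (NN).
results = {k: cross_validate_knn(xs, ys, k) for k in (1, 3, 5, 10, 20)}
best_k = min(results, key=lambda k: results[k][1])  # lowest cross-validated RMSE

for k, (r2, rmse, mae) in results.items():
    print(f"NN={k:2d}  R2={r2:.3f}  RMSE={rmse:.3f}  MAE={mae:.3f}")
print("NN with lowest RMSE:", best_k)
```

The same loop would apply to the other three methods by swapping the predictor and sweeping λ, NT or ε instead of NN. In practice a library such as scikit-learn would likely replace the hand-written predictor and fold splitting.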