Variable Selection by Using a Genetic Algorithm for Regression Model

Yigiter A., Cetin M.

INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS & STATISTICS, vol.57, no.4, pp.1-9, 2018 (Journal Indexed in ESCI)

  • Publication Type: Article
  • Volume: 57 Issue: 4
  • Publication Date: 2018
  • Page Numbers: pp.1-9


Variable selection is an important step in obtaining the best subset of variables in a regression model. Forward, backward, and stepwise selection are the classical variable selection methods for regression. As a search-based method, the Genetic Algorithm (GA) is also used for variable selection. The aim of this study is to select subsets of independent variables in the regression model using GA. GA is a heuristic search algorithm used to solve complex optimization problems. It can also serve as a routine method for variable selection in regression analysis, as it does in many other areas of statistics. In this study, the GA method is described, and the subset is then obtained by minimizing information criteria such as the Akaike information criterion (AIC), Mallows' Cp (Cp), Bozdogan's index of informational complexity (ICOMP), and robust AIC (ROAIC), and by maximizing the Akaike information criterion with Fisher information (AICF) (Cetin and Erar, 2006). A simulation study is used to evaluate the performance of these criteria in GA when the data do and do not contain an outlier.
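To make the idea of GA-based subset search concrete, the following is a minimal sketch in Python, not the authors' implementation. The abstract does not specify the GA operators or settings, so everything here is an assumption chosen for illustration: the function names aic and ga_select, the binary tournament selection, single-point crossover, bit-flip mutation, elitism, and the default population size, generation count, and mutation rate. The criterion is the usual Gaussian AIC for least squares up to an additive constant, n log(RSS/n) + 2k; Cp, ICOMP, ROAIC, and AICF are not implemented.

```python
import numpy as np


def aic(X, y, mask):
    """Gaussian AIC (up to a constant) of an OLS fit on the columns flagged in mask.

    An intercept is always included, so an all-False mask is still a valid model.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X[:, mask]])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    k = design.shape[1]  # number of estimated coefficients
    return n * np.log(rss / n) + 2 * k


def ga_select(X, y, pop_size=30, generations=40, p_mut=0.05, seed=0):
    """Search for a low-AIC subset of columns with a simple GA (illustrative settings).

    Each individual is a boolean vector: True means the variable is in the model.
    Requires at least two candidate variables.
    """
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    pop = rng.random((pop_size, p)) < 0.5  # random initial population
    best_mask, best_score = None, np.inf

    for _ in range(generations):
        scores = np.array([aic(X, y, ind) for ind in pop])
        i = int(np.argmin(scores))
        if scores[i] < best_score:
            best_mask, best_score = pop[i].copy(), float(scores[i])

        # Binary tournament selection: the lower-AIC individual of each pair wins.
        a = rng.integers(pop_size, size=pop_size)
        b = rng.integers(pop_size, size=pop_size)
        parents = pop[np.where(scores[a] <= scores[b], a, b)]

        # Single-point crossover between consecutive parents.
        children = parents.copy()
        for j in range(0, pop_size - 1, 2):
            cut = rng.integers(1, p)
            children[j, cut:] = parents[j + 1, cut:]
            children[j + 1, cut:] = parents[j, cut:]

        # Bit-flip mutation, then elitism: keep the best subset seen so far.
        children ^= rng.random(children.shape) < p_mut
        children[0] = best_mask
        pop = children

    return best_mask, best_score
```

Calling ga_select(X, y) on an n-by-p predictor matrix X and a response vector y returns the best boolean mask found and its AIC. Replacing aic with another criterion function of the same signature would let the same search minimize a different criterion; for a criterion that is to be maximized, its negative can be passed instead.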