On sampling strategies for small and continuous data with the modeling of genetic programming and adaptive neuro-fuzzy inference system


Sen S., Sezer E. A., Gokceoglu C., Yagiz S.

JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, vol.23, no.6, pp.297-304, 2012 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 23, Issue: 6
  • Publication Date: 2012
  • DOI: 10.3233/ifs-2012-0521
  • Journal Name: JOURNAL OF INTELLIGENT & FUZZY SYSTEMS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Pages: pp.297-304
  • Hacettepe University Affiliated: Yes

Abstract

Sampling strategies, which play a very significant role in handling data with particular characteristics (e.g., imbalanced, small, or exhaustive data), have been discussed in the literature for the last couple of decades. In this study, the sampling problem encountered with small and continuous data sets is examined. Two approaches are applied: sampling from measured data using k-fold cross validation, and sampling with synthetic data generated by fuzzy c-means clustering. The performances of genetic programming (GP) and the adaptive neuro-fuzzy inference system (ANFIS) on the resulting data sets are then discussed. The experimental results indicate that fuzzy c-means based synthetic sampling is more successful than k-fold cross validation when modeling small and continuous data sets with ANFIS and GP, so it can be proposed for this type of data set. Additionally, ANFIS performs slightly better than GP when synthetic data is employed, but GP is less sensitive to the data set and produces outputs in a narrower range than ANFIS when k-fold cross validation is employed.
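
The two sampling approaches named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `k_fold_indices` and `fuzzy_c_means`, the fuzzifier `m=2.0`, the iteration count, and the random initialisation are all assumptions made for illustration. The abstract does not say how synthetic samples are derived from the clusters, so the sketch stops at the cluster centers and membership degrees.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Return k (train, test) index pairs covering range(n_samples)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k disjoint folds.
    folds = [idx[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        splits.append((train, test))
    return splits


def fuzzy_c_means(points, c, m=2.0, iters=100, seed=0):
    """Minimal fuzzy c-means; returns (centers, memberships).

    Hypothetical helper: m (fuzzifier) and iters are illustrative defaults.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )
        # Membership update from Euclidean distances to each center.
        for i in range(n):
            dists = [
                max(sum((points[i][d] - centers[j][d]) ** 2
                        for d in range(dim)) ** 0.5, 1e-12)
                for j in range(c)
            ]
            for j in range(c):
                u[i][j] = 1.0 / sum(
                    (dists[j] / dists[l]) ** (2.0 / (m - 1.0)) for l in range(c)
                )
    return centers, u
```

With a small data set, `k_fold_indices(len(data), k)` yields the train/test partitions on which a model would be fitted and evaluated in turn, while `fuzzy_c_means(data, c)` yields cluster centers and per-sample membership degrees from which synthetic samples could be generated under whatever scheme is chosen.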