An assessment on producing synthetic samples by fuzzy C-means for limited number of data in prediction models


APPLIED SOFT COMPUTING, vol.24, pp.126-134, 2014 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 24
  • Publication Date: 2014
  • Doi Number: 10.1016/j.asoc.2014.06.056
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.126-134
  • Hacettepe University Affiliated: Yes


For most rock engineering and engineering geology projects, the strength and deformability parameters of intact rocks are of crucial importance. However, obtaining these parameters from weak and very weak rocks is highly challenging because of the nature of these rocks and the requirements of the tests. For this reason, prediction models are commonly used to obtain the desired parameters indirectly. Developing a prediction model requires a data set of sufficient size; when the available data set is too small, an insufficient data problem arises. The main purpose of this study was to investigate the use of synthetic data in the indirect determination of rock strength by employing fuzzy C-means (FCM) and the adaptive neuro-fuzzy inference system (ANFIS). For this purpose, the experiments were carried out in two stages: (i) uniaxial compressive strength (UCS) prediction using real data with ANFIS, and (ii) production of synthetic data sets of different sizes and evaluation of these synthetic data sets in modeling. According to the results obtained, FCM is a practical and suitable method for synthetic data production. Based on statistical performance indices, prediction models for rock strength developed with synthetic data were found to be successful. Additionally, using the proposed synthetic data size reduces the modeling effort significantly because it eliminates the iterative approach in modeling; hence, developing models from a limited number of data becomes more practical. © 2014 Elsevier B.V. All rights reserved.
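
The abstract does not describe how FCM is used to generate the synthetic samples. As a rough illustration only, the sketch below assumes one plausible reading: the FCM cluster centers are taken as synthetic data points, so the chosen number of clusters sets the size of the synthetic data set. The function name `fuzzy_c_means`, its parameters, and the toy two-feature data are all hypothetical and are not taken from the study; no real rock data is used.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy C-means (hypothetical helper, not the authors' code).

    Returns the cluster centers and the final membership matrix.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]

    # Random initial memberships; each row sums to 1.
    U = rng.random((n_samples, n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # Centers are membership-weighted means of the data.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]

        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero

        # Standard FCM membership update for fuzzifier m.
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return centers, U


# Made-up toy data standing in for a small real data set (two features).
toy_rng = np.random.default_rng(42)
real = np.vstack([
    toy_rng.normal([0.0, 0.0], 0.3, size=(15, 2)),
    toy_rng.normal([5.0, 5.0], 0.3, size=(15, 2)),
])

# Assumption: the cluster centers serve as the synthetic samples,
# so n_clusters is the requested synthetic data set size.
synthetic, memberships = fuzzy_c_means(real, n_clusters=6)
```

Because each center is a weighted average of the real samples, the synthetic points produced this way always lie within the range of the original data. Whether the study generated its synthetic sets in this manner is not stated in the abstract.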