On the Evaluation of CNN Models in Remote-Sensing Scene Classification Domain


Sen O., Keles H.

PFG-JOURNAL OF PHOTOGRAMMETRY REMOTE SENSING AND GEOINFORMATION SCIENCE, vol.88, no.6, pp.477-492, 2020 (Journal Indexed in SCI)

  • Publication Type: Article
  • Volume: 88 Issue: 6
  • Publication Date: 2020
  • Doi Number: 10.1007/s41064-020-00129-6
  • Title of Journal: PFG-JOURNAL OF PHOTOGRAMMETRY REMOTE SENSING AND GEOINFORMATION SCIENCE
  • Page Numbers: pp.477-492
  • Keywords: Deep learning, Convolutional neural network (CNN), Land-cover classification, Land-use classification, Remote-sensing scene recognition, Transfer learning, GEOSPATIAL OBJECT DETECTION, IMAGE CLASSIFICATION, FEATURES, RETRIEVAL, TEXTURE

Abstract

Land-cover and land-use classification from aerial images is a challenging problem due to the high intra-class diversity and inter-class similarity of the images. To analyze the performance of deep convolutional neural network (CNN) models in this domain, we provide three pre-trained CNN models that are adapted to the NWPU-RESISC45 dataset using three different training splits, i.e., 80%, 20%, and 10% ratios. The architectures of all three models are redesigned to be modest in size and their structure is kept simple; yet, when tested on the NWPU-RESISC45 dataset, all three models perform comparably to the state-of-the-art models. Each of these models is then used to classify the scenes taken from five well-known datasets in this domain without any fine-tuning, in order to assess the generalization capabilities of the models on the selected datasets. For a better analysis, we consider the top-3 and top-5 accuracies of the models in addition to the best predicted category (top-1) that is usually reported. This way of interpretation is well suited to this domain, since the datasets contain a high number of fine-grained categories with large semantic overlaps. We empirically show that the proposed CNN models actually learn the relevant semantic features in the aerial images better than standard top-1 measures suggest. To the best of our knowledge, this is the first work in this domain that analyzes and presents model generalization performance in this way.
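The top-k evaluation described in the abstract can be sketched as follows. This is a minimal illustrative implementation, not the authors' code: the function name `top_k_accuracy` and the toy score matrix are assumptions introduced here, and a real evaluation would use the model's softmax outputs over the 45 NWPU-RESISC45 classes.

```python
import numpy as np

def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: (n_samples, n_classes) array of class scores (e.g. softmax outputs)
    labels: (n_samples,) array of integer ground-truth class indices
    """
    # indices of the k largest scores per row (order within the top k is irrelevant)
    topk = np.argpartition(scores, -k, axis=1)[:, -k:]
    hits = (topk == labels[:, None]).any(axis=1)
    return hits.mean()

# toy scores for 4 samples over 5 classes (illustrative values only)
scores = np.array([
    [0.1, 0.5, 0.2, 0.1, 0.1],   # true class 1 -> top-1 hit
    [0.3, 0.1, 0.4, 0.1, 0.1],   # true class 0 -> misses top-1, inside top-3
    [0.2, 0.2, 0.2, 0.3, 0.1],   # true class 4 -> outside even the top 3
    [0.6, 0.1, 0.1, 0.1, 0.1],   # true class 0 -> top-1 hit
])
labels = np.array([1, 0, 4, 0])
print(top_k_accuracy(scores, labels, 1))  # 0.5
print(top_k_accuracy(scores, labels, 3))  # 0.75
```

Comparing top-1 against top-3 and top-5 in this way is what reveals the semantic overlaps the abstract mentions: a model that confuses, say, two visually similar land-use categories is penalized by top-1 but credited by top-k.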