Hand and Face Segmentation with Deep Convolutional Networks using Limited Labelled Data

Sincan O. M., Gencoglu S., Bacak M., Keles H.

3rd International Symposium on Multidisciplinary Studies and Innovative Technologies, ISMSIT 2019, Ankara, Turkey, 11 - 13 October 2019

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/ismsit.2019.8932835
  • City: Ankara
  • Country: Turkey
  • Keywords: CNN, deep learning, face segmentation, gesture segmentation, sign language, U-net, VGG
  • Hacettepe University Affiliated: No


© 2019 IEEE. Segmentation is a crucial step for many classification problems. Many researchers approach the problem using classical computer vision methods, but deep learning approaches have recently been used more frequently in different domains. In this paper, we propose two segmentation networks that mark the face and hands in static images for sign language recognition using only a small amount of training data. Our networks have an encoder-decoder structure that contains convolutional, max pooling, and upsampling layers; the first is a U-Net-based network and the second is a VGG-based network. We evaluate our models on two sign language datasets: the first is our Ankara University Turkish Sign Language dataset (AU-TSL) and the second is the Montalbano Italian gesture dataset. The datasets contain background and illumination variations and were recorded with different signers. We train our models using only 400 images randomly selected from video frames. Our experiments show that even when we reduce the training data by half, we can still obtain satisfactory results. The proposed methods achieve more than 98% precision using 400 frames on both datasets. Our code is available at https://github.com/au-cvml-lab/Hands-and-Face-Segmentation-With-Limited-Data.
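
The abstract reports precision and a randomly selected training set of 400 frames. It does not say how either is computed. The sketch below is one plausible reading, not the authors' implementation: it assumes precision is measured pixel-wise on binary masks as TP / (TP + FP), and that frames are drawn uniformly without replacement. The function names, the toy masks, the pool size of 10000 frames, and the fixed seed are all hypothetical and chosen only for illustration.

```python
import random


def pixel_precision(predicted, ground_truth):
    """Pixel-wise precision TP / (TP + FP) for two binary masks.

    Both masks are nested lists of 0/1 values with the same shape.
    This is an assumed definition; the abstract does not specify one.
    """
    true_pos = 0
    false_pos = 0
    for pred_row, true_row in zip(predicted, ground_truth):
        for p, t in zip(pred_row, true_row):
            if p == 1 and t == 1:
                true_pos += 1
            elif p == 1 and t == 0:
                false_pos += 1
    predicted_pos = true_pos + false_pos
    # Assumed convention: return 0.0 when no pixel is predicted as foreground.
    return true_pos / predicted_pos if predicted_pos else 0.0


def sample_training_frames(frame_ids, count, seed=0):
    """Pick `count` distinct frames uniformly at random (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    return rng.sample(list(frame_ids), count)


# Toy 3x4 masks: 1 marks a hand or face pixel, 0 marks background.
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
ground_truth = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]
precision = pixel_precision(predicted, ground_truth)  # 4 TP, 1 FP

# Draw 400 frames from a hypothetical pool of 10000 video frames,
# then keep the first 200 to mimic halving the training data.
frames = sample_training_frames(range(10000), 400)
half = frames[:200]
```

With the toy masks, four of the five predicted foreground pixels are correct, so the precision is 0.8. Note that precision alone does not penalise missed foreground pixels, so a metric such as recall or intersection over union would be needed to capture those.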