Isolated Sign Language Recognition with Multi-scale Features using LSTM

Mercanoglu Sincan O., Tur A. O., Yalim Keles H.

27th Signal Processing and Communications Applications Conference (SIU), Sivas, Türkiye, 24-26 April 2019

Publication Type: Conference Paper / Full-Text Paper
DOI: 10.1109/siu.2019.8806467
City of Publication: Sivas
Country of Publication: Türkiye
Keywords: convolutional neural networks, long short-term memory, feature pooling module, sign language recognition
Affiliated with Hacettepe University: No

Abstract

Sign language recognition systems automatically convert signs in video streams to text. In this work, an original isolated sign language recognition model is built from Convolutional Neural Networks (CNNs), a Feature Pooling Module (FPM), and Long Short-Term Memory networks (LSTMs). In the CNN part, a pre-trained VGG-16 model, with its weights adapted to the dataset, is used identically in two parallel branches that extract features from the color (RGB) and depth streams. The extracted features are passed to the FPM to generate multi-scale features. The feature matrices are reduced to representative feature vectors using Global Average Pooling (GAP). The features obtained from the RGB and depth streams are concatenated and passed to the LSTM architecture after instance normalization. The proposed model achieves 93.15% test accuracy on the Montalbano Italian sign language dataset; this result is comparable with recent state-of-the-art methods.
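
The fusion steps named in the abstract (GAP on each stream, concatenation of RGB and depth features, instance normalization before the LSTM) can be illustrated with a small standard-library Python sketch. It is not the authors' implementation: the function names, the toy tensor sizes, and the choice to normalize each feature channel over the time axis of a single video are assumptions made for illustration, and the VGG-16 backbone, the FPM, and the LSTM itself are omitted.

```python
import math


def global_average_pool(feature_map):
    """Reduce a C x H x W feature map (nested lists) to a length-C vector."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def fuse_streams(rgb_map, depth_map):
    """Pool each stream with GAP, then concatenate the RGB and depth vectors."""
    return global_average_pool(rgb_map) + global_average_pool(depth_map)


def instance_normalize(sequence, eps=1e-5):
    """Normalize each feature channel over the time axis of one video.

    sequence: list of T frame vectors, each of length D.
    Assumption: statistics are computed per channel, per video.
    """
    num_steps = len(sequence)
    num_features = len(sequence[0])
    normalized = [[0.0] * num_features for _ in range(num_steps)]
    for d in range(num_features):
        column = [frame[d] for frame in sequence]
        mean = sum(column) / num_steps
        var = sum((x - mean) ** 2 for x in column) / num_steps
        scale = math.sqrt(var + eps)
        for t in range(num_steps):
            normalized[t][d] = (column[t] - mean) / scale
    return normalized


# Toy input (hypothetical sizes): 3 frames, 2 channels per stream, 2x2 maps.
rgb_frames = [[[[t, t + 1], [t + 2, t + 3]], [[1, 1], [1, 1]]] for t in range(3)]
depth_frames = [[[[2 * t, 2 * t], [2 * t, 2 * t]], [[0, 4], [4, 0]]] for t in range(3)]

# One fused vector per frame; the normalized sequence would feed the LSTM.
fused = [fuse_streams(r, d) for r, d in zip(rgb_frames, depth_frames)]
lstm_input = instance_normalize(fused)
```

In this sketch `lstm_input` is a sequence of T vectors of length D (here 3 by 4), which is the shape a recurrent layer would consume one time step at a time.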