Deep operational audio-visual emotion recognition


Aktürk K., KEÇELİ A. S.

Neurocomputing, vol. 588, 2024 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume Number: 588
  • Publication Date: 2024
  • DOI Number: 10.1016/j.neucom.2024.127713
  • Journal Name: Neurocomputing
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Applied Science & Technology Source, Biotechnology Research Abstracts, Compendex, Computer & Applied Sciences, INSPEC, zbMATH
  • Keywords: Audio-visual emotion classification, Multi-input classification, Operational neural network, Visual geometry group
  • Hacettepe University Affiliated: Yes

Abstract

Emotions play a large role in interpersonal communication, marketing, healthcare and the service industry. For this reason, emotion classification has been studied extensively to date. Audio-visual emotion recognition is a field within artificial intelligence and machine learning that focuses on recognizing and understanding human emotions from both visual and audio cues. It combines computer vision and audio processing techniques to analyze and interpret the emotional states that individuals express. This paper presents a multi-input deep learning model, built on an operational neural network, for audio-visual emotion recognition. The proposed network uses both visual and audio information in an end-to-end approach. The primary objective of this work is to demonstrate that multi-input models can yield better results than single-input models in emotion classification. A second objective is to demonstrate that the weight calculation methods employed in operational neural networks perform better than the conventional weight calculation methods used in convolutional neural networks, and therefore that replacing convolutional neural network approaches with operational neural network methods can improve emotion classification models. In the proposed architecture, regular convolutional layers are replaced with operational layers. The experimental results demonstrate that the operational convolutional architecture outperforms the classical convolutional neural network architecture.
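To make the idea of replacing a convolutional layer with an operational layer more concrete, the following is a minimal sketch and not the authors' architecture. It assumes the general formulation found in the operational neural network literature, where the multiplication inside a convolution is generalized to a "nodal" operator and the summation to a "pool" operator. The function names (`operational_layer_1d`, `multiply`, `sinusoid`) and the choice of a sinusoidal nodal operator are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence


def operational_layer_1d(
    x: Sequence[float],
    w: Sequence[float],
    nodal: Callable[[float, float], float],
    pool: Callable[[List[float]], float],
) -> List[float]:
    """Slide kernel w over x; combine each window with nodal, then pool.

    With nodal = multiplication and pool = sum this reduces to an ordinary
    (valid, stride-1) cross-correlation, i.e. a standard convolutional layer
    without bias or activation.
    """
    k = len(w)
    out = []
    for i in range(len(x) - k + 1):
        window = x[i:i + k]
        out.append(pool([nodal(wj, xj) for wj, xj in zip(w, window)]))
    return out


def multiply(w: float, x: float) -> float:
    # Nodal operator of a classical convolutional neuron.
    return w * x


def sinusoid(w: float, x: float) -> float:
    # One possible non-linear nodal operator (assumed here for illustration).
    return math.sin(w * x)


signal = [0.0, 1.0, 2.0, 3.0]
kernel = [1.0, -1.0]

conv_out = operational_layer_1d(signal, kernel, multiply, sum)
op_out = operational_layer_1d(signal, kernel, sinusoid, sum)
```

Here `conv_out` is the familiar linear response, while `op_out` applies a non-linear transformation to each weight-input pair before pooling. In a multi-input model of the kind the abstract describes, layers like this would presumably be applied separately to the visual and audio inputs before the two branches are merged, but the exact layout is not specified in the abstract.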