Using independently recurrent networks for reinforcement learning based unsupervised video summarization


Yaliniz G., İKİZLER CİNBİŞ N.

MULTIMEDIA TOOLS AND APPLICATIONS, vol.80, no.12, pp.17827-17847, 2021 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 80 Issue: 12
  • Publication Date: 2021
  • DOI Number: 10.1007/s11042-020-10293-x
  • Journal Name: MULTIMEDIA TOOLS AND APPLICATIONS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, FRANCIS, ABI/INFORM, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC, zbMATH
  • Page Numbers: pp.17827-17847
  • Keywords: Video summarization, Recurrent neural networks, Reinforcement learning, Unsupervised learning, KEYFRAME SELECTION
  • Affiliated with Hacettepe University: Yes

Abstract

The sigmoid and hyperbolic tangent activation functions in the long short-term memory (LSTM) and gated recurrent unit (GRU) based models used in recent studies on video summarization may cause gradient decay over layers. Moreover, interpreting and developing network models is difficult because of the entanglement of neurons in recurrent neural networks (RNNs). To address these issues, in this study, we propose a method that uses deep reinforcement learning together with independently recurrent neural networks (IndRNN) for unsupervised video summarization. In this method, the Leaky Rectified Linear Unit (Leaky ReLU) is used as the activation function to deal with the decaying-gradient and dying-neuron problems. The model, which does not rely on any labels or user interaction, is designed with a reward function that jointly accounts for the uniformity, diversity and representativeness of the generated summaries. In this way, our model can create summaries that are as uniform as possible, can be built with more layers, and can be trained over more steps without gradient-related problems. Experiments conducted on two benchmark datasets show that the proposed method achieves better summarization performance than state-of-the-art video summarization methods.
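As a rough illustration of the ideas in the abstract (not the authors' code), the sketch below shows an IndRNN step, in which each neuron carries its own scalar recurrent weight so neurons stay disentangled, with Leaky ReLU as the activation, plus simplified diversity and representativeness reward terms of the kind commonly used in reinforcement-learning-based summarization. All names, the reward formulas, and the parameter choices here are illustrative assumptions; the paper's exact formulations may differ.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU keeps a small negative slope, mitigating dying neurons.
    return np.where(x > 0, x, alpha * x)

class IndRNNCell:
    """Minimal IndRNN cell sketch: h_t = act(W x_t + u * h_{t-1} + b),
    where u is an element-wise (per-neuron) recurrent weight vector,
    so each neuron depends only on its own past state."""
    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.1, (hidden_size, input_size))
        self.u = rng.uniform(-1.0, 1.0, hidden_size)  # independent recurrence
        self.b = np.zeros(hidden_size)

    def step(self, x, h_prev):
        return leaky_relu(self.W @ x + self.u * h_prev + self.b)

    def run(self, xs):
        # xs: (T, input_size) -> hidden states (T, hidden_size)
        h = np.zeros(self.b.shape)
        hs = []
        for x in xs:
            h = self.step(x, h)
            hs.append(h)
        return np.stack(hs)

def diversity_reward(feats, selected):
    # Mean pairwise cosine dissimilarity among selected keyframe features.
    f = feats[selected]
    f = f / np.linalg.norm(f, axis=1, keepdims=True)
    sim = f @ f.T
    n = len(selected)
    return float(np.mean(1.0 - sim[~np.eye(n, dtype=bool)]))

def representativeness_reward(feats, selected):
    # Higher when selected frames lie close to all frames (good coverage).
    d = np.linalg.norm(feats[:, None, :] - feats[selected][None, :, :], axis=2)
    return float(np.exp(-d.min(axis=1).mean()))
```

In a reinforcement-learning setup along these lines, a policy over the IndRNN hidden states would sample a frame-selection action, and a weighted combination of such reward terms (with a uniformity term for even temporal spacing) would drive the policy-gradient update.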