Red Carpet to Fight Club: Partially-supervised Domain Transfer for Face Recognition in Violent Videos


Creative Commons License

Bilge Y. C., Yucel M. K., CİNBİŞ R. G., İKİZLER CİNBİŞ N., DUYGULU ŞAHİN P.

IEEE Winter Conference on Applications of Computer Vision (WACV), ELECTR NETWORK, 5-9 January 2021, pp. 3357-3368

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/wacv48630.2021.00340
  • Country: ELECTR NETWORK
  • Page Numbers: pp.3357-3368
  • Hacettepe University Affiliated: Yes

Abstract

In many real-world problems, there is a large discrepancy between the characteristics of the data used in training and the data encountered in deployment. A prime example is the analysis of aggression videos: in a criminal incident, suspects typically need to be identified from their clean, portrait-like photos rather than from prior video recordings. This results in three major challenges: a large domain discrepancy between violent videos and ID photos, the lack of video examples for most individuals, and limited training data availability. To mimic such scenarios, we formulate a realistic domain-transfer problem in which the goal is to transfer a recognition model trained on clean, posed images to the target domain of violent videos, where training videos are available only for a subset of subjects. To this end, we introduce the "WildestFaces" dataset, tailored to study cross-domain recognition under a variety of adverse conditions. We divide the task of transferring a recognition model from the domain of clean images to that of violent videos into two sub-problems and tackle them using (i) stacked affine transforms for classifier transfer and (ii) attention-driven pooling for temporal adaptation. We additionally formulate a self-attention-based model for domain transfer. We establish a rigorous evaluation protocol for this "clean-to-violent" recognition task and present a detailed analysis of the proposed dataset and methods. Our experiments highlight the unique challenges introduced by the WildestFaces dataset and the advantages of the proposed approach.