Hands-on: deformable pose and motion models for spatiotemporal localization of fine-grained dyadic interactions

C.J. van Gemeren, R.W. Poppe, R.C. Veltkamp

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

We introduce a novel spatiotemporal deformable part model for the localization of fine-grained human interactions of two persons in unsegmented videos. Our approach is the first to classify interactions and additionally provide the temporal and spatial extent of the interaction in the video. To this end, our models contain part detectors that support different scales as well as different types of feature descriptors, which are combined in a single graph. This allows us to model the detailed coordination between people in terms of body pose and motion. We demonstrate that this helps to avoid confusion between visually similar interactions. We show that robust results can be obtained when training on small numbers of training sequences (5–15) per interaction class. We achieve AUC scores of 0.82 at an IoU threshold of 0.3 on the publicly available ShakeFive2 dataset, which contains interactions that differ only slightly in their coordination. To further test the generalization of our models, we perform cross-dataset experiments in which we test on two other publicly available datasets: UT-Interaction and SBU Kinect. These experiments show that our models generalize well to different environments.
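The evaluation criterion mentioned above scores a detection by its spatiotemporal overlap with the ground truth. As a rough illustration of how such an overlap can be computed, the sketch below measures intersection-over-union between two axis-aligned spatiotemporal boxes (a fixed spatial box spanning a frame range). The Tube representation and the volume-based IoU are assumptions made for illustration only, not the paper's actual evaluation code.

```python
from dataclasses import dataclass


@dataclass
class Tube:
    """Axis-aligned spatiotemporal box: a spatial extent (pixels) over a frame range.
    Hypothetical representation for illustration; the paper may localize per frame."""
    x1: float
    y1: float
    x2: float
    y2: float
    t1: int  # first frame (inclusive)
    t2: int  # last frame (inclusive)

    def volume(self) -> float:
        # Spatial area times temporal length, clamped at zero for degenerate boxes.
        return (max(0.0, self.x2 - self.x1)
                * max(0.0, self.y2 - self.y1)
                * max(0, self.t2 - self.t1 + 1))


def spatiotemporal_iou(a: Tube, b: Tube) -> float:
    """Intersection-over-union of two tubes over space and time."""
    ix = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    iy = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    it = max(0, min(a.t2, b.t2) - max(a.t1, b.t1) + 1)
    inter = ix * iy * it
    union = a.volume() + b.volume() - inter
    return inter / union if union > 0 else 0.0


# Example: under the criterion in the abstract, a detection would count as
# correct when its IoU with the ground-truth tube is at least 0.3.
gt = Tube(100, 50, 300, 400, t1=10, t2=60)
det = Tube(120, 60, 310, 390, t1=15, t2=70)
print(spatiotemporal_iou(det, gt) >= 0.3)
```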
Original language: English
Article number: 2018-16
Journal: EURASIP Journal on Image and Video Processing
Publication status: Published - 2018

Keywords

  • Interaction detection
  • Dyadic interactions
  • Spatiotemporal localization
  • Social behavior
  • Video analysis
