Abstract
We introduce a novel spatio-temporal deformable part model
for offline detection of fine-grained interactions in video. One novelty of
the model is that part detectors model the interacting individuals in a
single graph that can contain different combinations of feature descriptors.
This allows us to use both body pose and movement to model the
coordination between two people in space and time. We evaluate the
performance of our approach on novel and existing interaction datasets.
When testing only on the target class, we achieve mean average precision
scores of 0.82. When presented with distractor classes, the additional
modelling of the motion of specific body parts significantly reduces the
number of confusions. Cross-dataset tests demonstrate that our trained
models generalize well to other settings.
Original language | English |
---|---|
Title of host publication | Proceedings of the International Workshop on Human Behavior Understanding (HBU) |
Publisher | Springer |
Pages | 116-133 |
ISBN (Electronic) | 978-3-319-46843-3 |
ISBN (Print) | 978-3-319-46842-6 |
DOIs | |
Publication status | Published - 2016 |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Volume | 9997 |
Keywords
- Human behavior
- Interaction detection
- Spatio-temporal localization