Abstract
In parent-child interactions (PCIs), there is frequent physical contact between the two actors. Quantifying this contact provides valuable input for assessing the nature of the interaction or the relationship between parent and child. Here, we explore the application of vision-based techniques to automatically detect contact signatures at each frame of video recordings of playful parent-infant interactions. We employ two separate models: (i) a multimodal convolutional neural network (CNN) that integrates 2D pose and body part information, and (ii) a unimodal graph convolutional neural network (GCN) that uses only 2D pose. We showcase the potential and limitations of automatic contact signature estimation through quantitative and qualitative assessments on a parent-infant free play interaction dataset comprising 100 parent-child dyadic interactions, totaling 20 hours. Additionally, our experiments provide insights into various design choices through systematic experimentation. By releasing our annotations and code, we aim to enable further research into automatic contact signature estimation during free play interactions between parents and infants.
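The abstract describes a unimodal GCN that operates on 2D pose alone. As an illustrative sketch only (not the authors' architecture), a single graph-convolution layer over a toy skeleton treats each joint as a graph node with its (x, y) coordinates as features and propagates information along the bone edges. The joint count, edge list, and feature widths below are hypothetical choices for demonstration:

```python
import numpy as np

# Hypothetical 5-joint skeleton (head, torso, two hands, one foot);
# EDGES lists the "bones" connecting joints. Real pose estimators
# (e.g. 17-joint COCO skeletons) would use a larger graph.
N_JOINTS = 5
EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]

def normalized_adjacency(edges, n):
    """Symmetrically normalized adjacency with self-loops:
    A_hat = D^{-1/2} (A + I) D^{-1/2}, the standard GCN propagation matrix."""
    a = np.eye(n)
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a.sum(axis=1)))
    return d_inv_sqrt @ a @ d_inv_sqrt

def gcn_layer(x, a_hat, w):
    """One graph-convolution layer: ReLU(A_hat @ X @ W)."""
    return np.maximum(a_hat @ x @ w, 0.0)

rng = np.random.default_rng(0)
pose = rng.standard_normal((N_JOINTS, 2))   # 2D keypoints for one video frame
a_hat = normalized_adjacency(EDGES, N_JOINTS)
w = rng.standard_normal((2, 8))             # layer weights (random stand-ins)
features = gcn_layer(pose, a_hat, w)        # per-joint hidden features, (5, 8)
frame_embedding = features.mean(axis=0)     # pooled per-frame representation
```

In a full model, the pooled per-frame embedding would feed a classifier head that predicts the contact signature for that frame; here the weights are random stand-ins rather than learned parameters.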
Original language | English |
---|---|
Title of host publication | ICMI '24: Proceedings of the 26th International Conference on Multimodal Interaction |
Publisher | Association for Computing Machinery |
Pages | 38-46 |
Number of pages | 9 |
DOIs | |
Publication status | Published - 4 Nov 2024 |
Bibliographical note
Publisher Copyright: © 2024 Copyright held by the owner/author(s).
Keywords
- contact detection
- free play
- graph convolutional neural network
- interaction analysis
- parent-child interaction
- pose estimation