Using hearing and vision for motion prediction, motion perception, and localization

Research output: Working paper › Preprint › Academic

Abstract

Predicting the location of moving objects in noisy environments is essential for everyday behavior, such as when navigating traffic. Although many objects provide multisensory information, it remains unknown how humans use this information to localize moving objects, and how this depends on expected sensory interference (e.g., occlusion). In four experiments, we systematically investigated localization performance for auditory, visual, and audiovisual targets. Performance for audiovisual targets was compared to performance predicted by Maximum Likelihood Estimation (MLE). In Experiment 1A, moving targets were occluded by an audiovisual occluder, and their final locations had to be inferred from target speed and occlusion duration. Participants relied exclusively on the visual component of the audiovisual target, even though the auditory component demonstrably provided useful location information when presented in isolation. In contrast, when a visual-only occluder was used in Experiment 1B, participants relied primarily on the auditory component of the audiovisual target, even though the visual component demonstrably provided useful location information when presented in isolation. In Experiment 2, no multisensory precision benefits were found when participants localized moving audiovisual target objects. Yet, localization estimates were in line with MLE predictions. In Experiment 3, a substantial multisensory benefit was found when participants localized static audiovisual target objects, showing near-MLE audiovisual integration. In sum, observers use both hearing and vision when localizing static objects, but use only unisensory input when localizing moving objects and predicting motion under occlusion, perhaps to minimize short-term memory load. Moreover, observers can flexibly prioritize one sense over the other, in anticipation of modality-specific interference.
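For readers unfamiliar with the MLE benchmark referred to above, the sketch below illustrates the standard inverse-variance-weighted cue-combination prediction against which audiovisual performance is typically compared. The variable names and numerical values are purely illustrative and are not taken from the study; they only show the form of the prediction.

```python
import numpy as np

def mle_prediction(sigma_a, sigma_v, est_a, est_v):
    """Standard MLE (inverse-variance-weighted) cue-combination prediction.

    sigma_a, sigma_v : unisensory localization SDs (e.g., in degrees)
    est_a, est_v     : unisensory location estimates of the same target
    Returns the predicted audiovisual estimate and its predicted SD.
    """
    w_v = sigma_a**2 / (sigma_a**2 + sigma_v**2)      # weight on vision
    w_a = 1.0 - w_v                                   # weight on audition
    est_av = w_v * est_v + w_a * est_a                # combined location estimate
    sigma_av = np.sqrt((sigma_a**2 * sigma_v**2) /
                       (sigma_a**2 + sigma_v**2))     # predicted precision benefit
    return est_av, sigma_av

# Illustrative (made-up) values: vision more precise than audition
print(mle_prediction(sigma_a=6.0, sigma_v=2.0, est_a=10.0, est_v=12.0))
```

Under this formulation the predicted audiovisual SD is never larger than the better unisensory SD, which is why the absence of a precision benefit for moving targets (Experiment 2) and its presence for static targets (Experiment 3) are informative about whether integration occurred.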
Original language: English
Publisher: SocArXiv
Pages: 1-72
Number of pages: 72
Publication status: Published - 28 May 2024

Funding

This work was supported by the China Scholarship Council (Grant No. 202206380011) awarded to Yichen Yuan.

Funder: China Scholarship Council
Funder number: 202206380011

Keywords

• multisensory perception
• maximum likelihood estimation
• audiovisual integration
• motion perception
• occlusion

