P05, Session 1 (Monday 12 January 2026, 15:00-17:30)
Audiovisual integration and gaze behaviour in degraded speech perception: An EEG and eye-tracking study
Background: Speech is often perceived multimodally: visual information from a speaker’s face facilitates perception relative to auditory-only listening, especially under challenging listening conditions. Different gaze strategies, in particular where on the face listeners fixate, can affect speech perception outcomes. Yet how visual cues influence degraded speech perception, and how gaze patterns modulate the neural processing of speech in naturalistic settings, remain poorly understood. Cortical tracking, which quantifies how closely neural activity aligns with the temporal dynamics of speech (e.g., the speech envelope), provides an index of how effectively the brain encodes speech and enables the use of more ecologically valid stimuli. This study aimed to investigate whether audiovisual integration benefits occur when processing noise-vocoded naturalistic speech, and whether specific gaze patterns modulate neural encoding of the speech envelope and any audiovisual benefit.
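To make the cortical tracking idea concrete, the sketch below computes a speech envelope and correlates it with a single EEG channel on synthetic data. This is a minimal illustration only: the study itself used forward encoding models rather than a raw correlation, and the sampling rates, 8 Hz cutoff, and variable names here are assumptions, not parameters from the study.

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt, resample_poly
from scipy.stats import pearsonr

fs_audio, fs_eeg = 16000, 128  # illustrative sampling rates (assumed)

def speech_envelope(waveform, fs_in, fs_out, cutoff=8.0):
    """Amplitude envelope (Hilbert magnitude), low-pass filtered and resampled to the EEG rate."""
    env = np.abs(hilbert(waveform))
    b, a = butter(4, cutoff, btype="low", fs=fs_in)
    env = filtfilt(b, a, env)
    return resample_poly(env, fs_out, fs_in)

def tracking_accuracy(eeg_channel, envelope):
    """Toy tracking index: Pearson correlation between one EEG channel and the envelope."""
    r, _ = pearsonr(eeg_channel, envelope)
    return r

# Synthetic one-minute demo: an EEG channel that weakly follows a toy "speech" signal.
rng = np.random.default_rng(0)
t = np.arange(fs_audio * 60) / fs_audio
speech = rng.standard_normal(t.size) * (1 + np.sin(2 * np.pi * 3 * t))  # ~3 Hz modulation
env = speech_envelope(speech, fs_audio, fs_eeg)
eeg = 0.3 * env + rng.standard_normal(env.size)
print(f"tracking r = {tracking_accuracy(eeg, env):.3f}")
```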
Methods: Nineteen adult native speakers of British English were presented with naturalistic noise-vocoded stories in three presentation modalities (audiovisual, auditory-only, visual-only) at two degradation levels (4-band and 8-band) while electroencephalography (EEG) and eye-tracking data were recorded simultaneously. Cortical tracking was assessed with forward encoding models of the speech envelope, and audiovisual integration benefits were quantified using the additive model criterion [AV > (A + V)], which tests whether the multisensory response exceeds the sum of the unisensory responses. Two facial regions of interest (eyes and mouth) were defined for the eye-tracking analysis. Gaze data were analysed by k-means clustering of the Eye-Mouth Index, which quantifies the relative fixation preference for the mouth versus the eyes, to identify distinct gaze patterns. Pearson’s correlations were used to examine the relationship between gaze behaviour and cortical tracking accuracy.
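The gaze-clustering step could be sketched as follows, assuming the Eye-Mouth Index (EMI) is a normalised dwell-time contrast between the mouth and eye regions; the exact definition, the dwell-time values, and the tracking accuracies below are illustrative placeholders, not data or formulas from the study.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.cluster import KMeans

def eye_mouth_index(mouth_dwell, eye_dwell):
    """Relative fixation preference for the mouth over the eyes, in [-1, 1] (assumed definition)."""
    return (mouth_dwell - eye_dwell) / (mouth_dwell + eye_dwell)

# Hypothetical per-participant dwell times (s) and cortical tracking accuracies.
rng = np.random.default_rng(1)
mouth = rng.uniform(5, 60, size=19)
eyes = rng.uniform(5, 60, size=19)
tracking_r = rng.normal(0.05, 0.02, size=19)

emi = eye_mouth_index(mouth, eyes)

# Two-cluster k-means on the EMI separates mouth-dominant from eye-dominant viewers.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(emi.reshape(-1, 1))
mouth_dominant = labels == labels[np.argmax(emi)]  # cluster containing the highest EMI

# Within the mouth-dominant cluster, relate gaze preference to tracking accuracy.
r, p = pearsonr(emi[mouth_dominant], tracking_r[mouth_dominant])
print(f"mouth-dominant cluster: r = {r:.2f}, p = {p:.3f}")
```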
Results: We did not find audiovisual integration benefits in cortical tracking at either degradation level, but we did find a novel relationship between adaptive gaze strategies and neural speech tracking. Specifically, gaze analysis revealed distinct behavioural clusters, with participants categorised as mouth-dominant or eye-dominant viewers based on their fixation preferences. Among mouth-dominant viewers, reduced attention to the eyes relative to the mouth was significantly correlated with more accurate neural tracking of the speech envelope in the more degraded (4-band) condition. These findings provide evidence that adaptive, mouth-focused gaze strategies could functionally benefit neural processing of degraded speech, with important implications for understanding individual differences in multisensory speech perception.