SPIN2026: No bad apple!

P10, Session 2 (Tuesday 13 January 2026, 14:10-16:40)
A binaural model predicting psychometric functions for speech intelligibility in non-stationary noise and listeners with and without hearing loss

Anaïs Hiard
ENTPE, Ecole Centrale de Lyon, CNRS, LTDS, UMR5513, Vaulx-en-Velin, France

Thibault Vicente
Laboratoire d'Acoustique de l'Université du Mans (LAUM), UMR 6613, Institut d'Acoustique - Graduate School (IA-GS), CNRS, Le Mans Université, France

Virginia Best
Department of Speech, Language, and Hearing Sciences, Boston University, Boston, Massachusetts 02215, USA

Jörg Buchholz
Department of Linguistics-Audiology, Australian Hearing Hub, Macquarie University, New South Wales, 2109, Australia

Mathieu Lavandier
ENTPE, Ecole Centrale de Lyon, CNRS, LTDS, UMR5513, Vaulx-en-Velin, France

The proposed binaural speech intelligibility prediction model aims to account for some of the auditory mechanisms that influence speech intelligibility in noise (spatial release from masking, dip listening, hearing loss). The model considers better-ear listening (BE) and binaural unmasking (BU). It predicts the psychometric function averaged across normal-hearing (NH) listeners and the individual psychometric functions of hearing-impaired (HI) listeners, using an internal noise (IN) implementation that depends on the individual audiogram of the listener and the overall level of the external stimuli. The model builds on previous models that predicted only differences in speech reception thresholds (SRT), and it shares a common structure with them. The speech and masker signals at the listener’s ears are filtered with a gammatone filterbank. In each frequency band, the advantages of binaural hearing (BE and BU) are estimated to compute an internal signal-to-noise ratio (SNR). To predict complete psychometric functions, a reference psychometric function is needed as input; its characteristics depend on the target speech material considered and are derived from experimental data measured in a given condition. This function is used to convert the internal SNR to percent correct. To account for dip listening, this computation is performed in time frames and averaged.
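The conversion step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a logistic shape for the reference psychometric function, hypothetical midpoint and slope values, and that the per-frame internal SNRs are averaged before conversion (the abstract does not specify the order of averaging). The names `reference_psychometric` and `predict_percent_correct` are illustrative only.

```python
import math


def reference_psychometric(snr_db, midpoint_db=-6.0, slope_per_db=0.15):
    """Map an internal SNR (dB) to percent correct with a logistic function.

    midpoint_db is the SNR giving 50% correct and slope_per_db is the slope
    at that point (proportion correct per dB). Both values are hypothetical;
    in practice they would be fitted to data for the speech material used.
    """
    return 100.0 / (1.0 + math.exp(-4.0 * slope_per_db * (snr_db - midpoint_db)))


def predict_percent_correct(frame_snrs_db, **reference_params):
    """Average the internal SNR across time frames, then convert it.

    Assumption: averaging is applied to the SNR in dB before conversion.
    """
    mean_snr_db = sum(frame_snrs_db) / len(frame_snrs_db)
    return reference_psychometric(mean_snr_db, **reference_params)


if __name__ == "__main__":
    # Sweep an overall SNR offset to trace a predicted psychometric function.
    frame_pattern_db = [-3.0, 2.0, -1.0, 4.0]  # made-up per-frame fluctuations
    for offset_db in range(-20, 11, 5):
        frames = [offset_db + f for f in frame_pattern_db]
        print(offset_db, round(predict_percent_correct(frames), 1))
```

Sweeping the overall SNR, as in the usage lines, yields a full predicted psychometric function and not just a single threshold, which is the distinction the abstract draws with SRT-only models.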

To validate the model, four datasets were used, each involving a different type of masker (stationary noise, modulated noise, 2 competing voices, 4 competing voices), and four versions of the IN implementation were compared. The results showed that the model captures the difference in slopes highlighted in the literature: the psychometric functions are steeper in stationary noise than in modulated noise. The comparison of the different IN approaches indicates that a ceiling value must be set on the IN for the predictions to fit the data; otherwise, the performance of the severely impaired HI listeners is under-predicted. Compared to the models predicting only differences in SRT, the proposed model can capture the dependence of the perceptual effects (spatial release from masking, dip listening, hearing loss) on the SNR or intelligibility level considered.
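The effect of a ceiling on the internal noise can be illustrated with a small sketch. The abstract does not give the IN formula, so everything here is an assumption made for illustration: the IN level is taken as the larger of the listener's threshold and a level-dependent floor, it is optionally clipped at a ceiling, and it is power-summed with the external masker in one frequency band. Units are simplified to a single dB scale, and the names `internal_noise_db` and `band_snr_db` are hypothetical.

```python
import math


def internal_noise_db(threshold_db, stimulus_level_db,
                      floor_offset_db=-20.0, ceiling_db=None):
    """Hypothetical internal-noise level for one frequency band.

    Assumption: the IN follows the audiogram threshold, with a floor tied to
    the overall stimulus level, and is optionally limited by a ceiling.
    """
    level = max(threshold_db, stimulus_level_db + floor_offset_db)
    if ceiling_db is not None:
        level = min(level, ceiling_db)
    return level


def band_snr_db(speech_db, masker_db, in_db):
    """SNR in one band after power-summing external masker and internal noise."""
    total_noise_db = 10.0 * math.log10(10 ** (masker_db / 10.0) + 10 ** (in_db / 10.0))
    return speech_db - total_noise_db


if __name__ == "__main__":
    # Made-up values for a listener with a severe loss in this band.
    speech_db, masker_db, threshold_db = 65.0, 60.0, 90.0
    no_ceiling = internal_noise_db(threshold_db, speech_db)
    with_ceiling = internal_noise_db(threshold_db, speech_db, ceiling_db=70.0)
    print(round(band_snr_db(speech_db, masker_db, no_ceiling), 1))
    print(round(band_snr_db(speech_db, masker_db, with_ceiling), 1))
```

With these made-up values, the uncapped IN dominates the total noise and drives the band SNR far lower than the capped version, which is one way a missing ceiling could lead to under-predicted performance for listeners with severe losses.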

Last modified 2025-11-21 16:50:42