P46
Session 2 (Tuesday 13 January 2026, 14:10-16:40)
A blind real-time capable binaural model for estimating subjectively perceived listening effort of normal-hearing and hearing-impaired listeners
Listening effort (LE) has become an established measure for assessing communication situations. It can provide additional information about speech perception in situations where speech intelligibility (SI) may be at ceiling. LE can be assessed using objective measurements as well as subjective ratings. Both LE and SI can be measured in hearing experiments and predicted by models. Most models run offline on entire signals and may require clean source signals. However, some use cases, such as real-time applications, require very frequent predictions. For example, a real-time model could be used in a hearing aid to automatically select, at runtime, the algorithm that minimizes induced LE. Such applications usually have no source signals at hand and must therefore work "blindly" on the ear signals only.
We propose a real-time, blind LE prediction model that uses block processing and accounts for binaural capabilities and hearing loss. The model consists of two stages: a binaural front-end and a monaural back-end. The front-end takes the ear signals as inputs and simulates hearing loss by adding threshold-simulating noise. It then simulates spatial release from masking by simultaneously modeling binaural unmasking, using an equalization-cancellation approach, and better-ear listening. This produces a binaurally enhanced single-channel signal, which is routed into the monaural back-end. The back-end uses a triphone classifier to predict the subjectively perceived LE. The classifier outputs successive posterior probabilities for the set of known triphones. LE is then predicted by calculating a similarity measure across consecutive posterior probabilities. Lower similarity indicates lower predicted LE, whereas greater similarity (potentially introduced by temporal smearing, e.g., through reverberation) indicates higher predicted LE.
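The back-end idea of scoring similarity across consecutive posterior vectors can be illustrated with a minimal sketch. The abstract does not name the similarity measure, so the cosine similarity used here, the averaging over frame pairs, the function names, and the toy three-class posteriors are all assumptions made for illustration, not the model's actual implementation.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two posterior vectors of equal length."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


def mean_consecutive_similarity(posteriors):
    """Average similarity between each frame's posteriors and the next frame's.

    Assumption: cosine similarity stands in for the unspecified measure.
    Higher values mean the classifier output changes little over time.
    """
    if len(posteriors) < 2:
        raise ValueError("need at least two posterior frames")
    sims = [cosine_similarity(p, q) for p, q in zip(posteriors, posteriors[1:])]
    return sum(sims) / len(sims)


# Toy posteriors over three hypothetical triphone classes.
# "crisp": the most likely class changes sharply from frame to frame.
crisp = [[0.90, 0.05, 0.05], [0.05, 0.90, 0.05], [0.05, 0.05, 0.90]]
# "smeared": flat, slowly varying posteriors, as temporal smearing might cause.
smeared = [[0.40, 0.30, 0.30], [0.35, 0.35, 0.30], [0.30, 0.35, 0.35]]

crisp_score = mean_consecutive_similarity(crisp)
smeared_score = mean_consecutive_similarity(smeared)
print(f"crisp: {crisp_score:.3f}, smeared: {smeared_score:.3f}")
```

In this toy case the smeared sequence scores higher than the crisp one, which matches the direction described above: greater similarity maps to higher predicted LE. In a block-processing setting, such a score could be computed per block of frames, though the block length and the mapping from score to an LE scale are not given in the abstract.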
The model was evaluated against data on subjectively perceived LE, measured with normal-hearing listeners using a novel real-time assessment method. It accurately predicted continuous, subjectively perceived LE (R² = 0.86) under conditions with varying signal-to-noise ratios (SNRs) and reverberation. Further evaluation of the model was conducted using stationary and modulated noise.