P58, Session 2 (Tuesday 13 January 2026, 14:10-16:40)
An experimental paradigm for testing context-aware closed-loop speech enhancement systems
Context-aware hearing instruments leverage user-related signals, such as EEG, gaze, or first-person video, to infer which speech sources the listener wants to hear. This information can be used to steer speech separation algorithms, selectively enhancing target speech streams while suppressing background noise. Yet experimental paradigms for evaluating such closed-loop systems are still lacking. Here, we present a behavioral audio-visual speech test paradigm designed to quantify the speech-intelligibility improvements achieved by closed-loop systems in competing-conversation scenarios. The speech material comprises spoken numbers concatenated into streams with naturalistic speech statistics, each paired with an AI-generated talking face. Listeners are presented with four speech streams grouped into two simultaneous dyadic conversations. Within each conversation, talkers alternate between number ‘sentences’ and backchannel yes/no responses, all presented in background noise. Participants are instructed to attend to specific conversations and to detect repeated numbers and backchannel keywords. The paradigm thus enables evaluation of closed-loop systems that selectively enhance the speech from attended conversational groups rather than from isolated speakers. We use the paradigm to demonstrate a gaze-steered enhancement system that integrates gaze direction across time to decode, and selectively enhance, the currently attended conversation, and we show that it is sensitive to differences in keyword detection performance between conversation-based enhancement, speaker-based enhancement, and no enhancement.
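The abstract does not specify how gaze evidence is integrated over time, so the following Python sketch shows only one plausible realization: a leaky integrator that accumulates evidence for each dyadic conversation from instantaneous gaze direction, then applies a gain to the talkers of the decoded conversation. All names and parameter values here (talker azimuths, time constant, spatial scale, gain rule) are illustrative assumptions, not details from the study.

    import numpy as np

    # Assumed setup: four talkers at fixed azimuths, grouped into two dyads.
    TALKER_AZIMUTHS = np.array([-40.0, -15.0, 15.0, 40.0])  # degrees (assumed)
    CONVERSATION_OF_TALKER = np.array([0, 0, 1, 1])         # dyad membership

    def decode_attended_conversation(gaze_azimuths, tau=2.0, fs=60.0):
        """Integrate gaze direction over time; return, per sample, the index
        of the conversation the listener is decoded as attending.

        gaze_azimuths : 1-D array of gaze angles in degrees, sampled at fs Hz.
        tau           : integration time constant in seconds (assumed value).
        """
        alpha = 1.0 - np.exp(-1.0 / (tau * fs))  # leaky-integrator coefficient
        evidence = np.zeros(2)                   # accumulated evidence per dyad
        decoded = np.empty(len(gaze_azimuths), dtype=int)
        for i, gaze in enumerate(gaze_azimuths):
            # Soft spatial assignment: talkers closer to the current gaze
            # direction receive more evidence (10-degree scale, assumed).
            dist = np.abs(TALKER_AZIMUTHS - gaze)
            weights = np.exp(-dist / 10.0)
            per_conv = np.array([weights[CONVERSATION_OF_TALKER == c].sum()
                                 for c in (0, 1)])
            evidence = (1 - alpha) * evidence + alpha * per_conv
            decoded[i] = int(np.argmax(evidence))
        return decoded

    def enhancement_gains(decoded, gain_db=8.0):
        """Per-sample linear gains for the four talkers: boost the attended
        dyad, leave the unattended dyad at unity gain (assumed gain rule)."""
        attended = CONVERSATION_OF_TALKER[None, :] == decoded[:, None]
        return np.where(attended, 10 ** (gain_db / 20.0), 1.0)

One reason to integrate rather than react to raw gaze is robustness to brief saccades: under this scheme, a short glance at the competing conversation does not immediately switch the enhancement, while a sustained gaze shift does.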