
Workshop: Machine Learning for Audio

Improved sound quality human-inspired DNN-based audio applications

Chuan Wen · Sarah Verhulst


The human auditory system evolved into a structure that provides sharp frequency tuning while transforming sound into a neural code that is optimized for speech understanding in challenging acoustic environments. Employing hallmark features of human hearing in audio applications might thus advance these systems beyond what is currently possible with purely data-driven approaches. A key requirement for such bio-inspired audio applications is a fully differentiable closed-loop system that includes a biophysically realistic model of (hearing-impaired) auditory processing. However, existing state-of-the-art models introduce tonal artifacts in their processing, which become audible and degrade the sound quality of the resulting audio application. We propose a solution that improves the architecture of the CNN-based auditory processing block to avoid creating spurious distortions, and we optimize its computations so that the audio applications can run in real time (latency < 10 ms). We provide a proof-of-principle example for the case of closed-loop, CNN-based hearing-aid algorithms, and conclude that CNN-based auditory models embedded in closed-loop training systems hold great promise for the next generation of bio-inspired audio applications.
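To make the real-time constraint concrete, the following minimal sketch checks whether a frame-based processing scheme stays under the 10 ms latency bound stated above. The function names, sample rate, frame lengths, and compute time are illustrative assumptions and are not taken from the abstract; the sketch only models buffering delay plus a caller-supplied compute time.

```python
# Hypothetical helper names; the 10 ms bound is the only value from the abstract.
LATENCY_BUDGET_MS = 10.0


def frame_latency_ms(frame_len: int, sample_rate_hz: int) -> float:
    """Buffering delay, in milliseconds, of collecting one frame of samples."""
    return 1000.0 * frame_len / sample_rate_hz


def total_latency_ms(frame_len: int, sample_rate_hz: int, compute_ms: float = 0.0) -> float:
    """Buffering delay plus the (assumed, externally measured) per-frame compute time."""
    return frame_latency_ms(frame_len, sample_rate_hz) + compute_ms


def meets_realtime_budget(frame_len: int, sample_rate_hz: int, compute_ms: float = 0.0) -> bool:
    """True if the total latency is strictly below the 10 ms budget."""
    return total_latency_ms(frame_len, sample_rate_hz, compute_ms) < LATENCY_BUDGET_MS


if __name__ == "__main__":
    # Assumed example: 16 kHz audio with 128-sample frames and 1.5 ms of compute.
    print(total_latency_ms(128, 16000, compute_ms=1.5))
    print(meets_realtime_budget(128, 16000, compute_ms=1.5))
```

Under these assumptions, the frame length alone sets a floor on latency, so any per-frame computation has to fit in whatever remains of the 10 ms budget after buffering.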
