Tara Sainath: Multichannel Signal Processing with Deep Neural Networks for Automatic Speech Recognition
2016 Talk
in
Workshop: End-to-end Learning for Speech and Audio Processing
Abstract
Automatic speech recognition (ASR) systems commonly separate speech enhancement, including localization, beamforming, and postfiltering, from acoustic modeling. In this talk, we perform multichannel enhancement jointly with acoustic modeling in a deep neural network framework. Overall, we find that such multichannel neural networks give a relative word error rate improvement of more than 5% compared to a traditional beamforming-based multichannel ASR system and more than 10% compared to a single-channel model.