Deep Learning for Speech Recognition and Related Applications
Li Deng · Dong Yu · Geoffrey E Hinton

Sat Dec 12 07:30 AM -- 06:30 PM (PST) @ Hilton: Cheakamus
Event URL: http://research.microsoft.com/en-us/um/people/dongyu/NIPS2009/

Over the past 25 years or so, speech recognition technology has been dominated by a “shallow” architecture: hidden Markov models (HMMs). Significant technological success has been achieved using complex and carefully engineered variants of HMMs. The next generation of the technology requires solutions to the remaining technical challenges under diversified deployment environments. These challenges, not adequately addressed in the past, arise from the many types of variability present in the speech generation process. Overcoming these challenges is likely to require “deep” architectures with efficient learning algorithms.

For speech recognition and related sequential pattern recognition applications, some attempts have been made in the past to develop computational architectures that are “deeper” than conventional HMMs, such as hierarchical HMMs, hierarchical point-process models, hidden dynamic models, and multi-level detection-based architectures. While positive recognition results have been reported, there has been a conspicuous lack of systematic learning techniques and theoretical guidance to facilitate the development of these deep architectures. Further, there has been virtually no effective communication between machine learning researchers and speech recognition researchers, even though both groups advocate the use of deep architectures and learning. One goal of the proposed workshop is to bring together these two groups of researchers to review the progress in both fields and to identify promising and synergistic research directions for potential future cross-fertilization and collaboration.

Author Information

Li Deng (Citadel)
Dong Yu (Microsoft Research)
Geoffrey E Hinton (Google & University of Toronto)

Geoffrey Hinton received his PhD in Artificial Intelligence from Edinburgh in 1978 and spent five years as a faculty member at Carnegie-Mellon where he pioneered back-propagation, Boltzmann machines and distributed representations of words. In 1987 he became a fellow of the Canadian Institute for Advanced Research and moved to the University of Toronto. In 1998 he founded the Gatsby Computational Neuroscience Unit at University College London, returning to the University of Toronto in 2001. His group at the University of Toronto then used deep learning to change the way speech recognition and object recognition are done. He currently splits his time between the University of Toronto and Google. In 2010 he received the NSERC Herzberg Gold Medal, Canada's top award in Science and Engineering.

