Poster abstracts and full papers: http://media.aau.dk/smc/ml4audio/
SPEECH SOURCE SEPARATION
*Lijiang Guo and Minje Kim. Bitwise Source Separation on Hashed Spectra: An Efficient Posterior Estimation Scheme Using Partial Rank Order Metrics
*Minje Kim and Paris Smaragdis. Bitwise Neural Networks for Efficient Single-Channel Source Separation
*Mohit Dubey, Garrett Kenyon, Nils Carlson and Austin Thresher. Does Phase Matter For Monaural Source Separation?

SPEECH ENHANCEMENT
*Rasool Fakoor, Xiaodong He, Ivan Tashev and Shuayb Zarar. Reinforcement Learning To Adapt Speech Enhancement to Instantaneous Input Signal Quality
*Jong Hwan Ko, Josh Fromm, Matthai Philipose, Ivan Tashev and Shuayb Zarar. Precision Scaling of Neural Networks for Efficient Audio Processing

AUTOMATIC SPEECH RECOGNITION
Marius Paraschiv, Lasse Borgholt, Tycho Tax, Marco Singh and Lars Maaløe. Exploiting Nontrivial Connectivity for Automatic Speech Recognition
*Brian McMahan and Delip Rao. Listening to the World Improves Speech Command Recognition
*Andros Tjandra, Sakriani Sakti and Satoshi Nakamura. End-to-End Speech Recognition with Local Monotonic Attention
Sri Harsha Dumpala, Rupayan Chakraborty and Sunil Kumar Kopparapu. A Novel Approach for Effective Learning in Low Resourced Scenarios

SPEECH SYNTHESIS
*Yuxuan Wang, RJ Skerry-Ryan, Ying Xiao, Daisy Stanton, Joel Shor, Eric Battenberg, Rob Clark and Rif A. Saurous. Uncovering Latent Style Factors for Expressive Speech Synthesis
*Younggun Lee, Azam Rabiee and Soo-Young Lee. Emotional End-to-End Neural Speech Synthesizer
Author Information
Shuayb Zarar (Microsoft AI and Research)
Rasool Fakoor (University of Texas at Arlington)
Sri Harsha Dumpala (TCS Research and Innovation)
Minje Kim (Indiana University)
Paris Smaragdis (University of Illinois Urbana-Champaign)
Mohit Dubey (Oberlin College)
Jong Hwan Ko (Georgia Institute of Technology)
Sakriani Sakti (Nara Institute of Science and Technology)
Sakriani Sakti received the DAAD-Siemens Program Asia 21st Century Award to study Communication Technology at the University of Ulm, Germany, and received her MSc degree in 2002. During her thesis work, she worked with the Speech Understanding Department, DaimlerChrysler Research Center, Ulm, Germany. From 2003 to 2009, she worked as a researcher at ATR SLC Labs, Japan, and from 2006 to 2011, she worked as an expert researcher at NICT SLC Groups, Japan. While working with ATR-NICT, Japan, she continued her studies (2005-2008) with the Dialog Systems Group, University of Ulm, Germany, and received her Ph.D. degree in 2008. She has been actively involved in collaboration activities such as the Asian Pacific Telecommunity Project (2003-2007), A-STAR, and U-STAR (2006-2011). From 2009 to 2011, she served as a visiting professor in the Computer Science Department, University of Indonesia (UI), Indonesia. From 2011 to 2017, she was an assistant professor at the Augmented Human Communication Laboratory, NAIST, Japan. She also served as a visiting scientific researcher at INRIA Paris-Rocquencourt, France, in 2015-2016, under the JSPS Strategic Young Researcher Overseas Visits Program for Accelerating Brain Circulation. Currently, she is a research associate professor at NAIST, as well as a research scientist at the RIKEN Center for Advanced Intelligence Project (AIP), Japan. She is a member of JNS, SFN, ASJ, ISCA, IEICE, and IEEE. She is also an officer of the ELRA/ISCA Special Interest Group on Under-resourced Languages (SIGUL) and a Board Member of Spoken Language Technologies for Under-Resourced Languages (SLTU). Her research interests include statistical pattern recognition, graphical modeling frameworks, deep learning, multilingual speech recognition and synthesis, spoken language translation, affective dialog systems, and cognitive communication.
Yuxuan Wang (Google)
Lijiang Guo (Indiana University)
Garrett T Kenyon (Los Alamos National Laboratory)
Andros Tjandra (Nara Institute of Science and Technology)
Tycho Tax (Corti)
Younggun Lee (Korea Advanced Institute of Science and Technology)
More from the Same Authors
- 2021 Poster: Continuous Doubly Constrained Batch Reinforcement Learning
  Rasool Fakoor · Jonas Mueller · Kavosh Asadi · Pratik Chaudhari · Alexander Smola
- 2021 Poster: Neural Dubber: Dubbing for Videos According to Scripts
  Chenxu Hu · Qiao Tian · Tingle Li · Wang Yuping · Yuxuan Wang · Hang Zhao
- 2018: Poster Sessions and Lunch (Provided)
  Akira Utsumi · Alane Suhr · Ji Zhang · Ramon Sanabria · Kushal Kafle · Nicholas Chen · Seung Wook Kim · Aishwarya Agrawal · SRI HARSHA DUMPALA · Shikhar Murty · Pablo Azagra · Jean ROUAT · Alaaeldin Ali · SUBBAREDDY OOTA · Angela Lin · Shruti Palaskar · Farley Lai · Amir Aly · Tingke Shen · Dianqi Li · Jianguo Zhang · Rita Kuznetsova · Jinwon An · Jean-Benoit Delbrouck · Tomasz Kornuta · Syed Ashar Javed · Christopher Davis · John Co-Reyes · Vasu Sharma · Sungwon Lyu · Ning Xie · Ankita Kalra · Huan Ling · Oleksandr Maksymets · Bhavana Mahendra Jain · Shun-Po Chuang · Sanyam Agarwal · Jerome Abdelnour · Yufei Feng · vincent albouy · Siddharth Karamcheti · Derek Doran · Roberta Raileanu · Jonathan Heek
- 2017: Poster Session Music and environmental sounds
  Oriol Nieto · Jordi Pons · Bhiksha Raj · Tycho Tax · Benjamin Elizalde · Juhan Nam · Anurag Kumar
- 2017: Compact Recurrent Neural Network based on Tensor Train for Polyphonic Music Modeling
  Sakriani Sakti
- 2017: Adaptive Front-ends for End-to-end Source Separation
  Shrikant Venkataramani · Paris Smaragdis
- 2014 Poster: Spectral Learning of Mixture of Hidden Markov Models
  Cem Subakan · Johannes Traa · Paris Smaragdis
- 2009 Poster: A Sparse Non-Parametric Approach for Single Channel Separation of Known Sounds
  Paris Smaragdis · Madhusudana Shashanka · Bhiksha Raj
- 2007 Poster: Sparse Overcomplete Latent Variable Decomposition of Counts Data
  Madhusudana Shashanka · Bhiksha Raj · Paris Smaragdis