The End-to-End (E2E) approach to Automatic Speech Recognition (ASR) is an active research area. It is attractive for less-resourced languages (LRLs) because it does not require a pronunciation dictionary. However, E2E systems are data-hungry, which makes the application of E2E to LRLs questionable. To address this data scarcity, data from other languages can be used in a multilingual (ML) setup. We have conducted ML E2E ASR experiments for four less-resourced Ethiopian languages using different language and acoustic modeling units. The results of our experiments show that relative Word Error Rate (WER) reductions (over the monolingual E2E systems) of up to 29.83% can be achieved by using data from just two related languages in E2E ASR system training. Moreover, we observed that using data even from less related languages also improves E2E ASR performance over using monolingual data alone.
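The "relative WER reduction" reported above is the standard ratio of the absolute WER improvement to the baseline (monolingual) WER. A minimal sketch of the computation, using hypothetical WER values that are not taken from the paper:

```python
def relative_wer_reduction(wer_baseline: float, wer_new: float) -> float:
    """Relative WER reduction (in %) of a new system over a baseline system.

    Both arguments are WERs expressed in percent (or both as fractions;
    only the ratio matters).
    """
    return (wer_baseline - wer_new) / wer_baseline * 100.0

# Hypothetical example (illustrative numbers only): a monolingual E2E
# baseline at 40.0% WER and a multilingual system at 28.0% WER give a
# relative reduction of 30.0%.
print(round(relative_wer_reduction(40.0, 28.0), 2))
```

Note that a relative reduction can look large even when the absolute WER improvement is modest, so both figures are usually worth reporting.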
Martha Yifiru Tachbelie (Addis Ababa University)