In this paper, we introduce SautiDB-Naija, a speech corpus of non-native speakers of English intended for research in accent translation, voice conversion, pronunciation classification, and accent classification. This initial release of our corpus includes over 900 recordings of non-native speakers of English whose first language (L1) is among the most common in Nigeria, namely Yoruba, Igbo, Edo, Efik-Ibibio, and Igala. To the best of our knowledge, this is the first documented effort to curate a corpus of Nigerian accents for machine learning research. By training a discriminative classifier on our corpus, we demonstrate that neural networks can learn linguistic features that distinguish between accent classes. This result indicates the potential of SautiDB-Naija as a valuable resource for future computational linguistics research.
Author Information
Tejumade Afonja (Saarland University)
Tejumade Afonja is a graduate student at Saarland University studying Computer Science. Previously, she worked as an AI Software Engineer at InstaDeep Nigeria. She holds a B.Tech in Mechanical Engineering from Ladoke Akintola University of Technology (2015). She is currently a remote research intern at Vector Institute, where she conducts research in the areas of privacy, security, and machine learning. Tejumade is the co-founder of AI Saturdays Lagos, an AI community in Lagos, Nigeria focused on conducting research and teaching machine-learning-related subjects to Nigerian youths. Tejumade is one of the 2020 Google EMEA Women Techmakers Scholars. She was a co-organizer of the ML4D 2019 NeurIPS workshop and is serving as the lead organizer this year. She is affiliated with several other workshops and organizations, such as BIA, WIML, ICLR, Deep Learning Indaba, AI4D, and DSA, where she occasionally serves as a volunteer or mentor.
Ademola Malomo (AI Saturdays Lagos)
Lawrence Francis (InstaDeep)
Goodness C Duru (RETINA-AI Health, Inc.)
Kenechi Dukor (AI Saturdays Lagos)
I am an aspiring machine learning researcher who has done extensive self-study and community work in machine learning. I hold a B.Sc. in Mechanical Engineering and hope to resume graduate school by fall 2021.
Oluwafemi Azeez (Borealis AI)
Oladimeji Mudele (University of Pavia, Italy)
Olumide Okubadejo (AI Saturdays Lagos)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 : SautiDB-Naija: A Nigerian L2 English Speech Dataset
More from the Same Authors
- 2022 Workshop: Broadening Research Collaborations
  Sara Hooker · Rosanne Liu · Pablo Samuel Castro · FatemehSadat Mireshghallah · Sunipa Dev · Benjamin Rosman · João Madeira Araújo · Savannah Thais · Sunny Sanyal · Tejumade Afonja · Swapneel Mehta · Tyler Zhu
- 2021 : Invite Talk Q&A
  Milind Tambe · Tejumade Afonja · Paula Rodriguez Diaz
- 2021 Workshop: Machine Learning for the Developing World (ML4D): Global Challenges
  Paula Rodriguez Diaz · Konstantin Klemmer · Sally Simone Fobi · Oluwafemi Azeez · Niveditha Kalavakonda · Aya Salama · Tejumade Afonja
- 2021 : Opening Remarks
  Tejumade Afonja · Paula Rodriguez Diaz
- 2020 : Introduction of Invited Talk 1
  Tejumade Afonja
- 2020 : Introduction and Agenda Overview
  Tejumade Afonja
- 2020 Workshop: Machine Learning for the Developing World (ML4D): Improving Resilience
  Tejumade Afonja · Konstantin Klemmer · Niveditha Kalavakonda · Oluwafemi Azeez · Aya Salama · Paula Rodriguez Diaz
- 2019 Workshop: Machine Learning for the Developing World (ML4D): Challenges and Risks
  Maria De-Arteaga · Amanda Coston · Tejumade Afonja