SautiDB-Naija: A Nigerian L2 English Speech Dataset
Tejumade Afonja · Ademola Malomo · Lawrence Francis · Goodness C Duru · Kenechi Dukor · Oluwafemi Azeez · Oladimeji Mudele · Olumide Okubadejo

In this paper, we introduce SautiDB-Naija, a speech corpus of non-native (L2) speakers of English intended for research in accent translation, voice conversion, pronunciation classification, and accent classification. This initial release of the corpus includes over 900 recordings from non-native speakers of English whose first language (L1) is among the most widely spoken in Nigeria, namely Yoruba, Igbo, Edo, Efik-Ibibio, and Igala. To the best of our knowledge, this is the first documented effort to curate a corpus of Nigerian accents for machine learning research. By training a discriminative classifier on the corpus, we show that neural networks can learn linguistic features that distinguish between accent classes. This result illustrates the potential of SautiDB-Naija as a resource for future computational linguistics research.

Author Information

Tejumade Afonja (Saarland University)

Tejumade Afonja is a graduate student at Saarland University studying Computer Science. Previously, she worked as an AI Software Engineer at InstaDeep Nigeria. She holds a B.Tech in Mechanical Engineering from Ladoke Akintola University of Technology (2015). She is currently a remote research intern at the Vector Institute, where she conducts research in privacy, security, and machine learning. Tejumade is the co-founder of AI Saturdays Lagos, an AI community in Lagos, Nigeria, focused on conducting research and teaching machine-learning-related subjects to Nigerian youths. She is one of the 2020 Google EMEA Women Techmakers Scholars. Tejumade was a co-organizer of the ML4D 2019 NeurIPS workshop and is serving as the lead organizer this year. She is affiliated with several other workshops and communities, such as BIA, WIML, ICLR, Deep Learning Indaba, AI4D, and DSA, where she occasionally serves as a volunteer or mentor.

Ademola Malomo (AI Saturdays Lagos)
Lawrence Francis (InstaDeep)
Goodness C Duru (RETINA-AI Health, Inc.)
Kenechi Dukor (AI Saturdays Lagos)

I am an aspiring Machine Learning researcher who has done extensive self-study and community work in Machine Learning. I hold a B.Sc. in Mechanical Engineering and hope to resume graduate school by fall 2021.

Oluwafemi Azeez (Borealis AI)
Oladimeji Mudele (University of Pavia, Italy)

Mudele Oladimeji Ezekiel was born in Ondo, Nigeria, in 1989. He completed a Bachelor of Engineering degree in Electrical and Electronic Engineering at the Federal University of Technology, Akure, Nigeria, in 2013. In 2017, he obtained a master's degree in Electronic Engineering "cum laude" from the University of Pavia, Italy. He is currently a Ph.D. student in the Telecommunications and Remote Sensing Laboratory at the same university.

Olumide Okubadejo (AI Saturdays Lagos)

