

Oral in Affinity Workshop: Global South in AI

BELA: Bot for English Language Acquisition

Muskan Mahajan

Keywords: [ Indic-languages ] [ English language acquisition ] [ Natural Language Processing ] [ Chatbots ]


Abstract: Our paper introduces ‘BELA’, Bot for English Language Acquisition, an application of conversational agents (chatbots) for Hindi-speaking youth. BELA is developed for young underprivileged students at an Indian non-profit called Udayan Care. Hinglish, a way of writing Hindi words using English letters, is common among 350 million speakers in India (https://www.milestoneloc.com/guide-to-hinglish-language/). BELA’s natural language understanding pipeline supports Hindi and Hinglish utterances by using a language identifier, an Indic-language transliterator, and a translator (https://pypi.org/project/google-transliteration-api/, https://huggingface.co/salesken/translation-hi-en). BELA has two modes: a retrieval-based ‘tutor’ mode to facilitate question-answering on classic English tasks like word meanings, translations, and reading comprehension, and a generative ‘buddy’ mode to facilitate open-domain chit-chat on general topics like movies, food, and school. Our dialogue management system is designed to route user utterances between the two modes using a binary classifier. Three tenets have governed the design of BELA the Bot: support for Hindi utterances, reliability of answers to learners’ queries, and graceful failure.
We ensure that responses from BELA are accurate and reliable by using tested translation and thesaurus APIs (https://developer.oxforddictionaries.com/). The challenges in developing BELA included a lack of data for intent classification and dialogue management (DM), and a lack of a database of reading passages and English videos graded by learner proficiency level (CEFR); we solved these by creating a custom dataset with text-augmentation techniques and building a CEFR level predictor for English passages scraped from the Web. Our future work will focus on extending BELA’s support to more English learning tasks and on using the mentees' Hinglish messages to adapt the transliterator pipeline to the mentees' regional variations of Hinglish.
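
The flow described in the abstract (language identification, transliteration, translation, then routing between the ‘tutor’ and ‘buddy’ modes) can be illustrated with a minimal Python sketch. This is not BELA's actual code: the `Pipeline` class, its field names, the Devanagari script check, and the fallback message are all hypothetical, and every component is passed in as a plain callable so that no particular transliteration, translation, or classifier API is assumed.

```python
from dataclasses import dataclass
from typing import Callable

# Unicode range of the Devanagari block, used here as a simple script check.
DEVANAGARI_START, DEVANAGARI_END = 0x0900, 0x097F


def contains_devanagari(text: str) -> bool:
    """Return True if any character falls in the Devanagari block."""
    return any(DEVANAGARI_START <= ord(ch) <= DEVANAGARI_END for ch in text)


@dataclass
class Pipeline:
    # All components are hypothetical stand-ins supplied by the caller.
    is_hinglish: Callable[[str], bool]     # language identifier
    transliterate: Callable[[str], str]    # Hinglish -> Devanagari Hindi
    translate: Callable[[str], str]        # Hindi -> English
    is_tutor_query: Callable[[str], bool]  # binary mode classifier
    tutor: Callable[[str], str]            # retrieval-based mode
    buddy: Callable[[str], str]            # generative mode
    fallback: str = "Sorry, I did not understand that. Could you rephrase?"

    def to_english(self, utterance: str) -> str:
        """Normalise a Hindi, Hinglish, or English utterance to English."""
        if contains_devanagari(utterance):
            return self.translate(utterance)
        if self.is_hinglish(utterance):
            return self.translate(self.transliterate(utterance))
        return utterance

    def respond(self, utterance: str) -> str:
        """Route the normalised utterance to one mode; fall back on any error."""
        try:
            english = self.to_english(utterance)
            handler = self.tutor if self.is_tutor_query(english) else self.buddy
            return handler(english)
        except Exception:
            # One possible reading of "graceful failure": never surface an error.
            return self.fallback
```

Passing the components in as callables keeps the routing logic testable with toy stand-ins, and lets a real language identifier, transliterator, translator, or classifier be swapped in without changing the control flow.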
