Keywords: [ Responsible AI ] [ Natural Language Processing ] [ Indic Languages ]
The assumption that a teacher or homemaker is female while a professor or doctor is male has been prevalent for ages. It stems from gender roles that have been defined informally, without regard for the fact that they have no concrete basis. The same associations appear in the word embeddings used by NLP models. Natural Language Processing (NLP) is a subfield of AI that allows systems to process human language, spoken or written, and interpret it the way human beings do. Systematic research on overcoming this type of social (gender) bias in NLP is ongoing, but there are not enough systems that can identify such prejudice and give a fair result. This research paper focuses on the stereotypes present in the Marathi language and on properly preparing the data set for training through the use of paired pronouns in Marathi (तो/ती/ते) and gender-neutral terms (सहभागी). Training systems not to infer gender roles unless they are explicitly stated is a major way to eliminate gender bias from written texts. A truly unbiased dataset would require representation of individuals from different demographic groups, across which the way a language is spoken or interpreted varies slightly.
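To make the paired-pronoun idea concrete, the following is a minimal sketch in Python, not the paper's actual pipeline. The function name `neutralize_pronouns`, the constants, and the whitespace tokenization are all assumptions made for illustration. It replaces only the standalone pronouns तो and ती with the paired form तो/ती/ते and does not adjust verb or adjective agreement, which a real Marathi preprocessing step would also have to handle.

```python
# Illustrative sketch only: names and tokenization are hypothetical,
# not taken from the paper.

# Paired form mentioned in the abstract.
PAIRED_PRONOUN = "तो/ती/ते"

# Gendered third-person singular pronouns to replace.
# ते is left untouched here because it can also be a plural form.
GENDERED_PRONOUNS = {"तो", "ती"}


def neutralize_pronouns(sentence: str) -> str:
    """Replace standalone gendered pronouns with the paired form.

    Tokens are split on whitespace, so a pronoun with attached
    punctuation (for example "तो,") is not matched. Gender agreement
    on verbs and adjectives is not modified.
    """
    tokens = sentence.split()
    rewritten = [
        PAIRED_PRONOUN if token in GENDERED_PRONOUNS else token
        for token in tokens
    ]
    return " ".join(rewritten)


if __name__ == "__main__":
    # "तो डॉक्टर आहे" means "He is a doctor".
    print(neutralize_pronouns("तो डॉक्टर आहे"))
```

Exact token matching keeps the sketch from altering words that merely contain a pronoun as a substring. The trade-off is that inflected or punctuated forms are missed.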