
Workshop: AI meets Moral Philosophy and Moral Psychology: An Interdisciplinary Dialogue about Computational Ethics

#35: Cross-cultural differences in evaluating offensive language and the role of moral foundations

Aida Mostafazadeh Davani · Mark Díaz · Vinodkumar Prabhakaran

Keywords: [ safety ] [ artificial intelligence ] [ language models ] [ Moral Foundations ] [ Annotation ]

Fri 15 Dec 12:50 p.m. PST — 1:50 p.m. PST


Detecting offensive content in text is an increasingly central challenge for both social-media platforms and AI-driven technologies. However, offensiveness remains a subjective phenomenon: perspectives differ across sociodemographic characteristics, cultural norms, and moral values. This complexity is largely ignored by current AI-focused approaches to detecting offensiveness and related concepts such as hate speech and toxicity. We frame the task of determining offensiveness as essentially a matter of moral judgment, that is, deciding the boundary between ethically wrong and right language to be used or generated within an implied set of sociocultural norms. In this paper, we investigate how judgments of offensiveness vary across diverse global cultural regions, and the crucial role of moral values in shaping these variations. Our findings highlight substantial cross-cultural differences in perceiving offensiveness, with moral concerns about Caring and Purity as the mediating factors driving these differences. These insights matter because AI safety protocols, shaped by human annotators' inputs and perspectives, embed those annotators' moral values, which do not align with notions of right and wrong in all contexts and for all individuals.
