Poster in Workshop: AI meets Moral Philosophy and Moral Psychology: An Interdisciplinary Dialogue about Computational Ethics

#30: LLMs grasp morality in concept.

Mark Pock · Andre Ye · Jared Moore

Keywords: [ Fairness ] [ Semiotics ] [ Meaning ] [ Philosophy of AI ] [ AI Ethics ] [ Alignment ]

Fri 15 Dec 12:50 p.m. PST — 1:50 p.m. PST

Abstract:

Work in AI ethics and fairness has made much progress in regulating LLMs to reflect certain values, such as fairness, truth, and diversity. However, it has taken for granted the problem of how LLMs might 'mean' anything at all. Without addressing this, it is not clear what imbuing LLMs with such values even means. In response, we provide a general theory of meaning that extends beyond humans. We use this theory to explicate the precise nature of LLMs as meaning-agents. We suggest that the LLM, by virtue of its position as a meaning-agent, already grasps the constructions of human society (e.g., morality, gender, and race) in concept. Consequently, under certain ethical frameworks, currently popular methods for model alignment are limited at best and counterproductive at worst. Moreover, unaligned models may help us better develop our moral and social philosophy.
