NeurIPS #04: The alignment problem’s problem: A response to Gabriel (2020)

Poster in Workshop: AI meets Moral Philosophy and Moral Psychology: An Interdisciplinary Dialogue about Computational Ethics

Gus Skorburg · Walter Sinnott-Armstrong

Keywords: [ Alignment Problem ] [ Ideal Observer Theory ] [ Idealized Observers ] [ AI Ethics ]

Abstract:

Gabriel (2020) provided an early and important philosophical analysis of the alignment problem. In this paper, we argue that Gabriel (2020) is too quick to dismiss idealized preferences as a target of alignment for AI/ML systems. In Section 2, we summarize Gabriel’s arguments about specifying the targets of alignment, with a special focus on the objections to idealized preferences. In Section 3, we briefly sketch our version of an idealized observer theory. In Section 4, we describe an empirical method for approximating the preferences of these idealized observers. We then conclude by showing how the considerations and methods from Sections 3 and 4 address the objections raised in Section 2.
