
Affinity Workshop: Queer in AI

Making Intelligence: Ethics, IQ, and ML Benchmarks

Leif Hancox-Li · Borhane Blili-Hamelin

Keywords: [ Benchmarks ] [ values ] [ philosophy ]


The ML community recognizes the importance of anticipating and mitigating the potential negative impacts of benchmark research. In this position paper, we argue that more attention needs to be paid to areas of ethical risk that lie at the technical and scientific core of ML benchmarks. We identify overlooked structural similarities between human IQ and ML benchmarks: both set standards for describing, evaluating, and comparing performance on tasks relevant to intelligence. Drawing on prior research on IQ benchmarks from feminist philosophy of science, we argue that values need to be considered when creating ML benchmarks and datasets, and that it is not possible to avoid this choice by creating benchmarks that are value-neutral. Finally, we outline practical recommendations for benchmark research ethics and ethics review.
