AI and the Everything in the Whole Wide World Benchmark
Deborah Raji · Remi Denton · Emily M. Bender · Alex Hanna · Amandalynne Paullada
Event URL: https://openreview.net/forum?id=j6NxpQbREA1

There is a tendency across different subfields in AI to see value in a small collection of influential benchmarks, which we term 'general' benchmarks. These benchmarks operate as stand-ins or abstractions for a range of anointed common problems that are frequently framed as foundational milestones on the path towards flexible and generalizable AI systems. State-of-the-art performance on these benchmarks is widely understood as indicative of progress towards these long-term goals. In this position paper, we explore how such benchmarks are designed, constructed and used in order to reveal key limitations of their framing as the functionally 'general' broad measures of progress they are set up to be.

Author Information

Deborah Raji (UC Berkeley)
Remi Denton (Google)

Remi Denton (they/them) is a Staff Research Scientist at Google, within the Technology, AI, Society, and Culture team, where they study the sociocultural impacts of AI technologies and conditions of AI development. Prior to joining Google, Remi received their PhD in Computer Science from the Courant Institute of Mathematical Sciences at New York University, where they focused on unsupervised learning and generative modeling of images and video. Prior to that, they received their BSc in Computer Science and Cognitive Science at the University of Toronto. Though trained formally as a computer scientist, Remi draws ideas and methods from multiple disciplines and is drawn towards highly interdisciplinary collaborations, in order to examine AI systems from a sociotechnical perspective. Remi’s recent research centers on emerging text- and image-based generative AI, with a focus on data considerations and representational harms. Remi published under the name "Emily Denton".

Emily M. Bender (University of Washington)
Alex Hanna (Google)
Amandalynne Paullada (University of Washington)

More from the Same Authors