There is a tendency across different subfields in AI to see value in a small collection of influential benchmarks, which we term 'general' benchmarks. These benchmarks operate as stand-ins or abstractions for a range of anointed common problems that are frequently framed as foundational milestones on the path towards flexible and generalizable AI systems. State-of-the-art performance on these benchmarks is widely understood as indicative of progress towards these long-term goals. In this position paper, we explore how such benchmarks are designed, constructed, and used in order to reveal key limitations of their framing as the functionally 'general', broad measures of progress they are set up to be.
Author Information
Deborah Raji (UC Berkeley)
Emily Denton (Google)
Emily M. Bender (University of Washington)
Alex Hanna (Google)
Amandalynne Paullada (University of Washington)
More from the Same Authors
- 2021: Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research
  Bernard Koch · Emily Denton · Alex Hanna · Jacob G Foster
- 2021: Artsheets for Art Datasets
  Ramya Srinivasan · Emily Denton · Jordan Famularo · Negar Rostamzadeh · Fernando Diaz · Beth Coleman
- 2021: Are We Learning Yet? A Meta Review of Evaluation Failures Across Machine Learning
  Thomas Liao · Rohan Taori · Deborah Raji · Ludwig Schmidt
- 2022 Poster: Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
  Chitwan Saharia · William Chan · Saurabh Saxena · Lala Li · Jay Whang · Emily Denton · Kamyar Ghasemipour · Raphael Gontijo Lopes · Burcu Karagol Ayan · Tim Salimans · Jonathan Ho · David Fleet · Mohammad Norouzi
- 2022 Social: Ethics Review - Open Discussion
  Deborah Raji · William Isaac · Cherie Poland · Alexandra Luccioni
- 2021: Evaluation as a Process for Engineering Responsibility in AI
  Deborah Raji
- 2021: Live panel: ImageNets of "x": ImageNet's Infrastructural Impact
  Emily Denton · Alex Hanna
- 2021: ImageNets of "x": ImageNet's Infrastructural Impact
  Emily Denton · Alex Hanna
- 2021: Career and Life: Panel Discussion - Bo Li, Adriana Romero-Soriano, Devi Parikh, and Emily Denton
  Emily Denton · Devi Parikh · Bo Li · Adriana Romero
- 2020: How should researchers engage with controversial applications of AI?
  Logan Koepke · Catherine O'Neil · Tawana Petty · Cynthia Rudin · Deborah Raji · Shawn Bushway
- 2020: Harms from AI research
  Anna Lauren Hoffmann · Nyalleng Moorosi · Vinay Prabhu · Deborah Raji · Jacob Metcalf · Sherry Stanley
- 2020 Workshop: Navigating the Broader Impacts of AI Research
  Carolyn Ashurst · Rosie Campbell · Deborah Raji · Solon Barocas · Stuart Russell
- 2020: Data and its (dis)contents: A survey of dataset development and use in machine learning research
  Amandalynne Paullada
- 2020: AI and the Everything in the Whole Wide World Benchmark
  Deborah Raji
- 2020: Invited Talk 3: Inioluwa Deborah Raji
  Deborah Raji
- 2020: Panel
  Kilian Weinberger · Maria De-Arteaga · Shibani Santurkar · Jonathan Frankle · Deborah Raji
- 2019: Emily Bender (University of Washington) "Making Stakeholder Impacts Visible in the Evaluation Cycle: Towards Fairness-Integrated Shared Tasks and Evaluation Metrics"
  Emily M. Bender
- 2019: AI's Blindspots and Where to Find Them
  Deborah Raji