Workshop: Socially Responsible Language Modelling Research (SoLaR)

Developing A Conceptual Framework for Analyzing People in Unstructured Data

Mark Díaz · Sunipa Dev · Emily Reif · Remi Denton · Vinodkumar Prabhakaran


The unstructured data used in foundation model development poses a challenge for the systematic analyses needed to make data use and documentation decisions. From a Responsible AI perspective, these decisions often rely on understanding how people are represented in data. We propose a framework to guide the analysis of human representation in unstructured data and to identify downstream risks.

