NeurIPS Efficient Evaluation of Bias in Large Language Models through Prompt Tuning

Poster in Workshop: Socially Responsible Language Modelling Research (SoLaR)

Efficient Evaluation of Bias in Large Language Models through Prompt Tuning

Jacob-Junqi Tian · David B. Emerson · Deval Pandya · Laleh Seyyed-Kalantari · Faiza Khattak

Abstract:

Prompting large language models (LLMs) has gained substantial popularity because pre-trained LLMs can perform downstream tasks without requiring large quantities of labelled data. It is therefore natural that prompting is also used to evaluate the biases these models exhibit. However, achieving good task-specific performance often requires manual prompt optimization. In this paper, we explore the use of soft-prompt tuning to quantify the biases of LLMs such as OPT and LLaMA. These models are trained on real-world data with potential implicit biases toward certain groups. Since LLMs are increasingly used across many industries and applications, it is crucial to accurately and efficiently identify such biases and their practical implications. We use soft-prompt tuning to evaluate model bias across several sensitive attributes through the lens of group fairness (bias). In addition to improving task performance, soft-prompt tuning avoids the potential injection of human bias through manually designed prompts. Probing with prompt tuning reveals important bias patterns, including disparities across age and sexuality.
