NeurIPS Probing Explicit and Implicit Gender Bias through LLM Conditional Text Generation

Poster in Workshop: Socially Responsible Language Modelling Research (SoLaR)

Probing Explicit and Implicit Gender Bias through LLM Conditional Text Generation

Xiangjue Dong · Yibo Wang · Philip S Yu · James Caverlee


Abstract:

Large Language Models (LLMs) can generate biased and toxic responses. Yet most prior work on LLM gender bias evaluation requires predefined gender-related phrases or gender stereotypes, which are difficult to collect comprehensively and limit evaluation to explicit bias. In this work, we propose a conditional text generation mechanism that does not require predefined gender phrases or stereotypes. This approach probes LLMs with three types of inputs, generated through three distinct strategies, to surface evidence of explicit and implicit gender bias. We also use explicit and implicit evaluation metrics to measure gender bias in LLMs under the different strategies. Our experiments show that increased model size does not consistently lead to improved fairness, and that all tested LLMs exhibit explicit and/or implicit gender bias.
