
Fairness Degrading Adversarial Attacks Against Clustering Algorithms
Anshuman Chhabra · Adish Singla · Prasant Mohapatra

Clustering algorithms are ubiquitous in modern data science pipelines and are utilized in numerous fields ranging from biology to facility location. Due to their widespread use, especially in societal resource allocation problems, recent research has aimed at making clustering algorithms fair, with great success. Furthermore, it has also been shown that clustering algorithms, much like other machine learning algorithms, are susceptible to adversarial attacks where a malicious entity seeks to subvert the performance of the learning algorithm. However, despite these known vulnerabilities, there has been no research undertaken that investigates fairness degrading adversarial attacks for clustering. We seek to bridge this gap by formulating a generalized attack optimization problem aimed at worsening the group-level fairness of centroid-based clustering algorithms. As a first step, we propose a fairness degrading attack algorithm for k-median clustering that operates under a whitebox threat model, where the clustering algorithm, fairness notion, and the input dataset are known to the adversary. We provide empirical results as well as theoretical analysis for our simple attack algorithm, and find that the addition of the generated adversarial samples can lead to significantly lower fairness values. In this manner, we aim to motivate fairness degrading adversarial attacks as a direction for future research in fair clustering.
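The paper's attack algorithm is not reproduced here, but the core idea — that injected samples can worsen the group-level fairness of a centroid-based clustering — can be illustrated with a toy sketch. The sketch below assumes the "balance" fairness notion (the minimum, over clusters, of the ratio between the two protected groups' counts), a common group-level measure in fair clustering; the data, function names, and fixed k-median centers are all hypothetical and chosen only for illustration.

```python
import numpy as np

def balance(labels, groups):
    # Group-level "balance" fairness for two protected groups (0 and 1):
    # per cluster, the min ratio between group counts; overall fairness
    # is the minimum across clusters (1.0 = perfectly balanced).
    vals = []
    for c in np.unique(labels):
        g = groups[labels == c]
        a, b = np.sum(g == 0), np.sum(g == 1)
        if a == 0 or b == 0:
            return 0.0
        vals.append(min(a / b, b / a))
    return min(vals)

def kmedian_assign(X, centers):
    # Assign each point to its nearest center under the 1-norm,
    # the distance used by the k-median objective.
    d = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
    return d.argmin(axis=1)

# Toy data: two well-separated clusters, each perfectly balanced.
X = np.array([[0.0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]])
groups = np.array([0, 1, 0, 1, 0, 1, 0, 1])
centers = np.array([[0.5, 0.5], [10.5, 10.5]])

print(balance(kmedian_assign(X, centers), groups))  # 1.0

# Adversarial samples drawn from a single protected group, placed
# near the second center, skew that cluster's group composition.
X_adv = np.vstack([X, [[10.4, 10.4], [10.6, 10.6]]])
groups_adv = np.append(groups, [0, 0])

print(balance(kmedian_assign(X_adv, centers), groups_adv))  # 0.5
```

Under the whitebox threat model described above, an adversary with knowledge of the algorithm, fairness notion, and dataset could search for such injection points systematically; this sketch only shows that the fairness value is sensitive to a handful of group-skewed additions.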

Author Information

Anshuman Chhabra (University of California, Davis)

Anshuman Chhabra is a Ph.D. candidate at the University of California, Davis, advised by Prof. Prasant Mohapatra. Prior to that, he completed his B.Eng. in Electronics and Communication Engineering at the University of Delhi, India. His research seeks to improve Machine Learning (ML) models and facilitate their adoption into society by analyzing model robustness along two dimensions: adversarial robustness (adversarial attacks/defenses against models) and social robustness (fair machine learning). His other research interests include designing Machine Learning and Reinforcement Learning based debiasing interventions for social media platforms such as YouTube and Twitter. He received the UC Davis Graduate Student Fellowship in 2018, and has held research positions at ESnet, Lawrence Berkeley National Laboratory, USA (2017), the Max Planck Institute for Software Systems, Germany (2020), and the University of Amsterdam, Netherlands (2022).

Adish Singla (MPI-SWS)
Prasant Mohapatra (University of California, Davis)
