
Workshop: Workshop on Federated Learning in the Age of Foundation Models in Conjunction with NeurIPS 2023 (FL@FM-NeurIPS'23)

An Empirical Evaluation of Federated Contextual Bandit Algorithms

Alekh Agarwal · H. Brendan McMahan · Zheng Xu

Keywords: [ federated learning; contextual bandits; distribution shift ]


Fine-tuning (foundation) models with user feedback can be important for improving task-specific performance, as fine-grained supervision is generally unavailable. While the adoption of federated learning increases for learning from sensitive data local to user devices, it is unclear if learning can be done using implicit signals generated as users interact with the applications. We approach such problems with the framework of federated contextual bandits, and develop variants of prominent contextual bandit algorithms from the centralized setting for the federated setting. We carefully evaluate these algorithms in a range of scenarios simulated using publicly available datasets. Our simulations model typical setups encountered in the real world, such as various misalignments between an initial pre-trained model and the subsequent user interactions due to non-stationarity in the data and/or heterogeneity across clients. Our experiments reveal the surprising effectiveness of the simple and commonly used softmax heuristic in balancing the well-known exploration-exploitation tradeoff across the breadth of our settings.
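For readers unfamiliar with the softmax heuristic mentioned above, the following is a minimal illustrative sketch of softmax exploration in general, not the authors' implementation. The function names, the `temperature` parameter, and the example scores are all assumptions made for illustration; the abstract does not specify them. The idea is that a model's per-action scores are turned into a probability distribution, and the action is sampled from it rather than chosen greedily.

```python
import math
import random


def softmax_probs(scores, temperature=1.0):
    """Convert per-action scores into sampling probabilities.

    `temperature` is a hypothetical knob: lower values concentrate
    probability on the top-scoring action (more exploitation), higher
    values spread it out (more exploration).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_explore(scores, temperature=1.0, rng=random):
    """Sample one action index and return it with its probability.

    The returned probability is what a bandit learner would typically
    log alongside the observed reward for the chosen action.
    """
    probs = softmax_probs(scores, temperature)
    action = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return action, probs[action]


if __name__ == "__main__":
    # Hypothetical scores that a model might assign to three actions.
    action, prob = softmax_explore([1.0, 2.0, 3.0], temperature=0.5)
    print(action, prob)
```

In a federated setting, a sketch like this would run on each client against the locally available model scores; how the resulting feedback is aggregated across clients is beyond what the abstract describes.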
