Data Source Adaptive Online Learning under Heteroscedastic Noise
Amith Bhat Hosadurga Anand · Aadirupa Saha · Thomas Kleine Buening · Haipeng Luo
Abstract
In this paper, we address the practical problem of online learning with multiple heterogeneous data sources, each exhibiting unknown and distinct noise variances. We propose SOAR (Source-Optimistic Adaptive Regret Minimization), a novel algorithm that adaptively balances exploration and exploitation by jointly constructing upper confidence bounds for arm rewards and lower confidence bounds for data source variances. Our theoretical analysis establishes that SOAR achieves a tight regret bound of $\tilde{O}\left(M\bar \mu (n + \tau) + \sum_{i=1}^K \tfrac{1}{\Delta_i}\right),$ where $n$ is the number of preprocessing iterations to remove high-variance sources, $\tau$ is the exploration parameter for SOAR, $B$ is the bound on the noise distribution, $\bar \mu$ is the bound on the arm means, $K$ is the number of arms, $M$ is the number of sources, and $\Delta_i$ denotes the reward gaps. Here, the $\tilde{O}$ notation hides polylogarithmic factors in the problem parameters. This near-optimal performance underscores SOAR's effectiveness in dynamically managing heteroscedastic noise without incurring significant overhead. We believe this work opens a new direction for adaptively leveraging multiple heterogeneous data sources, extending beyond traditional bandit frameworks.
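To make the abstract's core idea concrete, here is a minimal, illustrative sketch of the kind of procedure described: upper confidence bounds on arm means combined with lower confidence bounds on per-source variances. This is not the authors' SOAR algorithm; the confidence-bound widths, the round-robin exploration phase, the `tau` parameter's usage, and the pooled-variance source statistic are all simplifying assumptions made for illustration.

```python
import math
import random

def soar_sketch(pull, K, M, T, tau=50):
    """Illustrative sketch (NOT the authors' exact SOAR algorithm).

    pull(arm, source) -> one noisy reward sample.
    Arm selection uses a UCB on the mean; source selection uses an
    LCB on the source's variance, so low-noise sources are preferred
    once enough data has accrued. Returns the empirical best arm.
    """
    counts = [[0] * M for _ in range(K)]
    means = [[0.0] * M for _ in range(K)]
    m2 = [[0.0] * M for _ in range(K)]  # sums of squared deviations (Welford)

    def update(a, s, x):
        counts[a][s] += 1
        d = x - means[a][s]
        means[a][s] += d / counts[a][s]
        m2[a][s] += d * (x - means[a][s])

    for t in range(1, T + 1):
        if t <= tau:
            # Exploration phase: round-robin over all (arm, source) pairs.
            a, s = (t - 1) % K, ((t - 1) // K) % M
        else:
            # Lower confidence bound on each source's variance,
            # pooled over arms (within-arm deviations only).
            def var_lcb(s):
                n = sum(counts[a][s] for a in range(K))
                var = sum(m2[a][s] for a in range(K)) / max(n - 1, 1)
                return max(var - math.sqrt(2 * math.log(t) / n), 0.0)
            s = min(range(M), key=var_lcb)
            # Upper confidence bound on arm means, using the chosen source.
            def ucb(a):
                n = max(counts[a][s], 1)
                return means[a][s] + math.sqrt(2 * math.log(t) / n)
            a = max(range(K), key=ucb)
        update(a, s, pull(a, s))

    # Empirical best arm by count-weighted pooled mean across sources.
    def pooled_mean(a):
        n = sum(counts[a])
        return sum(means[a][s] * counts[a][s] for s in range(M)) / max(n, 1)
    return max(range(K), key=pooled_mean)
```

A usage example: with two arms (means 0.1 and 0.9) and two sources (noise standard deviations 0.1 and 1.0), the sketch quickly settles on the low-noise source and identifies the better arm.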