Poster
Gradient Methods for Online DR-Submodular Maximization with Stochastic Long Term Constraints
Guanyu Nie · Vaneet Aggarwal · Christopher Quinn
West Ballroom A-D #5802
Thu 12 Dec 11 a.m. PST — 2 p.m. PST
Abstract:
In this paper, we consider the problem of online DR-submodular maximization subject to long-term stochastic constraints. Specifically, at each round $t\in [T]$, after committing an action $\mathbf{x}_t$, a random reward $f_t(\mathbf{x}_t)$ and an unbiased estimate of the gradient at that point, $\widetilde{\nabla}f_t(\mathbf{x}_t)$ (semi-bandit feedback), are revealed. Meanwhile, an amount $g_t(\mathbf{x}_t)$, where $g_t$ is linear and stochastic, is consumed from the total allotted budget $B$. We propose a gradient-ascent-based algorithm that achieves $\frac{1}{2}$-regret of $\mathcal{O}(\sqrt{T})$ with $\widetilde{\mathcal{O}}(\sqrt{T})$ constraint violation with high probability. Moreover, when first-order full-information feedback is available, we propose an algorithm that achieves $(1-1/e)$-regret of $\mathcal{O}(\sqrt{T})$ with $\widetilde{\mathcal{O}}(\sqrt{T})$ constraint violation. These algorithms significantly improve over the state of the art in terms of query complexity.
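The online protocol described above (commit an action, observe a reward and a gradient signal, consume budget, update) can be sketched with a primal-dual gradient-ascent loop. This is an illustrative sketch only, not the paper's algorithm: it assumes a specific monotone DR-submodular reward $f(\mathbf{x}) = 1 - \prod_i (1 - x_i)$, a fixed linear cost vector $c$ in place of the stochastic $g_t$, and exact gradients in place of the semi-bandit estimates; the multiplier `lam` is a standard dual variable penalizing per-round budget overuse.

```python
def f(x):
    # Assumed monotone DR-submodular reward: 1 - prod_i (1 - x_i),
    # the probability of "covering" one item given marginals x.
    p = 1.0
    for xi in x:
        p *= (1.0 - xi)
    return 1.0 - p

def grad_f(x):
    # Exact gradient of f; coordinate i's partial is prod_{j != i} (1 - x_j).
    g = []
    for i in range(len(x)):
        p = 1.0
        for j, xj in enumerate(x):
            if j != i:
                p *= (1.0 - xj)
        g.append(p)
    return g

def project_box(x):
    # Euclidean projection onto the box constraint set [0, 1]^n.
    return [min(1.0, max(0.0, xi)) for xi in x]

def online_primal_dual(T, n, c, B, eta=0.05):
    """Run T rounds of projected primal-dual gradient ascent.

    x  : action, updated by ascent on the Lagrangian f(x) - lam * (c.x - B/T)
    lam: dual variable, keeps cumulative spend near the allotted budget B
    """
    x = [0.0] * n
    lam = 0.0
    total_reward, total_cost = 0.0, 0.0
    per_round_budget = B / T
    for _ in range(T):
        total_reward += f(x)
        cost = sum(ci * xi for ci, xi in zip(c, x))
        total_cost += cost
        gf = grad_f(x)
        # Primal step: gradient ascent on reward, penalized by budget cost.
        x = project_box([xi + eta * (gi - lam * ci)
                        for xi, gi, ci in zip(x, gf, c)])
        # Dual step: raise lam when spending above the per-round budget.
        lam = max(0.0, lam + eta * (cost - per_round_budget))
    return total_reward, total_cost
```

In this toy setting the dual update drives average per-round spend toward $B/T$, mirroring how the paper's analysis trades $\mathcal{O}(\sqrt{T})$ regret against $\widetilde{\mathcal{O}}(\sqrt{T})$ constraint violation; the actual algorithms additionally handle stochastic costs and semi-bandit gradient feedback.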