Timezone: »

Combinatorial Bandits with Linear Constraints: Beyond Knapsacks and Fairness
Qingsong Liu · Weihang Xu · Siwei Wang · Zhixuan Fang

This paper proposes and studies for the first time the problem of combinatorial multi-armed bandits with linear long-term constraints. Our model generalizes and unifies several prominent lines of work, including bandits with fairness constraints, bandits with knapsacks (BwK), etc. We propose an upper-confidence bound LP-style algorithm for this problem, called UCB-LP, and prove that it achieves a logarithmic problem-dependent regret bound and zero constraint violations in expectation. In the special case of fairness constraints, we further provide a sharper constant regret bound for UCB-LP. Our regret bounds outperform the existing literature on BwK and bandits with fairness constraints simultaneously. We also develop another low-complexity version of UCB-LP and show that it yields $\tilde{O}(\sqrt{T})$ problem-independent regret and zero constraint violations with high-probability. Finally, we conduct numerical experiments to validate our theoretical results.

Author Information

Qingsong Liu (Tsinghua University)
Weihang Xu (Tsinghua University)
Siwei Wang (IIIS, Tsinghua University)
Zhixuan Fang (Tsinghua University)

More from the Same Authors