Personalizing Many Decisions with High-Dimensional Covariates
Nima Hamidi · Mohsen Bayati · Kapil Gupta
Keywords:
Bandit Algorithms
Algorithms
Algorithms -> Multitask and Transfer Learning; Applications -> Matrix and Tensor Factorization; Applications
Recommender Sys
2019 Poster
Abstract
We consider the k-armed stochastic contextual bandit problem with d dimensional features, when both k and d can be large. To the best of our knowledge, all existing algorithm for this problem have a regret bound that scale as polynomials of degree at least two, in k and d. The main contribution of this paper is to introduce and theoretically analyze a new algorithm (REAL Bandit) with a regret that scales by r^2(k+d) when r is rank of the k by d matrix of unknown parameters. REAL Bandit relies on ideas from low-rank matrix estimation literature and a new row-enhancement subroutine that yields sharper bounds for estimating each row of the parameter matrix that may be of independent interest.
Chat is not available.
Successful Page Load