Skip to yearly menu bar Skip to main content


Poster

Personalizing Many Decisions with High-Dimensional Covariates

Nima Hamidi · Mohsen Bayati · Kapil Gupta

East Exhibition Hall B, C #16

Keywords: [ Bandit Algorithms ] [ Algorithms ] [ Algorithms -> Multitask and Transfer Learning; Applications -> Matrix and Tensor Factorization; Applications ] [ Recommender Sys ]


Abstract:

We consider the k-armed stochastic contextual bandit problem with d dimensional features, when both k and d can be large. To the best of our knowledge, all existing algorithm for this problem have a regret bound that scale as polynomials of degree at least two, in k and d. The main contribution of this paper is to introduce and theoretically analyze a new algorithm (REAL Bandit) with a regret that scales by r^2(k+d) when r is rank of the k by d matrix of unknown parameters. REAL Bandit relies on ideas from low-rank matrix estimation literature and a new row-enhancement subroutine that yields sharper bounds for estimating each row of the parameter matrix that may be of independent interest.

Live content is unavailable. Log in and register to view live content