Oral session 9: Sparse Learning and Ranking
Session chair: Jean-Yves Audibert
Adaptive Forward-Backward Greedy Algorithm for Sparse Learning with Linear Models
Tong Zhang
Consider linear prediction models where the target function is a sparse linear combination of a set of basis functions. We are interested in the problem of identifying those basis functions with non-zero coefficients and reconstructing the target function from noisy observations. Two heuristics that are widely used in practice are forward and backward greedy algorithms. First, we show that neither idea is adequate. Second, we propose a novel combination that is based on the forward greedy algorithm but takes backward steps adaptively whenever beneficial. We prove strong theoretical results showing that this procedure is effective in learning sparse representations. Experimental results support our theory.
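To make the adaptive forward-backward idea concrete, the following is a minimal sketch of a greedy selector for least-squares regression: forward steps add the feature that most reduces the residual error, and backward steps remove a feature whenever doing so undoes only a small fraction of the last forward gain. The threshold `nu`, the cap `k_max`, and the plain least-squares refits are illustrative choices, not the paper's exact procedure or its theoretical conditions.

```python
import numpy as np


def foba_select(X, y, k_max=10, nu=0.5):
    """Sketch of forward-backward greedy feature selection for least squares.

    Forward: add the feature whose inclusion most reduces the squared error.
    Backward: drop a feature whenever the resulting error increase is smaller
    than `nu` times the gain of the last forward step (adaptive backward steps).
    Illustrative only; not the paper's exact algorithm.
    """
    n, p = X.shape
    active = []

    def rss(idx):
        # Residual sum of squares after a least-squares refit on `idx`.
        if not idx:
            return float(y @ y)
        beta, *_ = np.linalg.lstsq(X[:, idx], y, rcond=None)
        r = y - X[:, idx] @ beta
        return float(r @ r)

    err = rss(active)
    while len(active) < k_max:
        # Forward step: add the single best new feature.
        cands = [j for j in range(p) if j not in active]
        if not cands:
            break
        cand_errs = [rss(active + [j]) for j in cands]
        best = int(np.argmin(cand_errs))
        gain = err - cand_errs[best]
        if gain <= 1e-12:          # no remaining feature helps
            break
        active.append(cands[best])
        err = cand_errs[best]

        # Adaptive backward steps: remove features while it is cheap to do so.
        while len(active) > 1:
            drop_errs = [rss([i for i in active if i != j]) for j in active]
            worst = int(np.argmin(drop_errs))
            if drop_errs[worst] - err >= nu * gain:
                break              # removing anything would undo too much progress
            err = drop_errs[worst]
            active.pop(worst)

    return sorted(active)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 50))
    y = X[:, [3, 17, 42]] @ np.array([2.0, -1.5, 1.0]) + 0.1 * rng.standard_normal(200)
    print(foba_select(X, y))       # typically recovers [3, 17, 42]
```

The backward steps are what a purely forward scheme lacks: a feature added early by mistake can later be discarded once better features explain the same part of the signal.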
An interior-point stochastic approximation method and an L1-regularized delta rule
Peter Carbonetto · Mark Schmidt · Nando de Freitas
The stochastic approximation method is behind the solution to many important, actively-studied problems in machine learning. Despite its far-reaching application, there is almost no work on applying stochastic approximation to learning problems with constraints. The reason for this, we hypothesize, is that no robust, widely-applicable stochastic approximation method exists for handling such problems. We propose that interior-point methods are a natural solution. We establish the stability of a stochastic interior-point approximation method both analytically and empirically, and demonstrate its utility by deriving an on-line learning algorithm that also performs feature selection via L1 regularization.
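As a rough illustration of what an L1-regularized delta rule looks like, the sketch below runs online least-mean-squares with a per-step soft-thresholding (a generic proximal/truncated-gradient shrinkage), which drives small weights exactly to zero. This is not the interior-point stochastic approximation derived in the paper; the step size `eta` and regularization strength `lam` are illustrative constants.

```python
import numpy as np


def l1_delta_rule(stream, dim, eta=0.01, lam=0.1):
    """Online delta rule (LMS) with L1 regularization via soft thresholding.

    Each update is a stochastic gradient step on the squared error followed by
    shrinkage toward zero, so irrelevant weights are pruned on-line.
    Generic sketch only; not the paper's interior-point construction.
    """
    w = np.zeros(dim)
    for x, y in stream:
        err = y - w @ x                       # prediction error (the "delta")
        w += eta * err * x                    # gradient step on squared loss
        w = np.sign(w) * np.maximum(np.abs(w) - eta * lam, 0.0)  # soft threshold
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    w_true = np.zeros(20)
    w_true[[2, 7]] = [1.5, -2.0]
    data = []
    for _ in range(5000):
        x = rng.standard_normal(20)
        data.append((x, w_true @ x + 0.1 * rng.standard_normal()))
    w_hat = l1_delta_rule(data, dim=20)
    print(np.nonzero(np.abs(w_hat) > 0.1)[0])  # typically [2 7]; other weights stay near zero
```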
Global Ranking Using Continuous Conditional Random Fields
Tao Qin · Tie-Yan Liu · Xu-Dong Zhang · De-Sheng Wang · Hang Li
This paper studies the global ranking problem using learning to rank methods. Conventional learning to rank methods are usually designed for `local ranking', in the sense that the ranking model is defined on a single object, for example, a document in information retrieval. For many applications this is a very loose approximation: relations always exist between objects, and it is better to define the ranking model as a function on all the objects to be ranked (i.e., to include the relations as well). This paper refers to this problem as global ranking and proposes employing Continuous Conditional Random Fields (CRF) for the learning task. The Continuous CRF model is defined as a conditional probability distribution over the ranking scores of objects, conditioned on the objects. It can naturally represent the content information of objects as well as the relation information between objects, both of which are necessary for global ranking. Taking two specific information retrieval tasks as examples, the paper shows how the Continuous CRF method can perform global ranking better than baselines.
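As a rough illustration of the model family described here (a generic continuous CRF, not necessarily the paper's exact parameterization), the density over the real-valued score vector $y = (y_1, \dots, y_n)$ of the $n$ objects $x$ combines vertex features $h_k$ on the content of a single object with edge features $g_l$ on pairs of objects; the weights $\alpha_k, \beta_l$ and feature functions below are generic placeholders:
\[
\Pr(y \mid x) \;=\; \frac{1}{Z(x)} \exp\!\Bigl( \sum_{i}\sum_{k} \alpha_k\, h_k(y_i, x) \;+\; \sum_{i \neq j}\sum_{l} \beta_l\, g_l(y_i, y_j, x) \Bigr),
\]
where $Z(x)$ integrates the same exponential over all $y \in \mathbb{R}^n$ so that the density normalizes. Learning fits $\alpha$ and $\beta$; one natural way to produce a global ranking is then to sort the objects by the most probable score vector.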
Inferring rankings under constrained sensing
Srikanth Jagabathula · Devavrat Shah
Motivated by applications such as elections, web-page ranking, and revenue maximization in gambling, we consider the question of inferring popular rankings. In many of these applications the number of distinct popular rankings is very small, yet the data available for inferring them is highly constrained. As the first main result of this paper, we provide a simple and novel algorithm that recovers the popular rankings and their popularity exactly under a natural set of assumptions. Specifically, under a natural stochastic model, our algorithm recovers them exactly as long as the number of distinct popular rankings is up to $O(n)$ over $n$ candidates (or elements). In certain applications, such as ranked elections, the interest is only in recovering the most popular (or mode) ranking. As the second result, we provide an algorithm based on the Fourier transform over the symmetric group to recover the most popular ranking under a natural majority condition. Interestingly, this algorithm reduces to a maximum weight matching on an appropriately defined weighted bipartite graph. The questions considered in this paper are thematically related to the currently popular topic of compressed sensing. The existing results in the literature do not apply to our problem, as the information is constrained and cannot be sampled randomly. Thus, the results of this paper correspond to constrained compressed sensing.
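To illustrate the matching step only: the sketch below builds a bipartite graph between candidates and positions and solves a maximum weight matching with `scipy.optimize.linear_sum_assignment`. The edge weights used here (the total weight of observed rankings placing candidate $i$ at position $j$, i.e., first-order marginals) are an illustrative choice, not necessarily the weights derived from the Fourier-transform argument in the paper.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def mode_ranking_by_matching(rankings, weights):
    """Recover a 'most popular' ranking via maximum weight bipartite matching.

    Sketch only: W[i, j] is the total weight of rankings placing candidate i
    at position j; a max-weight matching then assigns each candidate to the
    position it most often occupies.
    """
    n = len(rankings[0])
    W = np.zeros((n, n))
    for sigma, w in zip(rankings, weights):
        for pos, cand in enumerate(sigma):
            W[cand, pos] += w
    # linear_sum_assignment minimizes cost, so negate to get a max-weight matching.
    cand_idx, pos_idx = linear_sum_assignment(-W)
    ranking = [0] * n
    for c, p in zip(cand_idx, pos_idx):
        ranking[p] = c
    return ranking


if __name__ == "__main__":
    # Three candidates; the ranking (2, 0, 1) holds a strict majority of the weight.
    rankings = [(2, 0, 1), (0, 1, 2), (2, 1, 0)]
    weights = [0.6, 0.25, 0.15]
    print(mode_ranking_by_matching(rankings, weights))  # -> [2, 0, 1]
```

When one ranking carries a majority of the weight, it dominates every entry of its own row and column pattern in W, so the matching recovers it; this mirrors, in simplified form, the majority condition mentioned in the abstract.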