

Session: Spotlights

Ulrike von Luxburg



Agreement-Based Learning

Percy Liang · Dan Klein · Michael Jordan

The learning of probabilistic models with many hidden variables and non-decomposable dependencies is an important but challenging problem. In contrast to traditional approaches based on approximate inference in a single intractable model, our approach is to train a set of tractable component models by encouraging them to agree on the hidden variables. This allows us to capture non-decomposable aspects of the data while still maintaining tractability. We exhibit an objective function for our approach, derive EM-style algorithms for parameter estimation, and demonstrate their effectiveness on three challenging real-world learning tasks.
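
To make the agreement idea concrete, here is a toy sketch (our construction, not the paper's code): two tractable two-state Bernoulli models each explain a different noisy view of the same hidden binary variable, the E-step multiplies the component posteriors together so the models are pushed to agree, and each M-step re-estimates a component from the shared posterior. All data and parameters below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X1 = rng.integers(0, 2, size=200)        # view seen by model 1
X2 = X1 ^ (rng.random(200) < 0.1)        # noisy copy seen by model 2

def post(prior, emis, x):
    # p(z | x) for a two-state model with Bernoulli emissions
    p = prior * np.where(x[:, None] == 1, emis, 1 - emis)
    return p / p.sum(axis=1, keepdims=True)

prior1, emis1 = np.array([.5, .5]), np.array([.3, .7])
prior2, emis2 = np.array([.5, .5]), np.array([.4, .6])

for _ in range(50):
    # E-step: combine the two tractable posteriors multiplicatively,
    # encouraging the component models to agree on the hidden variable.
    q = post(prior1, emis1, X1) * post(prior2, emis2, X2)
    q /= q.sum(axis=1, keepdims=True)
    # M-step: re-estimate each tractable component from the shared q.
    prior1 = prior2 = q.mean(axis=0)
    emis1 = (q * X1[:, None]).sum(axis=0) / q.sum(axis=0)
    emis2 = (q * X2[:, None]).sum(axis=0) / q.sum(axis=0)
```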


An Analysis of Inference with the Universum

Fabian H Sinz · Olivier Chapelle · Alekh Agarwal · Bernhard Schölkopf

We study a pattern classification algorithm which has recently been proposed by Vapnik and coworkers. It builds on a new inductive principle which assumes that in addition to positive and negative data, a third class of data is available, termed the Universum. We assay the behavior of the algorithm by establishing links with Fisher discriminant analysis and oriented PCA, as well as with an SVM in a projected subspace (or, equivalently, with a data-dependent reduced kernel). We also provide experimental results.
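
For reference, one common way to write the Universum-SVM objective studied here is a standard hinge loss on the labeled data plus an ε-insensitive loss that draws universum points toward the decision boundary; the notation below (trade-off constants C, C_U and tolerance ε) is illustrative.

```latex
% Sketch of the U-SVM objective: hinge loss on labeled points plus an
% eps-insensitive loss pushing universum points toward the boundary.
\min_{w,\,b}\;\; \tfrac{1}{2}\|w\|^2
  + C \sum_{i=1}^{n} \max\!\bigl(0,\; 1 - y_i\,(w^\top x_i + b)\bigr)
  + C_U \sum_{u=1}^{m} \max\!\bigl(0,\; |w^\top x_u + b| - \varepsilon\bigr)
```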


Augmented Functional Time Series Representation and Forecasting with Gaussian Processes

Nicolas Chapados · Yoshua Bengio

We introduce a functional representation of time series which allows forecasts to be performed over an unspecified horizon with progressively-revealed information sets. By virtue of using Gaussian processes, a complete covariance matrix between forecasts at several time-steps is available. This information is put to use in an application to actively trade price spreads between commodity futures contracts. The approach delivers impressive out-of-sample risk-adjusted returns after transaction costs on a portfolio of 30 spreads.
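
A minimal sketch of the underlying machinery, assuming a squared-exponential kernel and illustrative hyperparameters: conditioning a GP on the observed steps yields not just per-step forecast means but the full joint covariance across the forecast horizon, whose off-diagonal terms couple forecasts at different horizons.

```python
import numpy as np

def rbf(a, b, ell=3.0, var=1.0):
    # squared-exponential kernel between two sets of time indices
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / ell) ** 2)

t_obs = np.arange(20.0)
y_obs = np.sin(0.3 * t_obs) + 0.05 * np.random.default_rng(1).standard_normal(20)
t_new = np.arange(20.0, 26.0)                  # six-step-ahead horizon

K = rbf(t_obs, t_obs) + 1e-4 * np.eye(len(t_obs))   # observed + jitter
Ks = rbf(t_new, t_obs)
Kss = rbf(t_new, t_new)

alpha = np.linalg.solve(K, y_obs)
mean = Ks @ alpha                               # forecast means
cov = Kss - Ks @ np.linalg.solve(K, Ks.T)       # full joint forecast covariance
# cov's off-diagonal entries are what a spread-trading rule can exploit.
```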


Bayesian Multi-View Learning

Shipeng Yu · Balaji R Krishnapuram · Romer E Rosales · Harald Steck · R. Bharat Rao

We propose a Bayesian undirected graphical model for semi-supervised multi-view learning (BMVL). This model makes explicit the previously unstated assumptions of a large class of semi-supervised multi-view learning algorithms, and clarifies the circumstances under which these assumptions fail. Building upon new insights from this model, we propose an improved method for BMVL: in particular, we derive a novel co-training kernel for Gaussian Process classifiers. Unlike some previous methods for multi-view learning, the resulting approach is convex and avoids local-maxima problems. Further, it automatically estimates how much each view should be trusted, and thus accommodates noisy or unreliable views. Experiments show that the approach is more accurate than previous multi-view learning algorithms.
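
As a hedged sketch of how a co-training kernel can fuse views, the snippet below combines per-view kernels through a harmonic-mean-style composition of their noise-regularized inverses, so that a noisier view contributes less to the fused kernel. The kernel choices and noise values are illustrative placeholders, not the paper's exact construction.

```python
import numpy as np

def cotraining_kernel(kernels, noises):
    # Harmonic-mean-style fusion: sum the per-view precisions, then invert.
    n = kernels[0].shape[0]
    prec = sum(np.linalg.inv(K + s2 * np.eye(n))
               for K, s2 in zip(kernels, noises))
    return np.linalg.inv(prec)

rng = np.random.default_rng(0)
X1, X2 = rng.standard_normal((30, 5)), rng.standard_normal((30, 3))
K1 = X1 @ X1.T                          # linear kernel on view 1
K2 = X2 @ X2.T                          # linear kernel on view 2
Kc = cotraining_kernel([K1, K2], noises=[0.1, 1.0])  # view 2 trusted less
```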


Combined Discriminative and Generative Articulated Pose and Non-Rigid Shape Estimation

Leonid Sigal · Alexandru Balan · Michael J. Black

Estimation of three-dimensional articulated human pose and motion from images is a central problem in computer vision. Much of the previous work has been limited by the use of crude generative models of humans represented as articulated collections of simple parts such as cylinders. Automatic initialization of such models has proved difficult, and most approaches assume that the size and shape of the body parts are known a priori. In this paper we propose a method for automatically recovering a detailed parametric model of non-rigid body shape and pose from monocular imagery. Specifically, we represent the body using a parameterized triangulated mesh model that is learned from a database of human range scans. We demonstrate a discriminative method to directly recover the model parameters from monocular images using a mixture of regressors. The predicted pose and shape are used to initialize a generative model for more detailed pose and shape estimation. The resulting approach allows fully automatic pose and shape recovery from monocular and multi-camera imagery. Experimental results show that our method is capable of robustly recovering articulated pose, shape and biometric measurements (e.g., height and weight) in both calibrated and uncalibrated camera environments.
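
The discriminative initialization stage can be pictured with a toy mixture-of-linear-regressors predictor (all dimensions and weights below are illustrative placeholders): a softmax gate blends per-component linear maps from image features to pose/shape parameters, and the blended estimate seeds the generative refinement.

```python
import numpy as np

rng = np.random.default_rng(0)
n_comp, d_feat, d_pose = 3, 50, 10
W_gate = rng.standard_normal((n_comp, d_feat))         # gating weights
W_reg = rng.standard_normal((n_comp, d_pose, d_feat))  # per-component regressors

def predict(features):
    logits = W_gate @ features
    gates = np.exp(logits - logits.max())
    gates /= gates.sum()                 # softmax over mixture components
    preds = W_reg @ features             # each component's pose/shape estimate
    return gates @ preds                 # gate-weighted blend

pose0 = predict(rng.standard_normal(d_feat))  # seeds the generative fit
```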


Cooled and Relaxed Survey Propagation for MRFs

Hai Leong Chieu · Wee Sun Lee · Yee Whye Teh

We describe a new algorithm, Relaxed Survey Propagation (RSP), for finding MAP configurations in Markov random fields. We compare its performance with state-of-the-art algorithms, including max-product belief propagation, its sequential tree-reweighted variant, residual (sum-product) belief propagation, and tree-structured expectation propagation. We show that RSP outperforms all of these approaches on Ising models with mixed couplings, as well as on a web person disambiguation task formulated as a supervised clustering problem.
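
For orientation, the strongest classical baseline in this comparison, max-product belief propagation, reduces on a chain to the familiar Viterbi-style recursion sketched below for a toy Ising chain with mixed couplings. The fields and couplings are randomly generated for illustration; this is the baseline, not RSP itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
h = rng.normal(size=n)            # local fields
J = rng.normal(size=n - 1)        # mixed couplings between neighbours
states = np.array([-1.0, 1.0])

# Forward max-product pass: m[i, s] = best log-score of x_0..x_i with x_i = s.
m = np.zeros((n, 2))
back = np.zeros((n, 2), dtype=int)
m[0] = h[0] * states
for i in range(1, n):
    trans = J[i - 1] * np.outer(states, states)   # pairwise log-potentials
    scores = m[i - 1][:, None] + trans
    back[i] = scores.argmax(axis=0)               # best predecessor per state
    m[i] = scores.max(axis=0) + h[i] * states

# Backtrack the MAP configuration (exact on chains and trees).
x = np.zeros(n, dtype=int)
x[-1] = m[-1].argmax()
for i in range(n - 1, 0, -1):
    x[i - 1] = back[i, x[i]]
map_config = states[x]
```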


Efficient Bayesian Inference for Dynamically Changing Graphs

Ozgur Sumer · Umut A Acar · Alexander T. Ihler · Ramgopal R. Mettu

Motivated by stochastic systems in which observed evidence and conditional dependencies between states of the network change over time, and in which quantities of interest (marginal distributions, likelihood estimates, etc.) must be kept up to date, we study the problem of dynamic inference in tree-structured Bayesian networks. We describe an algorithm for dynamic inference that handles a broad range of changes to the network and maintains marginal distributions, MAP estimates, and data likelihoods, all in expected logarithmic time. We give an implementation of our algorithm and provide experiments showing that it can yield speedups of up to two orders of magnitude over the sum-product algorithm when answering queries and responding to dynamic changes.
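
The paper's algorithm is built on cluster trees, but the flavour of logarithmic-time updating can be conveyed with a much simpler stand-in: an HMM likelihood is an ordered product of per-step matrices, and storing partial products in a balanced segment tree lets a single changed observation be absorbed with O(log n) matrix multiplies instead of recomputing the whole chain. The sketch below assumes a power-of-two number of steps for brevity.

```python
import numpy as np

class ProductTree:
    """Segment tree over an ordered, non-commutative matrix product."""

    def __init__(self, mats):                 # len(mats) a power of two here
        self.n = len(mats)
        self.tree = [None] * (2 * self.n)
        self.tree[self.n:] = list(mats)       # leaves: per-step matrices
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = self.tree[2 * i] @ self.tree[2 * i + 1]

    def update(self, i, mat):                 # O(log n) per changed step
        i += self.n
        self.tree[i] = mat
        while i > 1:
            i //= 2
            self.tree[i] = self.tree[2 * i] @ self.tree[2 * i + 1]

    def total(self):                          # ordered product of all steps
        return self.tree[1]

rng = np.random.default_rng(0)
steps = [rng.random((2, 2)) for _ in range(8)]   # per-step HMM matrices
pt = ProductTree(steps)
pt.update(3, rng.random((2, 2)))                 # new evidence at step 3
lik = np.full(2, 0.5) @ pt.total() @ np.ones(2)  # pi @ product @ 1
```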


Expectation Maximization and Posterior Constraints

João Graça · Kuzman Ganchev · Ben Taskar

The expectation maximization (EM) algorithm is a widely used maximum likelihood estimation procedure for statistical models when the values of some of the variables in the model are hidden. Very often, however, our aim is primarily to find a model that assigns values to the latent variables that have intended meaning for our data, and maximizing expected likelihood only sometimes accomplishes this. Unfortunately, it is often difficult to add even simple a priori information about latent variables without making the model overly complex or intractable. In this paper, we present an efficient, principled way to inject constraints on the posteriors of latent variables into the EM algorithm. Our method can be viewed as a regularization of the posteriors of hidden variables, or alternatively as a restriction on the types of lower bounds used for maximizing data likelihood. Focusing on the alignment problem for statistical machine translation, we show that simple, intuitive posterior constraints can greatly improve performance over standard baselines and be competitive with more complex, intractable models.
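
Schematically, the method replaces the unconstrained E-step with a KL projection onto a constraint set, and the M-step proceeds as usual under the projected posterior; the notation below (feature function f and bound b) is illustrative.

```latex
% Constrained E-step: project the model posterior onto the set Q of
% distributions whose expected constraint features are bounded.
\text{E-step:}\quad
  q^{t+1} \;=\; \arg\min_{q \in Q}\;
  \mathrm{KL}\bigl(q(\mathbf{z}) \,\|\, p_{\theta^t}(\mathbf{z}\mid\mathbf{x})\bigr),
\qquad
  Q \;=\; \bigl\{\, q : \mathbb{E}_q[f(\mathbf{x},\mathbf{z})] \le \mathbf{b} \,\bigr\}

% M-step: maximize expected complete-data log-likelihood under q.
\text{M-step:}\quad
  \theta^{t+1} \;=\; \arg\max_{\theta}\;
  \mathbb{E}_{q^{t+1}}\bigl[\log p_{\theta}(\mathbf{x},\mathbf{z})\bigr]
```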


Multi-task Gaussian Process Prediction

Edwin Bonilla · Kian Ming A Chai · Chris Williams

In this paper we investigate multi-task learning in the context of Gaussian Processes (GPs). We propose a model that learns a shared covariance function on input-dependent features and a "free-form" covariance matrix over tasks. This allows for good flexibility when modelling inter-task dependencies while avoiding the need for large amounts of training data. We show that under the assumption of noise-free observations and a block design, predictions for a given task depend only on its own target values, so that inter-task transfer cancels. We evaluate the benefits of our model on two practical applications: a compiler performance prediction problem and an exam score prediction task. Additionally, we make use of GP approximations and properties of our model to provide scalability to large data sets.
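
The core construction can be sketched directly: with a free-form positive semi-definite task matrix Kf and an input kernel kx, the covariance between task i at x and task j at x' factors as Kf[i, j] · kx(x, x'), which under a block design (all tasks observed at shared inputs) is a Kronecker product. Hyperparameters below are illustrative.

```python
import numpy as np

def kx(a, b, ell=1.0):
    # squared-exponential input kernel
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

rng = np.random.default_rng(0)
X = rng.standard_normal((15, 2))     # shared inputs (block design)
L = rng.standard_normal((3, 3))
Kf = L @ L.T                         # free-form PSD covariance over 3 tasks

K = np.kron(Kf, kx(X, X))            # joint covariance over (task, input) pairs
K += 1e-6 * np.eye(K.shape[0])       # jitter / observation noise
```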


Simulated Annealing: Rigorous finite-time guarantees for optimization on continuous domains

Andrea Lecchini-Visintini · John Lygeros · Jan Maciejowski

Simulated annealing is a popular method for approaching the solution of a global optimization problem. Existing results on its performance apply to discrete combinatorial optimization, where the optimization variables can assume only a finite set of possible values. We introduce a new general formulation of simulated annealing which allows one to guarantee finite-time performance in the optimization of functions of continuous variables. The results hold universally for any optimization problem on a bounded domain and establish a connection between simulated annealing and recent theory on the convergence of Markov chain Monte Carlo methods on continuous domains. This work is inspired by the concept of finite-time learning with known accuracy and confidence developed in statistical learning theory.
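
For contrast with the discrete setting, a generic continuous-domain annealer looks like the sketch below: Metropolis moves on a bounded interval with a logarithmic cooling schedule. The objective, proposal scale, and schedule are illustrative and carry none of the paper's finite-time guarantees.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # toy multimodal objective to minimize on [-2, 2]
    return np.sin(5 * x) + 0.5 * x**2

x = rng.uniform(-2, 2)
best = x
for k in range(1, 5001):
    T = 1.0 / np.log(k + 1)                         # cooling schedule
    prop = np.clip(x + 0.3 * rng.standard_normal(), -2, 2)
    # Metropolis rule: always accept downhill moves, sometimes uphill ones.
    if rng.random() < np.exp(min(0.0, (f(x) - f(prop)) / T)):
        x = prop
    if f(x) < f(best):
        best = x
```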