Energy-Based Models: Structured Learning Beyond Likelihoods
Yann LeCun

Mon Dec 4th 03:30 -- 05:30 PM @ Regency F
Event URL: http://www.cs.nyu.edu/~yann/talks/tutorial-nips-2006.html

Energy-Based Models (EBMs) capture dependencies between variables by associating a scalar energy with each configuration of the variables. Given a set of observed variables, inference in an EBM consists of finding configurations of the unobserved variables that minimize the energy. Training an EBM consists of designing a loss function whose minimization will shape the energy surface so that desired variable configurations have lower energies than undesired configurations. EBM approaches have been applied with considerable success to such problems as natural language processing, biological sequence analysis, computer vision (object detection and recognition), image segmentation, image restoration, unsupervised feature learning, and dimensionality reduction.
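The inference procedure described above can be sketched in a few lines. This is a minimal toy example, not from the tutorial itself: the energy function `energy(x, y)` is a hypothetical choice that is low when `y` is close to `2*x`, and inference simply searches a discretized set of candidate values for the `y` with minimal energy.

```python
import numpy as np

# Hypothetical toy energy function, for illustration only.
# The energy is low when y is close to 2*x, so inference given an
# observed x should recover y close to 2*x.
def energy(x, y):
    return (y - 2.0 * x) ** 2

def infer(x, candidates):
    # EBM inference: pick the candidate y with minimal energy given x.
    energies = [energy(x, y) for y in candidates]
    return candidates[int(np.argmin(energies))]

x_observed = 3.0
candidates = np.linspace(-10.0, 10.0, 201)  # discretized search over y
y_star = infer(x_observed, candidates)       # minimum-energy answer
```

In richer models the minimization over unobserved variables is done by dynamic programming, graph search, or gradient descent rather than exhaustive enumeration, but the principle is the same.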

The first part of the tutorial will introduce the concepts of energy-based inference, will discuss the relationships with non-probabilistic forms of graphical models (un-normalized factor graphs), and will give the conditions that the loss function must satisfy so that its minimization will cause the model to produce good decisions. The second part will discuss the relative merits of EBM approaches and probabilistic approaches. EBMs provide more flexibility than probabilistic approaches in the design of the energy function because of the absence of normalization. More importantly, when training complex probabilistic models, one is often faced with the problem of evaluating (or approximating) intractable sums or integrals. EBMs trained with appropriate loss functions sidestep this problem altogether. The third part will present several popular learning models in light of the EBM framework. In particular, discriminative learning methods for "structured" outputs will be discussed, including discriminative HMMs, Graph Transformer Networks, Conditional Random Fields, Maximum Margin Markov Networks, and related approaches. A simple interpretation will be given for several approximate maximum likelihood methods such as products of experts models, variational bound methods, and Hinton's Contrastive Divergence. Lastly, a number of applications to vision, NLP, and bioinformatics will be discussed.
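The idea of a loss that shapes the energy surface can be illustrated with a contrastive, margin-based update. This is a hedged sketch under assumed choices, not the tutorial's own method: the parametric energy E_w(x, y) = (y - w*x)^2, the margin, and the learning rate are all hypothetical. The loss pulls down the energy of the desired answer and, whenever an incorrect answer's energy falls inside the margin, pushes it up.

```python
# Toy parametric energy: E_w(x, y) = (y - w*x)^2 (hypothetical choice).
def energy(w, x, y):
    return (y - w * x) ** 2

def d_energy_dw(w, x, y):
    # Analytic gradient of the energy with respect to the parameter w.
    return -2.0 * x * (y - w * x)

def contrastive_step(w, x, y_good, y_bad, margin=1.0, lr=0.01):
    # Margin loss: E(x, y_good) + max(0, margin - E(x, y_bad)).
    # Gradient descent on this loss pushes the desired configuration's
    # energy down, and the contrastive one's energy up inside the margin.
    grad = d_energy_dw(w, x, y_good)
    if energy(w, x, y_bad) < margin:
        grad -= d_energy_dw(w, x, y_bad)
    return w - lr * grad

# Train on pairs where the desired answer is y = 2*x, using y = -2*x
# as the incorrect (contrastive) answer; w should approach 2.
w = 0.0
for _ in range(200):
    for x in (1.0, 2.0, 3.0):
        w = contrastive_step(w, x, y_good=2.0 * x, y_bad=-2.0 * x)
```

Methods like Maximum Margin Markov Networks and Contrastive Divergence can be read in this light: they differ mainly in how the contrastive configurations are chosen and how the push-up term is computed.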

Author Information

Yann LeCun (Facebook AI Research and New York University)

Yann LeCun is Director of AI Research at Facebook, and Silver Professor of Data Science, Computer Science, Neural Science, and Electrical Engineering at New York University. He received the Electrical Engineer Diploma from ESIEE, Paris in 1983, and a PhD in Computer Science from Université Pierre et Marie Curie (Paris) in 1987. After a postdoc at the University of Toronto, he joined AT&T Bell Laboratories in Holmdel, NJ in 1988. He became head of the Image Processing Research Department at AT&T Labs-Research in 1996, and joined NYU as a professor in 2003, after a brief period as a Fellow of the NEC Research Institute in Princeton. From 2012 to 2014 he directed NYU's initiative in data science and became the founding director of the NYU Center for Data Science. He was named Director of AI Research at Facebook in late 2013 and retains a part-time position on the NYU faculty. His current interests include AI, machine learning, computer perception, mobile robotics, and computational neuroscience. He has published over 180 technical papers and book chapters on these topics as well as on neural networks, handwriting recognition, image processing and compression, and on dedicated circuits for computer perception.
