Log-Linear Models
Dimitri Kanevsky · Tony Jebara · Li Deng · Stephen Wright · Georg Heigold · Avishy Carmi

Sat Dec 8th 07:30 AM -- 06:30 PM @ Tahoe C, Harrah’s Special Events Center 2nd Floor

Exponential functions are core mathematical constructs that are key to many important applications, including speech recognition, pattern-search and logistic regression problems in statistics, machine translation, and natural language processing. Exponential functions are found in exponential families, log-linear models, conditional random fields (CRF), entropy functions, neural networks involving sigmoid and softmax functions, and Kalman filter or MMIE training of hidden Markov models. Many techniques have been developed in pattern recognition to construct formulations from exponential expressions and to optimize such functions, including growth transforms, EM, EBW, Rprop, bounds for log-linear models, large-margin formulations, and regularization. Optimization of log-linear models also provides important algorithmic tools for machine learning applications (including deep learning), leading to new research in such topics as stochastic gradient methods, sparse / regularized optimization methods, enhanced first-order methods, coordinate descent, and approximate second-order methods. Specific recent advances relevant to log-linear modeling include the following.

• Effective optimization approaches, including stochastic gradient and Hessian-free methods.
• Efficient algorithms for regularized optimization problems.
• Bounds for log-linear models and recent convergence results.
• Recognition of modeling equivalences across different areas, such as the equivalence between Gaussian and log-linear models (and between HMMs and HCRFs), and the equivalence between transfer entropy and Granger causality for Gaussian parameters.
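
As a small illustration of the kind of modeling equivalence mentioned in the last item, the sketch below checks numerically a textbook special case: a generative classifier with Gaussian class-conditional densities that share one covariance has exactly the same class posterior as a log-linear (softmax) model with suitably chosen weights and biases. This is not necessarily the specific equivalence the organizers have in mind. The function names, the toy numbers, and the restriction to a diagonal shared covariance (chosen only to avoid a matrix inverse) are assumptions made for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of real-valued scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gaussian_posterior(x, means, variances, priors):
    """p(c | x) by Bayes' rule for Gaussians with a shared diagonal covariance.

    The Gaussian normalizer is the same for every class (shared covariance),
    so it cancels in the posterior and is omitted here.
    """
    log_joint = []
    for mu, prior in zip(means, priors):
        quad = sum((xi - mi) ** 2 / v for xi, mi, v in zip(x, mu, variances))
        log_joint.append(-0.5 * quad + math.log(prior))
    return softmax(log_joint)

def log_linear_params(means, variances, priors):
    """Weights and biases of the equivalent log-linear model.

    w_c = Sigma^{-1} mu_c,  b_c = -0.5 * mu_c^T Sigma^{-1} mu_c + log(prior_c).
    """
    weights = [[mi / v for mi, v in zip(mu, variances)] for mu in means]
    biases = [
        -0.5 * sum(mi * mi / v for mi, v in zip(mu, variances)) + math.log(p)
        for mu, p in zip(means, priors)
    ]
    return weights, biases

def log_linear_posterior(x, weights, biases):
    """p(c | x) proportional to exp(w_c . x + b_c)."""
    scores = [sum(wi * xi for wi, xi in zip(w, x)) + b
              for w, b in zip(weights, biases)]
    return softmax(scores)

# Toy problem (hypothetical numbers): three classes, two features.
means = [[0.0, 1.0], [2.0, -1.0], [-1.5, 0.5]]
variances = [1.0, 2.5]          # diagonal of the shared covariance
priors = [0.5, 0.3, 0.2]
x = [0.7, -0.2]

p_gauss = gaussian_posterior(x, means, variances, priors)
weights, biases = log_linear_params(means, variances, priors)
p_loglin = log_linear_posterior(x, weights, biases)

# Largest absolute difference between the two posteriors.
print(max(abs(a - b) for a, b in zip(p_gauss, p_loglin)))
```

The two posteriors agree because the term that is quadratic in x is identical for every class and cancels in the normalization, leaving scores that are linear in x.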

Though exponential functions and log-linear models are well established, research activity remains intense, due to the central importance of the area in front-line applications and the rapidly expanding size of the data sets to be processed. Fundamental work is needed to transfer algorithmic ideas across different contexts and explore synergies between them, to assimilate the influx of ideas from optimization, to assemble better combinations of algorithmic elements for tackling such key tasks as deep learning, and to explore such key issues as parameter tuning.

The workshop will bring together researchers from the many fields that formulate, use, analyze, and optimize log-linear models, with a view to exposing and studying the issues discussed above.

Topics of possible interest for talks at the workshop include, but are not limited to, the following.

1. Log-linear models.
2. Using equivalences to transfer optimization and modeling methods across different applications and different classes of models.
3. Comparison of optimization / accuracy performance of equivalent model pairs.
4. Convex formulations.
5. Bounds and their applications.
6. Stochastic gradient, first-order, and approximate-second-order methods.
7. Efficient non-Gaussian filtering approaches that exploit the equivalence of Gaussian generative and log-linear models and projection onto an exponential manifold of densities.
8. Graphical and network inference models.
9. Missing data and hidden variables in log-linear modeling.
10. Semi-supervised estimation in log-linear modeling.
11. Sparsity in log-linear models.
12. Block and novel regularization methods for log-linear models.
13. Parallel, distributed and large-scale methods for log-linear models.
14. Information geometry of Gaussian densities and exponential families.
15. Hybrid algorithms that combine different optimization strategies.
16. Connections between log-linear models and deep belief networks.
17. Connections with kernel methods.
18. Applications to speech / natural-language processing and other areas.
19. Empirical contributions that compare and contrast different approaches.
20. Theoretical contributions that relate to any of the above topics.

Author Information

Dimitri Kanevsky (IBM, T.J. Watson Research Center)
Tony Jebara (Netflix)
Li Deng (Microsoft Research, Redmond)
Stephen Wright (UW-Madison)

Steve Wright is a Professor of Computer Sciences at the University of Wisconsin-Madison. His research interests lie in computational optimization and its applications to science and engineering. Prior to joining UW-Madison in 2001, Wright was a Senior Computer Scientist (1997-2001) and Computer Scientist (1990-1997) at Argonne National Laboratory, and Professor of Computer Science at the University of Chicago (2000-2001). He is the past Chair of the Mathematical Optimization Society (formerly the Mathematical Programming Society), the leading professional society in optimization, and a member of the Board of the Society for Industrial and Applied Mathematics (SIAM). Wright is the author or co-author of four widely used books in numerical optimization, including "Primal-Dual Interior-Point Methods" (SIAM, 1997) and "Numerical Optimization" (with J. Nocedal, Second Edition, Springer, 2006). He has also authored over 85 refereed journal papers on optimization theory, algorithms, software, and applications. He is coauthor of widely used interior-point software for linear and quadratic optimization. His recent research includes algorithms, applications, and theory for sparse optimization (including applications in compressed sensing and machine learning).

Georg Heigold (Google)
Avishy Carmi (NTU)
