Poster
A Spectral Algorithm for Latent Dirichlet Allocation
Anima Anandkumar · Dean P Foster · Daniel Hsu · Sham M Kakade · Yi-Kai Liu
Wed Dec 05 07:00 PM -- 12:00 AM (PST) @ Harrah’s Special Events Center 2nd Floor
Topic modeling is a generalization of clustering that posits that observations (words in a document) are generated by multiple latent factors (topics), as opposed to just one. This increased representational power comes at the cost of a more challenging unsupervised learning problem of estimating the topic-word distributions when only words are observed, and the topics are hidden. This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of topic models, including Latent Dirichlet Allocation (LDA). For LDA, the procedure correctly recovers both the topic-word distributions and the parameters of the Dirichlet prior over the topic mixtures, using only trigram statistics (i.e., third-order moments, which may be estimated with documents containing just three words). The method, called Excess Correlation Analysis, is based on a spectral decomposition of low-order moments via two singular value decompositions (SVDs). Moreover, the algorithm is scalable, since the SVDs are carried out only on $k \times k$ matrices, where $k$ is the number of latent factors (topics) and is typically much smaller than the dimension of the observation (word) space.
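The two-SVD idea in the abstract can be illustrated with a minimal sketch on exact population moments, so no sampling noise is involved. Everything named here is an assumption made for illustration: the function names, the toy topic matrix `O`, the Dirichlet parameter `alpha`, the fixed test direction `eta`, and the treatment of the sum of `alpha` as known. The adjusted second- and third-order moments are written from standard Dirichlet moment identities and should be read as part of the sketch, not as a quotation of the paper's algorithm. For brevity, the first SVD below is taken on the full word-by-word matrix, whereas the abstract describes decompositions of $k \times k$ matrices only.

```python
import itertools
import numpy as np


def dirichlet_moments(alpha):
    """Exact first, second and third moments of h ~ Dirichlet(alpha)."""
    k, a0 = len(alpha), alpha.sum()
    m1 = alpha / a0
    m2 = (np.outer(alpha, alpha) + np.diag(alpha)) / (a0 * (a0 + 1))
    m3 = np.zeros((k, k, k))
    for idx in itertools.product(range(k), repeat=3):
        # E[prod_i h_i^{n_i}] = prod_i rising(alpha_i, n_i) / rising(a0, 3)
        num = 1.0
        for a, n in zip(alpha, np.bincount(idx, minlength=k)):
            for r in range(n):
                num *= a + r
        m3[idx] = num / (a0 * (a0 + 1) * (a0 + 2))
    return m1, m2, m3


def adjusted_moments(O, alpha, eta):
    """Population second/third word moments with Dirichlet corrections (assumed form)."""
    a0 = alpha.sum()
    m1, m2, m3 = dirichlet_moments(alpha)
    mu = O @ m1                                            # E[x1]
    E2 = O @ m2 @ O.T                                      # E[x1 x2^T]
    E3 = O @ np.einsum("ijl,l->ij", m3, O.T @ eta) @ O.T   # E[x1 x2^T <eta, x3>]
    em, v = eta @ mu, E2 @ eta
    pairs = E2 - a0 / (a0 + 1) * np.outer(mu, mu)
    triples = (E3
               - a0 / (a0 + 2) * (np.outer(v, mu) + np.outer(mu, v) + em * E2)
               + 2 * a0**2 / ((a0 + 2) * (a0 + 1)) * em * np.outer(mu, mu))
    return pairs, triples


def recover_topics(pairs, triples, k):
    """Whiten with one SVD, then read topics off a second SVD of a k x k matrix."""
    U, s, _ = np.linalg.svd(pairs)
    U, s = U[:, :k], s[:k]
    W = U / np.sqrt(s)                          # W.T @ pairs @ W is the identity
    V, _, _ = np.linalg.svd(W.T @ triples @ W)  # second SVD, k x k
    B = (U * np.sqrt(s)) @ V                    # undo the whitening
    return B / B.sum(axis=0)                    # each topic sums to 1 (also fixes sign)


# Toy problem: 5 words, 3 topics; each column of O is a topic-word distribution.
O = np.array([[0.50, 0.05, 0.10],
              [0.30, 0.10, 0.05],
              [0.10, 0.50, 0.05],
              [0.05, 0.30, 0.20],
              [0.05, 0.05, 0.60]])
alpha = np.array([0.4, 0.9, 0.7])
eta = np.arange(1.0, 6.0)   # fixed direction; it must give distinct <topic, eta> values

pairs, triples = adjusted_moments(O, alpha, eta)
O_hat = recover_topics(pairs, triples, k=3)     # columns of O, in some order

# With the sum of alpha assumed known, alpha follows from the second moment.
P = np.linalg.pinv(O_hat)
a0 = alpha.sum()
alpha_hat = a0 * (a0 + 1) * np.diag(P @ pairs @ P.T)
print(np.round(O_hat, 3))
print(np.round(alpha_hat, 3))
```

The recovered columns match the true topics only up to a permutation, since nothing in the moments fixes an ordering of the topics. With empirical moments estimated from documents, the same steps would return approximations whose accuracy depends on the sample size.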
Author Information
Anima Anandkumar (Caltech)
Dean P Foster (University of Pennsylvania)
Daniel Hsu (Columbia University)
See <https://www.cs.columbia.edu/~djhsu/>
Sham M Kakade (Microsoft Research)
Yi-Kai Liu (National Institute of Standards and Technology)
Related Events (a corresponding poster, oral, or spotlight)
- 2012 Spotlight: A Spectral Algorithm for Latent Dirichlet Allocation
  Thu Dec 6th 01:56 -- 02:00 AM Room Harveys Convention Center Floor, CC
More from the Same Authors
- 2020 Tutorial: (Track3) Policy Optimization in Reinforcement Learning Q&A
  Sham M Kakade · Martha White · Nicolas Le Roux
- 2020 Poster: Ensuring Fairness Beyond the Training Data
  Debmalya Mandal · Samuel Deng · Suman Jana · Jeannette Wing · Daniel Hsu
- 2020 Tutorial: (Track3) Policy Optimization in Reinforcement Learning
  Sham M Kakade · Martha White · Nicolas Le Roux
- 2019 Poster: On the number of variables to use in principal component regression
  Ji Xu · Daniel Hsu
- 2018 Poster: Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate
  Mikhail Belkin · Daniel Hsu · Partha Mitra
- 2018 Poster: Benefits of over-parameterization with EM
  Ji Xu · Daniel Hsu · Arian Maleki
- 2018 Poster: Leveraged volume sampling for linear regression
  Michal Derezinski · Manfred K. Warmuth · Daniel Hsu
- 2018 Spotlight: Leveraged volume sampling for linear regression
  Michal Derezinski · Manfred K. Warmuth · Daniel Hsu
- 2017 Poster: Linear regression without correspondence
  Daniel Hsu · Kevin Shi · Xiaorui Sun
- 2016 Workshop: Learning with Tensors: Why Now and How?
  Anima Anandkumar · Rong Ge · Yan Liu · Maximilian Nickel · Qi (Rose) Yu
- 2016 Workshop: Nonconvex Optimization for Machine Learning: Theory and Practice
  Hossein Mobahi · Anima Anandkumar · Percy Liang · Stefanie Jegelka · Anna Choromanska
- 2016 Poster: Global Analysis of Expectation Maximization for Mixtures of Two Gaussians
  Ji Xu · Daniel Hsu · Arian Maleki
- 2016 Oral: Global Analysis of Expectation Maximization for Mixtures of Two Gaussians
  Ji Xu · Daniel Hsu · Arian Maleki
- 2016 Poster: Online and Differentially-Private Tensor Decomposition
  Yining Wang · Anima Anandkumar
- 2016 Poster: Search Improves Label for Active Learning
  Alina Beygelzimer · Daniel Hsu · John Langford · Chicheng Zhang
- 2015 Workshop: Non-convex Optimization for Machine Learning: Theory and Practice
  Anima Anandkumar · Niranjan Uma Naresh · Kamalika Chaudhuri · Percy Liang · Sewoong Oh
- 2015 Poster: Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path
  Daniel Hsu · Aryeh Kontorovich · Csaba Szepesvari
- 2015 Poster: Fast and Guaranteed Tensor Decomposition via Sketching
  Yining Wang · Hsiao-Yu Tung · Alexander Smola · Anima Anandkumar
- 2015 Poster: Efficient and Parsimonious Agnostic Active Learning
  Tzu-Kuo Huang · Alekh Agarwal · Daniel Hsu · John Langford · Robert Schapire
- 2015 Spotlight: Fast and Guaranteed Tensor Decomposition via Sketching
  Yining Wang · Hsiao-Yu Tung · Alexander Smola · Anima Anandkumar
- 2015 Spotlight: Efficient and Parsimonious Agnostic Active Learning
  Tzu-Kuo Huang · Alekh Agarwal · Daniel Hsu · John Langford · Robert Schapire
- 2014 Poster: Multi-Step Stochastic ADMM in High Dimensions: Applications to Sparse Optimization and Matrix Decomposition
  Hanie Sedghi · Anima Anandkumar · Edmond A Jonckheere
- 2014 Poster: large scale canonical correlation analysis with iterative least squares
  Yichao Lu · Dean P Foster
- 2014 Poster: Scalable Non-linear Learning with Adaptive Polynomial Expansions
  Alekh Agarwal · Alina Beygelzimer · Daniel Hsu · John Langford · Matus J Telgarsky
- 2014 Poster: The Large Margin Mechanism for Differentially Private Maximization
  Kamalika Chaudhuri · Daniel Hsu · Shuang Song
- 2013 Workshop: Workshop on Spectral Learning
  Byron Boots · Daniel Hsu · Borja Balle
- 2013 Workshop: Topic Models: Computation, Application, and Evaluation
  David Mimno · Amr Ahmed · Jordan Boyd-Graber · Ankur Moitra · Hanna Wallach · Alexander Smola · David Blei · Anima Anandkumar
- 2013 Poster: One-shot learning and big data with n=2
  Lee H Dicker · Dean P Foster
- 2013 Poster: New Subsampling Algorithms for Fast Least Squares Regression
  Paramveer Dhillon · Yichao Lu · Dean P Foster · Lyle Ungar
- 2013 Poster: Faster Ridge Regression via the Subsampled Randomized Hadamard Transform
  Yichao Lu · Paramveer Dhillon · Dean P Foster · Lyle Ungar
- 2013 Poster: When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity
  Anima Anandkumar · Daniel Hsu · Majid Janzamin · Sham M Kakade
- 2013 Poster: Contrastive Learning Using Spectral Methods
  James Y Zou · Daniel Hsu · David Parkes · Ryan Adams
- 2012 Poster: Learning Mixtures of Tree Graphical Models
  Anima Anandkumar · Daniel Hsu · Furong Huang · Sham M Kakade
- 2012 Poster: Identifiability and Unmixing of Latent Parse Trees
  Percy Liang · Sham M Kakade · Daniel Hsu
- 2012 Poster: Latent Graphical Model Selection: Efficient Methods for Locally Tree-like Graphs
  Anima Anandkumar · Ragupathyraj Valluvan
- 2011 Poster: Stochastic convex optimization with bandit feedback
  Alekh Agarwal · Dean P Foster · Daniel Hsu · Sham M Kakade · Sasha Rakhlin
- 2011 Poster: Spectral Methods for Learning Multivariate Latent Tree Structure
  Anima Anandkumar · Kamalika Chaudhuri · Daniel Hsu · Sham M Kakade · Le Song · Tong Zhang
- 2011 Poster: Universal low-rank matrix recovery from Pauli measurements
  Yi-Kai Liu
- 2011 Spotlight: Universal low-rank matrix recovery from Pauli measurements
  Yi-Kai Liu
- 2011 Poster: Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression
  Sham M Kakade · Adam Kalai · Varun Kanade · Ohad Shamir
- 2011 Poster: Multi-View Learning of Word Embeddings via CCA
  Paramveer Dhillon · Dean P Foster · Lyle Ungar
- 2010 Spotlight: Learning from Logged Implicit Exploration Data
  Alex Strehl · Lihong Li · John Langford · Sham M Kakade
- 2010 Poster: Learning from Logged Implicit Exploration Data
  Alexander L Strehl · John Langford · Lihong Li · Sham M Kakade
- 2010 Poster: Agnostic Active Learning Without Constraints
  Alina Beygelzimer · Daniel Hsu · John Langford · Tong Zhang
- 2009 Poster: A Parameter-free Hedging Algorithm
  Kamalika Chaudhuri · Yoav Freund · Daniel Hsu
- 2009 Poster: Multi-Label Prediction via Compressed Sensing
  Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang
- 2009 Oral: Multi-Label Prediction via Compressed Sensing
  Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang
- 2008 Poster: Mind the Duality Gap: Logarithmic regret algorithms for online optimization
  Shai Shalev-Shwartz · Sham M Kakade
- 2008 Poster: On the Generalization Ability of Online Strongly Convex Programming Algorithms
  Sham M Kakade · Ambuj Tewari
- 2008 Spotlight: On the Generalization Ability of Online Strongly Convex Programming Algorithms
  Sham M Kakade · Ambuj Tewari
- 2008 Spotlight: Mind the Duality Gap: Logarithmic regret algorithms for online optimization
  Shai Shalev-Shwartz · Sham M Kakade
- 2008 Poster: On the Complexity of Linear Prediction: Risk Bounds, Margin Bounds, and Regularization
  Sham M Kakade · Karthik Sridharan · Ambuj Tewari
- 2007 Spotlight: A general agnostic active learning algorithm
  Sanjoy Dasgupta · Daniel Hsu · Claire Monteleoni
- 2007 Oral: The Price of Bandit Information for Online Optimization
  Varsha Dani · Thomas P Hayes · Sham M Kakade
- 2007 Poster: The Price of Bandit Information for Online Optimization
  Varsha Dani · Thomas P Hayes · Sham M Kakade
- 2007 Poster: A general agnostic active learning algorithm
  Sanjoy Dasgupta · Daniel Hsu · Claire Monteleoni