We present a generative probabilistic model for learning general graph structures, which we term concept graphs, from text. Concept graphs provide a visual summary of the thematic content of a collection of documents, a task that is difficult to accomplish using only keyword search. The proposed model can learn different types of concept graph structures and is capable of utilizing partial prior knowledge about graph structure as well as labeled documents. We describe a generative model that is based on a stick-breaking process for graphs, and a Markov chain Monte Carlo inference procedure. Experiments on simulated data show that the model can recover known graph structure when learning in both unsupervised and semi-supervised modes. We also show that the proposed model is competitive in terms of empirical log likelihood with existing structure-based topic models (such as hPAM and hLDA) on real-world text data sets. Finally, we illustrate the application of the model to the problem of updating Wikipedia category graphs.
Author Information
America Chambers (Univ. of California, Irvine)
Padhraic Smyth (University of California, Irvine)
Mark Steyvers (UC Irvine)
Related Events (a corresponding poster, oral, or spotlight)
- 2010 Spotlight: Learning concept graphs from text with stick-breaking priors
  Tue. Dec 7th 11:25 -- 11:30 PM Room Regency Ballroom
More from the Same Authors
- 2022: Probabilistic Querying of Continuous-Time Sequential Events
  Alex Boyd · Yuxin Chang · Stephan Mandt · Padhraic Smyth
- 2022 Poster: Predictive Querying for Autoregressive Neural Sequence Models
  Alex Boyd · Samuel Showalter · Stephan Mandt · Padhraic Smyth
- 2021 Poster: Detecting and Adapting to Irregular Distribution Shifts in Bayesian Online Learning
  Aodong Li · Alex Boyd · Padhraic Smyth · Stephan Mandt
- 2021 Poster: Combining Human Predictions with Model Probabilities via Confusion Matrices and Calibration
  Gavin Kerrigan · Padhraic Smyth · Mark Steyvers
- 2020 Poster: Can I Trust My Fairness Metric? Assessing Fairness with Unlabeled Data and Bayesian Inference
  Disi Ji · Padhraic Smyth · Mark Steyvers
- 2020 Poster: User-Dependent Neural Sequence Models for Continuous-Time Event Data
  Alex Boyd · Robert Bamler · Stephan Mandt · Padhraic Smyth
- 2017: Coffee break and Poster Session II
  Mohamed Kane · Albert Haque · Vagelis Papalexakis · John Guibas · Peter Li · Carlos Arias · Eric Nalisnick · Padhraic Smyth · Frank Rudzicz · Xia Zhu · Theodore Willke · Noemie Elhadad · Hans Raffauf · Harini Suresh · Paroma Varma · Yisong Yue · Ognjen (Oggi) Rudovic · Luca Foschini · Syed Rameel Ahmad · Hasham ul Haq · Valerio Maggio · Giuseppe Jurman · Sonali Parbhoo · Pouya Bashivan · Jyoti Islam · Mirco Musolesi · Chris Wu · Alexander Ratner · Jared Dunnmon · Cristóbal Esteban · Aram Galstyan · Greg Ver Steeg · Hrant Khachatrian · Marc Górriz · Mihaela van der Schaar · Anton Nemchenko · Manasi Patwardhan · Tanay Tandon
- 2016 Workshop: Towards an Artificial Intelligence for Data Science
  Charles Sutton · James Geddes · Zoubin Ghahramani · Padhraic Smyth · Chris Williams
- 2013 Poster: Scoring Workers in Crowdsourcing: How Many Control Questions are Enough?
  Qiang Liu · Alexander Ihler · Mark Steyvers
- 2013 Spotlight: Scoring Workers in Crowdsourcing: How Many Control Questions are Enough?
  Qiang Liu · Alexander Ihler · Mark Steyvers
- 2012 Workshop: Algorithmic and Statistical Approaches for Large Social Network Data Sets
  Michael Goodrich · Pavel N Krivitsky · David M Mount · Christopher DuBois · Padhraic Smyth
- 2011 Oral: Continuous-Time Regression Models for Longitudinal Networks
  Duy Q Vu · Arthur Asuncion · David Hunter · Padhraic Smyth
- 2011 Poster: Continuous-Time Regression Models for Longitudinal Networks
  Duy Q Vu · Arthur Asuncion · David Hunter · Padhraic Smyth
- 2009 Poster: Particle-based Variational Inference for Continuous Systems
  Alexander Ihler · Andrew Frank · Padhraic Smyth
- 2009 Poster: The Wisdom of Crowds in the Recollection of Order Information
  Mark Steyvers · Michael D Lee · Brent Miller · Pernille Hemmer
- 2008 Poster: Asynchronous Distributed Learning of Topic Models
  Arthur Asuncion · Padhraic Smyth · Max Welling
- 2007 Spotlight: Distributed Inference for Latent Dirichlet Allocation
  David Newman · Arthur Asuncion · Padhraic Smyth · Max Welling
- 2007 Poster: Distributed Inference for Latent Dirichlet Allocation
  David Newman · Arthur Asuncion · Padhraic Smyth · Max Welling
- 2006 Poster: Modeling General and Specific Aspects of Documents with a Probabilistic Topic Model
  Chaitanya Chemudugunta · Padhraic Smyth · Mark Steyvers
- 2006 Poster: Learning Time-Intensity Profiles of Human Activity using Non-Parametric Bayesian Models
  Alexander Ihler · Padhraic Smyth
- 2006 Poster: Hierarchical Dirichlet Processes with Random Effects
  Seyoung Kim · Padhraic Smyth