We introduce hierarchically supervised latent Dirichlet allocation (HSLDA), a model for hierarchically and multiply labeled bag-of-word data. Examples of such data include web pages and their placement in directories, product descriptions and associated categories from product hierarchies, and free-text clinical records and their assigned diagnosis codes. Out-of-sample label prediction is the primary goal of this work, but improved lower-dimensional representations of the bag-of-word data are also of interest. We demonstrate HSLDA on large-scale data from clinical document labeling and retail product categorization tasks. We show that leveraging the structure from hierarchical labels improves out-of-sample label prediction substantially when compared to models that do not.
Author Information
Adler J Perotte (Columbia University)
Frank Wood (University of British Columbia)
Dr. Wood is an associate professor in the Department of Engineering Science at the University of Oxford. Before that he was an assistant professor of Statistics at Columbia University and a research scientist at the Columbia Center for Computational Learning Systems. He formerly was a postdoctoral fellow of the Gatsby Computational Neuroscience Unit of the University College London. He holds a PhD from Brown University ('07) and BS from Cornell University ('96), both in computer science. Dr. Wood is the original architect of both the Anglican and Probabilistic-C probabilistic programming systems. He conducts AI-driven research at the boundary of probabilistic programming, Bayesian modeling, and Monte Carlo methods. Dr. Wood holds 6 patents, has authored over 50 papers, received the AISTATS best paper award in 2009, and has been awarded faculty research awards from Xerox, Google and Amazon. Prior to his academic career he was a successful entrepreneur having run and sold the content-based image retrieval company ToFish! to AOL/Time Warner and served as CEO of Interfolio.
Noemie Elhadad (Columbia University)
Nicholas Bartlett (Columbia)
More from the Same Authors
- 2021: A Closer Look at Gradient Estimators with Reinforcement Learning as Inference
  Jonathan Lavington · Michael Teng · Mark Schmidt · Frank Wood
- 2018: TBC 1
  Frank Wood
- 2017 Workshop: Deep Learning for Physical Sciences
  Atilim Gunes Baydin · Mr. Prabhat · Kyle Cranmer · Frank Wood
- 2017 Poster: Learning Disentangled Representations with Semi-Supervised Deep Generative Models
  Siddharth Narayanaswamy · Brooks Paige · Jan-Willem van de Meent · Alban Desmaison · Noah Goodman · Pushmeet Kohli · Frank Wood · Philip Torr
- 2016 Poster: Bayesian Optimization for Probabilistic Programs
  Thomas Rainforth · Tuan Anh Le · Jan-Willem van de Meent · Michael A Osborne · Frank Wood
- 2015 Workshop: Black box learning and inference
  Josh Tenenbaum · Jan-Willem van de Meent · Tejas Kulkarni · S. M. Ali Eslami · Brooks Paige · Frank Wood · Zoubin Ghahramani
- 2015 Tutorial: Probabilistic Programming
  Frank Wood
- 2014 Workshop: 3rd NIPS Workshop on Probabilistic Programming
  Daniel Roy · Josh Tenenbaum · Thomas Dietterich · Stuart J Russell · YI WU · Ulrik R Beierholm · Alp Kucukelbir · Zenna Tavares · Yura Perov · Daniel Lee · Brian Ruttenberg · Sameer Singh · Michael Hughes · Marco Gaboardi · Alexey Radul · Vikash Mansinghka · Frank Wood · Sebastian Riedel · Prakash Panangaden
- 2014 Poster: Asynchronous Anytime Sequential Monte Carlo
  Brooks Paige · Frank Wood · Arnaud Doucet · Yee Whye Teh
- 2014 Oral: Asynchronous Anytime Sequential Monte Carlo
  Brooks Paige · Frank Wood · Arnaud Doucet · Yee Whye Teh
- 2010 Spotlight: Probabilistic Deterministic Infinite Automata
  David Pfau · Nicholas Bartlett · Frank Wood
- 2010 Poster: Probabilistic Deterministic Infinite Automata
  David Pfau · Nicholas Bartlett · Frank Wood
- 2008 Poster: Characterizing neural dependencies with Poisson copula models
  Pietro Berkes · Frank Wood · Jonathan W Pillow
- 2008 Spotlight: Characterizing neural dependencies with Poisson copula models
  Pietro Berkes · Frank Wood · Jonathan W Pillow
- 2008 Poster: Dependent Dirichlet Process Spike Sorting
  Jan Gasthaus · Frank Wood · Dilan Gorur · Yee Whye Teh
- 2006 Poster: Particle Filtering for Nonparametric Bayesian Matrix Factorization
  Frank Wood · Tom Griffiths