Poster
Hierarchically Supervised Latent Dirichlet Allocation
Adler J Perotte · Frank Wood · Noemie Elhadad · Nicholas Bartlett

Wed Dec 14th 05:45 -- 11:59 PM

We introduce hierarchically supervised latent Dirichlet allocation (HSLDA), a model for hierarchically and multiply labeled bag-of-word data. Examples of such data include web pages and their placement in directories, product descriptions and associated categories from product hierarchies, and free-text clinical records and their assigned diagnosis codes. Out-of-sample label prediction is the primary goal of this work, but improved lower-dimensional representations of the bag-of-word data are also of interest. We demonstrate HSLDA on large-scale data from clinical document labeling and retail product categorization tasks. We show that leveraging the structure from hierarchical labels improves out-of-sample label prediction substantially when compared to models that do not.

Author Information

Adler J Perotte (Columbia University)
Frank Wood (University of British Columbia)

Dr. Wood is an associate professor in the Department of Engineering Science at the University of Oxford. Before that, he was an assistant professor of Statistics at Columbia University and a research scientist at the Columbia Center for Computational Learning Systems. He was formerly a postdoctoral fellow at the Gatsby Computational Neuroscience Unit of University College London. He holds a PhD from Brown University (’07) and a BS from Cornell University (’96), both in computer science. Dr. Wood is the original architect of both the Anglican and Probabilistic-C probabilistic programming systems. He conducts AI-driven research at the boundary of probabilistic programming, Bayesian modeling, and Monte Carlo methods. Dr. Wood holds 6 patents, has authored over 50 papers, received the AISTATS best paper award in 2009, and has been awarded faculty research awards from Xerox, Google, and Amazon. Prior to his academic career, he was a successful entrepreneur: he ran the content-based image retrieval company ToFish! and sold it to AOL/Time Warner, and he served as CEO of Interfolio.

Noemie Elhadad (Columbia University)
Nicholas Bartlett (Columbia University)

More from the Same Authors