

Poster

Hierarchically Supervised Latent Dirichlet Allocation

Adler J Perotte · Frank Wood · Noemie Elhadad · Nicholas Bartlett


Abstract:

We introduce hierarchically supervised latent Dirichlet allocation (HSLDA), a model for hierarchically and multiply labeled bag-of-word data. Examples of such data include web pages and their placement in directories, product descriptions and associated categories from product hierarchies, and free-text clinical records and their assigned diagnosis codes. Out-of-sample label prediction is the primary goal of this work, but improved lower-dimensional representations of the bag-of-word data are also of interest. We demonstrate HSLDA on large-scale data from clinical document labeling and retail product categorization tasks. We show that leveraging the structure from hierarchical labels improves out-of-sample label prediction substantially when compared to models that do not.
