Learning a Concept Hierarchy from Multi-labeled Documents
Viet-An Nguyen · Jordan Boyd-Graber · Philip Resnik · Jonathan Chang

Mon Dec 08 04:00 PM -- 08:59 PM (PST) @ Level 2, room 210D

While topic models can discover patterns of word usage in large corpora, it is difficult to meld this unsupervised structure with noisy, human-provided labels, especially when the label space is large. In this paper, we present a model, Label to Hierarchy (L2H), that can induce a hierarchy of user-generated labels and the topics associated with those labels from a set of multi-labeled documents. The model is robust enough to account for missing labels from untrained, disparate annotators and provide an interpretable summary of an otherwise unwieldy label set. We show empirically the effectiveness of L2H in predicting held-out words and labels for unseen documents.

Author Information

Viet-An Nguyen (Facebook)
Jordan Boyd-Graber (University of Maryland)
Philip Resnik (University of Maryland)
Jonathan Chang (Facebook)
