Spotlight
Large Margin Taxonomy Embedding for Document Categorization
Kilian Q Weinberger · Olivier Chapelle

Mon Dec 8th 08:33 -- 08:34 PM

Applications of multi-class classification, such as document categorization, often appear in cost-sensitive settings. Recent work has significantly improved the state of the art by moving beyond "flat" classification through the incorporation of class hierarchies [Cai and Hofmann 04]. We present a novel algorithm that goes beyond hierarchical classification and estimates the latent semantic space that underlies the class hierarchy. In this space, each class is represented by a prototype, and classification is done with the simple nearest-neighbor rule. The optimization of the semantic space incorporates large-margin constraints that ensure that, for each instance, the correct class prototype is closer than any other. We show that our optimization is convex and can be solved efficiently for large data sets. Experiments on the OHSUMED medical journal database yield state-of-the-art results on topic categorization.
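The nearest-prototype rule and the large-margin condition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear map `W`, the helper names (`project`, `nearest_prototype`, `margin_violations`), the unit margin, and the toy numbers are all assumptions made for illustration, and the convex optimization that learns the space is not reproduced here.

```python
from typing import List, Sequence

Vector = List[float]


def project(W: Sequence[Sequence[float]], x: Sequence[float]) -> Vector:
    """Map an input vector into the embedding space with a linear map W (assumed form)."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_prototype(z: Sequence[float], prototypes: Sequence[Sequence[float]]) -> int:
    """Return the index of the class prototype closest to the embedded point z."""
    dists = [sq_dist(z, p) for p in prototypes]
    return min(range(len(prototypes)), key=dists.__getitem__)


def margin_violations(
    z: Sequence[float],
    label: int,
    prototypes: Sequence[Sequence[float]],
    margin: float = 1.0,  # hypothetical fixed margin, chosen for illustration
) -> float:
    """Sum of hinge penalties for every wrong prototype that is not at least
    `margin` farther from z (in squared distance) than the correct prototype."""
    d_true = sq_dist(z, prototypes[label])
    return sum(
        max(0.0, margin + d_true - sq_dist(z, p))
        for c, p in enumerate(prototypes)
        if c != label
    )


# Toy data (invented): three classes with 2-D prototypes, one 3-D input.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
prototypes = [[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]]
x = [2.5, 0.2, 9.0]

z = project(W, x)
predicted = nearest_prototype(z, prototypes)
loss = margin_violations(z, 1, prototypes)
```

In this toy setup the embedded point lies closest to the second prototype, so labeling it with class index 1 incurs no margin penalty, while any other label would. A cost-sensitive variant could replace the fixed margin with a per-class-pair cost, which is again an assumption rather than something stated in the abstract.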

Author Information

Kilian Q Weinberger (Washington University in St. Louis)
Olivier Chapelle (Google)
