Analyzing the Monotonic Feature Abstraction for Text Classification
Doug Downey · Oren Etzioni

Mon Dec 08 08:34 PM -- 08:35 PM (PST)

Is accurate classification possible in the absence of hand-labeled data? This paper introduces the Monotonic Feature (MF) abstraction, in which the probability of class membership increases monotonically with the MF's value. The paper proves that when an MF is given, PAC learning is possible with no hand-labeled data under certain assumptions. We argue that MFs arise naturally in a broad range of textual classification applications. On the classic "20 Newsgroups" data set, a learner given an MF and unlabeled data achieves classification accuracy equal to that of a state-of-the-art semi-supervised learner relying on 160 hand-labeled examples. Even when MFs are not given as input, their presence or absence can be determined from a small amount of hand-labeled data, which yields a new semi-supervised learning method that reduces error by 15% on the 20 Newsgroups data.

Author Information

Doug Downey (Northwestern University)
Oren Etzioni (University of Washington)

More from the Same Authors

  • 2014 Workshop: 4th Workshop on Automated Knowledge Base Construction (AKBC)
    Sameer Singh · Fabian M Suchanek · Sebastian Riedel · Partha Pratim Talukdar · Kevin Murphy · Christopher Ré · William Cohen · Tom Mitchell · Andrew McCallum · Jason E Weston · Ramanathan Guha · Boyan Onyshkevych · Hoifung Poon · Oren Etzioni · Ari Kobren · Arvind Neelakantan · Peter Clark