Spotlight
A Bayesian LDA-based model for semi-supervised part-of-speech tagging
Kristina N Toutanova · Mark Johnson
We present a novel Bayesian statistical model for semi-supervised part-of-speech tagging. Our model extends the Latent Dirichlet Allocation (LDA) model and incorporates the intuition that words' distributions over tags, p(t|w), are sparse. In addition, we introduce a model for determining the set of possible tags of a word, which captures important dependencies in the ambiguity classes of words. Our model outperforms the best previously proposed model for this task on a standard dataset.
Author Information
Kristina N Toutanova (Microsoft Research)
Mark Johnson (Macquarie University)
Related Events (a corresponding poster, oral, or spotlight)
- 2007 Poster: A Bayesian LDA-based model for semi-supervised part-of-speech tagging
  Mon Dec 3rd 06:30 -- 06:40 PM
More from the Same Authors
- 2018 Poster: Partially-Supervised Image Captioning
  Peter Anderson · Stephen Gould · Mark Johnson
- 2010 Spotlight: Synergies in learning words and their referents
  Mark Johnson · Katherine Demuth · Michael C Frank · Bevan K Jones
- 2010 Poster: Synergies in learning words and their referents
  Mark Johnson · Katherine Demuth · Michael C Frank · Bevan K Jones
- 2006 Poster: Adaptor Grammars: A Framework for Specifying Compositional Nonparametric Bayesian Mod
  Mark Johnson · Tom Griffiths · Sharon Goldwater