Binary to Bushy: Bayesian Hierarchical Clustering with the Beta Coalescent
Yuening Hu · Jordan Boyd-Graber · Hal Daumé III · Z. Irene Ying

Sun Dec 08 02:00 PM -- 06:00 PM (PST) @ Harrah's Special Events Center, 2nd Floor

Discovering hierarchical regularities in data is a key problem in interacting with large datasets, modeling cognition, and encoding knowledge. A previous Bayesian solution, Kingman's coalescent, provides a convenient probabilistic model for data represented as a binary tree. Unfortunately, this model is inappropriate for data better described by bushier trees. We generalize an existing belief propagation framework for Kingman's coalescent to the beta coalescent, which models a wider range of tree structures. Because inference requires a complex combinatorial search over possible structures, we develop new sampling schemes using sequential Monte Carlo and Dirichlet process mixture models, which render inference efficient and tractable. We present results on both synthetic and real data showing that the beta coalescent outperforms Kingman's coalescent on real datasets and is qualitatively better at capturing data in bushy hierarchies.
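To make the contrast between binary and bushy trees concrete, the following is a minimal sketch of the merge rates of a beta coalescent. It assumes the standard Beta(2 - alpha, alpha) parameterization with 1 < alpha < 2, which this listing does not state and which may differ from the parameterization used in the paper. The function names `beta_coalescent_rate` and `merge_size_distribution` are hypothetical and chosen for illustration only; this is not the authors' inference code.

```python
import math


def beta_coalescent_rate(b, k, alpha):
    """Rate at which one particular group of k out of b lineages merges.

    Assumption: Lambda = Beta(2 - alpha, alpha) with 1 < alpha < 2, giving
    lambda_{b,k} = B(k - alpha, b - k + alpha) / B(2 - alpha, alpha).
    """
    if not 1.0 < alpha < 2.0:
        raise ValueError("alpha must lie strictly between 1 and 2")
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    # Work in log space; every gamma argument is positive for valid inputs.
    log_num = math.lgamma(k - alpha) + math.lgamma(b - k + alpha) - math.lgamma(b)
    log_den = math.lgamma(2.0 - alpha) + math.lgamma(alpha) - math.lgamma(2.0)
    return math.exp(log_num - log_den)


def merge_size_distribution(b, alpha):
    """Probability that the next merge among b lineages joins exactly k of them."""
    weights = {
        k: math.comb(b, k) * beta_coalescent_rate(b, k, alpha)
        for k in range(2, b + 1)
    }
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


if __name__ == "__main__":
    for alpha in (1.2, 1.5, 1.999):
        print(alpha, merge_size_distribution(5, alpha))
```

Under this assumption, as alpha approaches 2 almost all probability falls on k = 2, which recovers the binary merges of Kingman's coalescent, while smaller alpha shifts mass toward merges of three or more lineages and therefore bushier trees.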

Author Information

Yuening Hu (University of Maryland)
Jordan Boyd-Graber (University of Maryland)
Hal Daumé III (University of Maryland - College Park)

Hal Daumé III holds a professor appointment in Computer Science and Language Science at the University of Maryland, and spends time as a principal researcher in the machine learning group and fairness group at Microsoft Research in New York City. He and his wonderful advisees study how to make machines more adept at human language by developing models and algorithms that allow them to learn from data. The two major questions that drive their research these days are: (1) how can we get computers to learn language through natural interaction with people/users? and (2) how can we do this in a way that promotes fairness, transparency, and explainability in the learned models?

Z. Irene Ying (US Department of Agriculture)

More from the Same Authors