Timezone: »
For many classic structured prediction problems, probability distributions over the dependent variables can be efficiently computed using widely-known algorithms and data structures (such as forward-backward, and its corresponding trellis for exact probability distributions in Markov models). However, we know of no previous work studying efficient representations of exact distributions over clusterings. This paper presents definitions and proofs for a dynamic-programming inference procedure that computes the partition function, the marginal probability of a cluster, and the MAP clustering---all exactly. Rather than the Nth Bell number, these exact solutions take time and space proportional to the substantially smaller powerset of N. Indeed, we improve upon the time complexity of the algorithm introduced by Kohonen and Corander (2016) for this problem by a factor of N. While still large, this previously unknown result is intellectually interesting in its own right, makes feasible exact inference for important real-world small data applications (such as medicine), and provides a natural stepping stone towards sparse-trellis approximations that enable further scalability (which we also explore). In experiments, we demonstrate the superiority of our approach over approximate methods in analyzing real-world gene expression data used in cancer treatment.
Author Information
Craig Greenberg (University of Massachusetts Amherst / NIST)
Nicholas Monath (University of Massachusetts Amherst)
Ari Kobren (UMass Amherst)
Patrick Flaherty (University of Massachusetts, Amherst)
Andrew McGregor (University of Massachusetts Amherst)
Andrew McCallum (UMass Amherst)
More from the Same Authors
-
2020 Poster: Improving Local Identifiability in Probabilistic Box Embeddings »
Shib Dasgupta · Michael Boratko · Dongxu Zhang · Luke Vilnis · Xiang Li · Andrew McCallum -
2019 Workshop: Sets and Partitions »
Nicholas Monath · Manzil Zaheer · Andrew McCallum · Ari Kobren · Junier Oliva · Barnabas Poczos · Ruslan Salakhutdinov -
2019 Poster: Sample Complexity of Learning Mixture of Sparse Linear Regressions »
Akshay Krishnamurthy · Arya Mazumdar · Andrew McGregor · Soumyabrata Pal -
2019 Poster: Search-Guided, Lightly-Supervised Training of Structured Prediction Energy Networks »
Amirmohammad Rooshenas · Dongxu Zhang · Gopal Sharma · Andrew McCallum -
2019 Poster: Offline Contextual Bandits with High Probability Fairness Guarantees »
Blossom Metevier · Stephen Giguere · Sarah Brockman · Ari Kobren · Yuriy Brun · Emma Brunskill · Philip Thomas -
2017 Poster: Active Bias: Training More Accurate Neural Networks by Emphasizing High Variance Samples »
Haw-Shiuan Chang · Erik Learned-Miller · Andrew McCallum -
2014 Workshop: 4th Workshop on Automated Knowledge Base Construction (AKBC) »
Sameer Singh · Fabian M Suchanek · Sebastian Riedel · Partha Pratim Talukdar · Kevin P Murphy · Christopher RĂ© · William Cohen · Tom Mitchell · Andrew McCallum · Jason E Weston · Ramanathan Guha · Boyan Onyshkevych · Hoifung Poon · Oren Etzioni · Ari Kobren · Arvind Neelakantan · Peter Clark -
2012 Poster: MAP Inference in Chains using Column Generation »
David Belanger · Alexandre T Passos · Sebastian Riedel · Andrew McCallum -
2011 Workshop: Big Learning: Algorithms, Systems, and Tools for Learning at Scale »
Joseph E Gonzalez · Sameer Singh · Graham Taylor · James Bergstra · Alice Zheng · Misha Bilenko · Yucheng Low · Yoshua Bengio · Michael Franklin · Carlos Guestrin · Andrew McCallum · Alexander Smola · Michael Jordan · Sugato Basu -
2011 Poster: Query-Aware MCMC »
Michael Wick · Andrew McCallum -
2009 Poster: FACTORIE: Probabilistic Programming via Imperatively Defined Factor Graphs »
Andrew McCallum · Karl Schultz · Sameer Singh -
2009 Poster: Training Factor Graphs with Reinforcement Learning for Efficient MAP Inference »
Michael Wick · Khashayar Rohanimanesh · Sameer Singh · Andrew McCallum -
2009 Spotlight: Training Factor Graphs with Reinforcement Learning for Efficient MAP Inference »
Michael Wick · Khashayar Rohanimanesh · Sameer Singh · Andrew McCallum -
2009 Poster: Rethinking LDA: Why Priors Matter »
Hanna Wallach · David Mimno · Andrew McCallum -
2009 Spotlight: Rethinking LDA: Why Priors Matter »
Hanna Wallach · David Mimno · Andrew McCallum