We give a polynomial-time algorithm for provably learning the structure and parameters of bipartite noisy-or Bayesian networks of binary variables where the top layer is completely hidden. Unsupervised learning of these models is a form of discrete factor analysis, enabling the discovery of hidden variables and their causal relationships with observed data. We obtain an efficient learning algorithm for a family of Bayesian networks that we call quartet-learnable, meaning that every latent variable has four children that do not have any other parents in common. We show that the existence of such a quartet allows us to uniquely identify each latent variable and to learn all parameters involving that latent variable. Underlying our algorithm are two new techniques for structure learning: a quartet test to determine whether a set of binary variables are singly coupled, and a conditional mutual information test that we use to learn parameters. We also show how to subtract already learned latent variables from the model to create new singly-coupled quartets, which substantially expands the class of structures that we can learn. Finally, we give a proof of the polynomial sample complexity of our learning algorithm, and experimentally compare it to variational EM.
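The quartet test described above relies on the moment structure that a single shared noisy-or parent imposes on its children. As a minimal illustration (not the paper's actual test statistic), the sketch below simulates a singly-coupled quartet and checks that the empirical probability of all four children being off matches the closed form implied by one latent parent. All parameter values here are hypothetical.

```python
import random

random.seed(0)

# Hypothetical parameters: one latent variable U with four noisy-or children.
pi = 0.4                          # P(U = 1)
leak = [0.05, 0.10, 0.02, 0.08]   # leak probabilities l_i
fail = [0.30, 0.50, 0.40, 0.20]   # failure probabilities f_i = P(X_i = 0 given U = 1, leak aside)

def sample():
    """Draw one sample from the noisy-or quartet."""
    u = 1 if random.random() < pi else 0
    # Noisy-or: X_i stays 0 with probability (1 - l_i) * f_i**u
    return [0 if random.random() < (1 - leak[i]) * fail[i] ** u else 1
            for i in range(4)]

n = 200_000
data = [sample() for _ in range(n)]

# Empirical negative moment: P(X_0 = X_1 = X_2 = X_3 = 0)
emp = sum(all(x == 0 for x in row) for row in data) / n

# Closed form for a single shared parent:
#   E[prod_i (1 - l_i) * f_i**U] = prod_i (1 - l_i) * ((1 - pi) + pi * prod_i f_i)
prod_leak = 1.0
prod_fail = 1.0
for l, f in zip(leak, fail):
    prod_leak *= 1 - l
    prod_fail *= f
theory = prod_leak * ((1 - pi) + pi * prod_fail)

print(f"empirical = {emp:.4f}, closed form = {theory:.4f}")
```

When the quartet truly is singly coupled, the two quantities agree up to sampling noise; deviations of this kind are what a moment-based quartet test detects.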
Author Information
Yacine Jernite (Facebook FAIR NYC)
Yoni Halpern (Google)
David Sontag (MIT)
More from the Same Authors

2018 Poster: Why Is My Classifier Discriminatory? »
Irene Chen · Fredrik Johansson · David Sontag 
2018 Spotlight: Why Is My Classifier Discriminatory? »
Irene Chen · Fredrik Johansson · David Sontag 
2017 Poster: Causal Effect Inference with Deep Latent-Variable Models »
Christos Louizos · Uri Shalit · Joris M Mooij · David Sontag · Richard Zemel · Max Welling 
2015 Workshop: Machine Learning For Healthcare (MLHC) »
Theofanis Karaletsos · Rajesh Ranganath · Suchi Saria · David Sontag 
2015 Poster: Barrier Frank-Wolfe for Marginal Inference »
Rahul G Krishnan · Simon Lacoste-Julien · David Sontag 
2011 Poster: Complexity of Inference in Latent Dirichlet Allocation »
David Sontag · Daniel Roy 
2011 Spotlight: Complexity of Inference in Latent Dirichlet Allocation »
David Sontag · Daniel Roy 
2010 Spotlight: More data means less inference: A pseudo-max approach to structured learning »
David Sontag · Ofer Meshi · Tommi Jaakkola · Amir Globerson 
2010 Poster: More data means less inference: A pseudo-max approach to structured learning »
David Sontag · Ofer Meshi · Tommi Jaakkola · Amir Globerson 
2009 Workshop: Approximate Learning of Large Scale Graphical Models »
Russ Salakhutdinov · Amir Globerson · David Sontag 
2008 Workshop: Approximate inference - how far have we come? »
Amir Globerson · David Sontag · Tommi Jaakkola 
2008 Poster: Clusters and Coarse Partitions in LP Relaxations »
David Sontag · Amir Globerson · Tommi Jaakkola 
2008 Spotlight: Clusters and Coarse Partitions in LP Relaxations »
David Sontag · Amir Globerson · Tommi Jaakkola 
2007 Oral: New Outer Bounds on the Marginal Polytope »
David Sontag · Tommi Jaakkola 
2007 Poster: New Outer Bounds on the Marginal Polytope »
David Sontag · Tommi Jaakkola