Timezone: »
Poster
Learning Distributions Generated by One-Layer ReLU Networks
Shanshan Wu · Alexandros Dimakis · Sujay Sanghavi
Tue Dec 10 10:45 AM -- 12:45 PM (PST) @ East Exhibition Hall B + C #49
We consider the problem of estimating the parameters of a $d$-dimensional rectified Gaussian distribution from i.i.d. samples. A rectified Gaussian distribution is defined by passing a standard Gaussian distribution through a one-layer ReLU neural network. We give a simple algorithm to estimate the parameters (i.e., the weight matrix and bias vector of the ReLU neural network) up to an error $\eps\norm{W}_F$ using $\widetilde{O}(1/\eps^2)$ samples and $\widetilde{O}(d^2/\eps^2)$ time (log factors are ignored for simplicity). This implies that we can estimate the distribution up to $\eps$ in total variation distance using $\widetilde{O}(\kappa^2d^2/\eps^2)$ samples, where $\kappa$ is the condition number of the covariance matrix. Our only assumption is that the bias vector is non-negative. Without this non-negativity assumption, we show that estimating the bias vector within any error requires the number of samples at least exponential in the infinity norm of the bias vector. Our algorithm is based on the key observation that vector norms and pairwise angles can be estimated separately. We use a recent result on learning from truncated samples. We also prove two sample complexity lower bounds: $\Omega(1/\eps^2)$ samples are required to estimate the parameters up to error $\eps$, while $\Omega(d/\eps^2)$ samples are necessary to estimate the distribution up to $\eps$ in total variation distance. The first lower bound implies that our algorithm is optimal for parameter estimation. Finally, we show an interesting connection between learning a two-layer generative model and non-negative matrix factorization. Experimental results are provided to support our analysis.
Author Information
Shanshan Wu (University of Texas at Austin)
Here is my homepage: http://wushanshan.github.io/
Alex Dimakis (University of Texas, Austin)
Sujay Sanghavi (UT-Austin)
More from the Same Authors
-
2020 Poster: Implicit Regularization and Convergence for Weight Normalization »
Xiaoxia Wu · Edgar Dobriban · Tongzheng Ren · Shanshan Wu · Zhiyuan Li · Suriya Gunasekar · Rachel Ward · Qiang Liu -
2020 Poster: SMYRF - Efficient Attention using Asymmetric Clustering »
Giannis Daras · Nikita Kitaev · Augustus Odena · Alexandros Dimakis -
2020 Poster: Applications of Common Entropy for Causal Inference »
Murat Kocaoglu · Sanjay Shakkottai · Alexandros Dimakis · Constantine Caramanis · Sriram Vishwanath -
2020 Poster: Exactly Computing the Local Lipschitz Constant of ReLU Networks »
Matt Jordan · Alexandros Dimakis -
2020 Poster: Robust compressed sensing using generative models »
Ajil Jalal · Liu Liu · Alexandros Dimakis · Constantine Caramanis -
2019 Workshop: Information Theory and Machine Learning »
Shengjia Zhao · Jiaming Song · Yanjun Han · Kristy Choi · Pratyusha Kalluri · Ben Poole · Alexandros Dimakis · Jiantao Jiao · Tsachy Weissman · Stefano Ermon -
2019 Workshop: Solving inverse problems with deep networks: New architectures, theoretical foundations, and applications »
Reinhard Heckel · Paul Hand · Richard Baraniuk · Joan Bruna · Alexandros Dimakis · Deanna Needell -
2019 Poster: Interaction Hard Thresholding: Consistent Sparse Quadratic Regression in Sub-quadratic Time and Space »
Shuo Yang · Yanyao Shen · Sujay Sanghavi -
2019 Poster: Inverting Deep Generative models, One layer at a time »
Qi Lei · Ajil Jalal · Inderjit Dhillon · Alexandros Dimakis -
2019 Poster: Provable Certificates for Adversarial Examples: Fitting a Ball in the Union of Polytopes »
Matt Jordan · Justin Lewis · Alexandros Dimakis -
2019 Poster: Primal-Dual Block Generalized Frank-Wolfe »
Qi Lei · JIACHENG ZHUO · Constantine Caramanis · Inderjit Dhillon · Alexandros Dimakis -
2019 Poster: Sparse Logistic Regression Learns All Discrete Pairwise Graphical Models »
Shanshan Wu · Sujay Sanghavi · Alexandros Dimakis -
2019 Spotlight: Sparse Logistic Regression Learns All Discrete Pairwise Graphical Models »
Shanshan Wu · Sujay Sanghavi · Alexandros Dimakis -
2019 Poster: Iterative Least Trimmed Squares for Mixed Linear Regression »
Yanyao Shen · Sujay Sanghavi -
2019 Poster: Blocking Bandits »
Soumya Basu · Rajat Sen · Sujay Sanghavi · Sanjay Shakkottai -
2018 Poster: Experimental Design for Cost-Aware Learning of Causal Graphs »
Erik Lindgren · Murat Kocaoglu · Alexandros Dimakis · Sriram Vishwanath -
2017 Workshop: NIPS Highlights (MLTrain), Learn How to code a paper with state of the art frameworks »
Alexandros Dimakis · Nikolaos Vasiloglou · Guy Van den Broeck · Alexander Ihler · Assaf Araki -
2017 Poster: Streaming Weak Submodularity: Interpreting Neural Networks on the Fly »
Ethan Elenberg · Alexandros Dimakis · Moran Feldman · Amin Karbasi -
2017 Oral: Streaming Weak Submodularity: Interpreting Neural Networks on the Fly »
Ethan Elenberg · Alexandros Dimakis · Moran Feldman · Amin Karbasi -
2017 Poster: Model-Powered Conditional Independence Test »
Rajat Sen · Ananda Theertha Suresh · Karthikeyan Shanmugam · Alexandros Dimakis · Sanjay Shakkottai -
2016 Poster: Leveraging Sparsity for Efficient Submodular Data Summarization »
Erik Lindgren · Shanshan Wu · Alexandros Dimakis -
2016 Poster: Single Pass PCA of Matrix Products »
Shanshan Wu · Srinadh Bhojanapalli · Sujay Sanghavi · Alexandros Dimakis -
2016 Poster: Normalized Spectral Map Synchronization »
Yanyao Shen · Qixing Huang · Nati Srebro · Sujay Sanghavi -
2015 Poster: Orthogonal NMF through Subspace Exploration »
Megasthenis Asteris · Dimitris Papailiopoulos · Alexandros Dimakis -
2015 Poster: Sparse PCA via Bipartite Matchings »
Megasthenis Asteris · Dimitris Papailiopoulos · Anastasios Kyrillidis · Alexandros Dimakis -
2015 Poster: Convergence Rates of Active Learning for Maximum Likelihood Estimation »
Kamalika Chaudhuri · Sham Kakade · Praneeth Netrapalli · Sujay Sanghavi -
2015 Poster: Learning Causal Graphs with Small Interventions »
Karthikeyan Shanmugam · Murat Kocaoglu · Alexandros Dimakis · Sriram Vishwanath -
2014 Poster: Sparse Polynomial Learning and Graph Sketching »
Murat Kocaoglu · Karthikeyan Shanmugam · Alexandros Dimakis · Adam Klivans -
2014 Poster: Non-convex Robust PCA »
Praneeth Netrapalli · Niranjan Uma Naresh · Sujay Sanghavi · Animashree Anandkumar · Prateek Jain -
2014 Poster: On the Information Theoretic Limits of Learning Ising Models »
Rashish Tandon · Karthikeyan Shanmugam · Pradeep Ravikumar · Alexandros Dimakis -
2014 Spotlight: Non-convex Robust PCA »
Praneeth Netrapalli · Niranjan Uma Naresh · Sujay Sanghavi · Animashree Anandkumar · Prateek Jain -
2014 Oral: Sparse Polynomial Learning and Graph Sketching »
Murat Kocaoglu · Karthikeyan Shanmugam · Alexandros Dimakis · Adam Klivans -
2014 Poster: Greedy Subspace Clustering »
Dohyung Park · Constantine Caramanis · Sujay Sanghavi -
2013 Poster: Phase Retrieval using Alternating Minimization »
Praneeth Netrapalli · Prateek Jain · Sujay Sanghavi -
2012 Poster: Clustering Sparse Graphs »
Yudong Chen · Sujay Sanghavi · Huan Xu -
2010 Workshop: Robust Statistical Learning »
Pradeep Ravikumar · Constantine Caramanis · Sujay Sanghavi -
2010 Oral: A Dirty Model for Multi-task Learning »
Ali Jalali · Pradeep Ravikumar · Sujay Sanghavi · Chao Ruan -
2010 Poster: Robust PCA via Outlier Pursuit »
Huan Xu · Constantine Caramanis · Sujay Sanghavi -
2010 Poster: A Dirty Model for Multi-task Learning »
Ali Jalali · Pradeep Ravikumar · Sujay Sanghavi · Chao Ruan -
2007 Spotlight: Message Passing for Max-weight Independent Set »
Sujay Sanghavi · Devavrat Shah · Alan S Willsky -
2007 Poster: Message Passing for Max-weight Independent Set »
Sujay Sanghavi · Devavrat Shah · Alan S Willsky -
2007 Poster: Linear programming analysis of loopy belief propagation for weighted matching »
Sujay Sanghavi · Dmitry Malioutov · Alan S Willsky