Poster
List-Decodable Mean Estimation in Nearly-PCA Time
Ilias Diakonikolas · Daniel Kane · Daniel Kongsgaard · Jerry Li · Kevin Tian
Robust statistics has traditionally focused on designing estimators tolerant to a minority of contaminated data. List-decodable learning [CharikarSV17] studies the more challenging regime where only a minority $\tfrac{1}{k}$ fraction of the dataset, $k \geq 2$, is drawn from the distribution of interest, and no assumptions are made on the remaining data. We study the fundamental task of list-decodable mean estimation in high dimensions. Our main result is a new algorithm for bounded covariance distributions with optimal sample complexity and near-optimal error guarantee, running in nearly-PCA time. Assuming the ground truth distribution on $\mathbb{R}^d$ has identity-bounded covariance, our algorithm outputs $O(k)$ candidate means, one of which is within distance $O(\sqrt{k\log k})$ from the truth. Our algorithm runs in time $\widetilde{O}(ndk)$, where $n$ is the dataset size. This runtime nearly matches the cost of performing $k$-PCA on the data, a natural bottleneck of known algorithms for (very) special cases of our problem, such as clustering well-separated mixtures. Prior to our work, the fastest runtimes were $\widetilde{O}(n^2 d k^2)$ [DiakonikolasKK20] and $\widetilde{O}(nd k^C)$ [CherapanamjeriMY20] for an unspecified constant $C \geq 6$. Our approach builds on a novel soft downweighting method we term SIFT, arguably the simplest known polynomial-time mean estimator in the list-decodable setting. To develop our fast algorithms, we reduce the computational cost of SIFT via a careful "win-win-win" analysis of an approximate Ky Fan matrix multiplicative weights procedure we develop, which may be of independent interest.
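To make the problem setup concrete, the following is a minimal Python sketch of the list-decodable regime only; it is not SIFT or any algorithm from the paper. It assumes a hypothetical adversary that plants $k-1$ decoy clusters, each as large as the inlier group, and compares the plain empirical mean against an oracle list with one candidate per group. The names `sample_dataset` and `demo`, the decoy offset of 50, and the oracle use of group labels are all illustrative assumptions.

```python
import math
import random


def sample_dataset(n, d, k, true_mean, seed=0):
    """Return (points, labels): n // k inliers plus k - 1 planted decoy clusters."""
    rng = random.Random(seed)
    per_group = n // k
    points, labels = [], []
    for group in range(k):
        center = list(true_mean)
        if group > 0:
            # Hypothetical adversary: shift the decoy cluster far along one axis.
            center[(group - 1) % d] += 50.0
        for _ in range(per_group):
            # Unit-variance Gaussian noise per coordinate (identity covariance).
            points.append([c + rng.gauss(0.0, 1.0) for c in center])
            labels.append(group)
    return points, labels


def mean(points):
    """Coordinate-wise average of a non-empty list of points."""
    d = len(points[0])
    return [sum(p[j] for p in points) / len(points) for j in range(d)]


def dist(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def demo(n=400, d=5, k=4, seed=0):
    """Compare the naive mean with an oracle list of k candidate means."""
    true_mean = [0.0] * d
    points, labels = sample_dataset(n, d, k, true_mean, seed)
    naive_error = dist(mean(points), true_mean)
    # Oracle list: uses group labels that a real algorithm never observes.
    candidates = [
        mean([p for p, g in zip(points, labels) if g == grp]) for grp in range(k)
    ]
    best_error = min(dist(c, true_mean) for c in candidates)
    return naive_error, len(candidates), best_error


if __name__ == "__main__":
    naive_error, num_candidates, best_error = demo()
    print("naive mean error:", naive_error)
    print("candidates:", num_candidates, "best candidate error:", best_error)
```

In this toy instance the single empirical mean is pulled far from the truth by the decoys, while the list contains one candidate close to it. Because every group looks equally plausible as the inlier set, no single output can be correct in general, which is why the guarantee is stated for a list of candidates.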
Author Information
Ilias Diakonikolas (UW Madison)
Daniel Kane (University of California, San Diego)
Daniel Kongsgaard (UCSD)
Jerry Li (Microsoft)
Kevin Tian (Stanford University)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 Spotlight: List-Decodable Mean Estimation in Nearly-PCA Time »
More from the Same Authors
-
2021 Spotlight: Forster Decomposition and Learning Halfspaces with Noise »
Ilias Diakonikolas · Daniel Kane · Christos Tzamos -
2021 Spotlight: Statistical Query Lower Bounds for List-Decodable Linear Regression »
Ilias Diakonikolas · Daniel Kane · Ankit Pensia · Thanasis Pittas · Alistair Stewart -
2022 : Semi-Random Sparse Recovery in Nearly-Linear Time »
Jonathan Kelner · Jerry Li · Allen Liu · Aaron Sidford · Kevin Tian -
2022 : Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions »
Sitan Chen · Sinho Chewi · Jerry Li · Yuanzhi Li · Adil Salim · Anru Zhang -
2022 : REAP: A Large-Scale Realistic Adversarial Patch Benchmark »
Nabeel Hingun · Chawin Sitawarin · Jerry Li · David Wagner -
2023 Poster: Structured Semidefinite Programming for Recovering Structured Preconditioners »
Arun Jambulapati · Jerry Li · Christopher Musco · Kirankumar Shiragur · Aaron Sidford · Kevin Tian -
2022 Poster: Robust Model Selection and Nearly-Proper Learning for GMMs »
Allen Liu · Jerry Li · Ankur Moitra -
2022 Poster: Learning (Very) Simple Generative Models Is Hard »
Sitan Chen · Jerry Li · Yuanzhi Li -
2021 Poster: ReLU Regression with Massart Noise »
Ilias Diakonikolas · Jong Ho Park · Christos Tzamos -
2021 Poster: Lower Bounds on Metropolized Sampling Methods for Well-Conditioned Distributions »
Yin Tat Lee · Ruoqi Shen · Kevin Tian -
2021 Poster: Statistical Query Lower Bounds for List-Decodable Linear Regression »
Ilias Diakonikolas · Daniel Kane · Ankit Pensia · Thanasis Pittas · Alistair Stewart -
2021 Poster: Forster Decomposition and Learning Halfspaces with Noise »
Ilias Diakonikolas · Daniel Kane · Christos Tzamos -
2021 Poster: Robust Regression Revisited: Acceleration and Improved Estimation Rates »
Arun Jambulapati · Jerry Li · Tselil Schramm · Kevin Tian -
2021 Oral: Lower Bounds on Metropolized Sampling Methods for Well-Conditioned Distributions »
Yin Tat Lee · Ruoqi Shen · Kevin Tian -
2020 Poster: List-Decodable Mean Estimation via Iterative Multi-Filtering »
Ilias Diakonikolas · Daniel Kane · Daniel Kongsgaard -
2020 Poster: Acceleration with a Ball Optimization Oracle »
Yair Carmon · Arun Jambulapati · Qijia Jiang · Yujia Jin · Yin Tat Lee · Aaron Sidford · Kevin Tian -
2020 Oral: Acceleration with a Ball Optimization Oracle »
Yair Carmon · Arun Jambulapati · Qijia Jiang · Yujia Jin · Yin Tat Lee · Aaron Sidford · Kevin Tian -
2020 Poster: Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication Time »
Jerry Li · Guanghao Ye -
2020 Poster: Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing »
Arun Jambulapati · Jerry Li · Kevin Tian -
2020 Spotlight: Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing »
Arun Jambulapati · Jerry Li · Kevin Tian -
2020 Poster: Near-Optimal SQ Lower Bounds for Agnostically Learning Halfspaces and ReLUs under Gaussian Marginals »
Ilias Diakonikolas · Daniel Kane · Nikos Zarifis -
2020 Poster: Non-Convex SGD Learns Halfspaces with Adversarial Label Noise »
Ilias Diakonikolas · Vasilis Kontonis · Christos Tzamos · Nikos Zarifis -
2020 Poster: Robust and Heavy-Tailed Mean Estimation Made Simple, via Regret Minimization »
Sam Hopkins · Jerry Li · Fred Zhang -
2020 Poster: Learning Structured Distributions From Untrusted Batches: Faster and Simpler »
Sitan Chen · Jerry Li · Ankur Moitra -
2020 Poster: The Complexity of Adversarially Robust Proper Learning of Halfspaces with Agnostic Noise »
Ilias Diakonikolas · Daniel M. Kane · Pasin Manurangsi -
2020 Poster: Outlier Robust Mean Estimation with Subgaussian Rates via Stability »
Ilias Diakonikolas · Daniel M. Kane · Ankit Pensia -
2019 Poster: Private Testing of Distributions via Sample Permutations »
Maryam Aliakbarpour · Ilias Diakonikolas · Daniel Kane · Ronitt Rubinfeld -
2019 Poster: Nearly Tight Bounds for Robust Proper Learning of Halfspaces with a Margin »
Ilias Diakonikolas · Daniel Kane · Pasin Manurangsi -
2019 Poster: Variance Reduction for Matrix Games »
Yair Carmon · Yujia Jin · Aaron Sidford · Kevin Tian -
2019 Poster: Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers »
Hadi Salman · Jerry Li · Ilya Razenshteyn · Pengchuan Zhang · Huan Zhang · Sebastien Bubeck · Greg Yang -
2019 Spotlight: Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers »
Hadi Salman · Jerry Li · Ilya Razenshteyn · Pengchuan Zhang · Huan Zhang · Sebastien Bubeck · Greg Yang -
2019 Oral: Variance Reduction for Matrix Games »
Yair Carmon · Yujia Jin · Aaron Sidford · Kevin Tian -
2019 Poster: Distribution-Independent PAC Learning of Halfspaces with Massart Noise »
Ilias Diakonikolas · Themis Gouleakis · Christos Tzamos -
2019 Poster: Equipping Experts/Bandits with Long-term Memory »
Kai Zheng · Haipeng Luo · Ilias Diakonikolas · Liwei Wang -
2019 Spotlight: Nearly Tight Bounds for Robust Proper Learning of Halfspaces with a Margin »
Ilias Diakonikolas · Daniel Kane · Pasin Manurangsi -
2019 Oral: Distribution-Independent PAC Learning of Halfspaces with Massart Noise »
Ilias Diakonikolas · Themis Gouleakis · Christos Tzamos -
2019 Poster: Outlier-Robust High-Dimensional Sparse Estimation via Iterative Filtering »
Ilias Diakonikolas · Daniel Kane · Sushrut Karmalkar · Eric Price · Alistair Stewart -
2019 Poster: A Polynomial Time Algorithm for Log-Concave Maximum Likelihood via Locally Exponential Families »
Brian Axelrod · Ilias Diakonikolas · Alistair Stewart · Anastasios Sidiropoulos · Gregory Valiant -
2019 Poster: A Direct $\tilde{O}(1/\epsilon)$ Iteration Parallel Algorithm for Optimal Transport »
Arun Jambulapati · Aaron Sidford · Kevin Tian -
2019 Poster: Quantum Entropy Scoring for Fast Robust Mean Estimation and Improved Outlier Detection »
Yihe Dong · Samuel Hopkins · Jerry Li -
2019 Spotlight: Quantum Entropy Scoring for Fast Robust Mean Estimation and Improved Outlier Detection »
Yihe Dong · Samuel Hopkins · Jerry Li -
2017 Poster: Learning Populations of Parameters »
Kevin Tian · Weihao Kong · Gregory Valiant