Timezone: »
We study fast algorithms for statistical regression problems under the strong contamination model, where the goal is to approximately optimize a generalized linear model (GLM) given adversarially corrupted samples. Prior works in this line of research were based on the \emph{robust gradient descent} framework of \cite{PrasadSBR20}, a first-order method using biased gradient queries, or the \emph{Sever} framework of \cite{DiakonikolasKK019}, an iterative outlier-removal method calling a stationary point finder. We present nearly-linear time algorithms for robust regression problems with improved runtime or estimation guarantees compared to the state-of-the-art. For the general case of smooth GLMs (e.g.\ logistic regression), we show that the robust gradient descent framework of \cite{PrasadSBR20} can be \emph{accelerated}, and show our algorithm extends to optimizing the Moreau envelopes of Lipschitz GLMs (e.g.\ support vector machines), answering several open questions in the literature. For the well-studied case of robust linear regression, we present an alternative approach obtaining improved estimation rates over prior nearly-linear time algorithms. Interestingly, our algorithm starts with an identifiability proof introduced in the context of the sum-of-squares algorithm of \cite{BakshiP21}, which achieved optimal error rates while requiring large polynomial runtime and sample complexity. We reinterpret their proof within the Sever framework and obtain a dramatically faster and more sample-efficient algorithm under fewer distributional assumptions.
Author Information
Arun Jambulapati (Stanford University)
Jerry Li (Microsoft)
Tselil Schramm (Stanford University)
Kevin Tian (Stanford University)
More from the Same Authors
-
2021 Spotlight: List-Decodable Mean Estimation in Nearly-PCA Time »
Ilias Diakonikolas · Daniel Kane · Daniel Kongsgaard · Jerry Li · Kevin Tian -
2022 : Semi-Random Sparse Recovery in Nearly-Linear Time »
Jonathan Kelner · Jerry Li · Allen Liu · Aaron Sidford · Kevin Tian -
2022 : Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions »
Sitan Chen · Sinho Chewi · Jerry Li · Yuanzhi Li · Adil Salim · Anru Zhang -
2022 : REAP: A Large-Scale Realistic Adversarial Patch Benchmark »
Nabeel Hingun · Chawin Sitawarin · Jerry Li · David Wagner -
2022 Poster: The Franz-Parisi Criterion and Computational Trade-offs in High Dimensional Statistics »
Afonso S Bandeira · Ahmed El Alaoui · Samuel Hopkins · Tselil Schramm · Alexander S Wein · Ilias Zadik -
2022 Poster: Robust Model Selection and Nearly-Proper Learning for GMMs »
Allen Liu · Jerry Li · Ankur Moitra -
2022 Poster: Learning (Very) Simple Generative Models Is Hard »
Sitan Chen · Jerry Li · Yuanzhi Li -
2021 Poster: Lower Bounds on Metropolized Sampling Methods for Well-Conditioned Distributions »
Yin Tat Lee · Ruoqi Shen · Kevin Tian -
2021 Poster: Stochastic Bias-Reduced Gradient Methods »
Hilal Asi · Yair Carmon · Arun Jambulapati · Yujia Jin · Aaron Sidford -
2021 Poster: List-Decodable Mean Estimation in Nearly-PCA Time »
Ilias Diakonikolas · Daniel Kane · Daniel Kongsgaard · Jerry Li · Kevin Tian -
2021 Oral: Lower Bounds on Metropolized Sampling Methods for Well-Conditioned Distributions »
Yin Tat Lee · Ruoqi Shen · Kevin Tian -
2020 Poster: Acceleration with a Ball Optimization Oracle »
Yair Carmon · Arun Jambulapati · Qijia Jiang · Yujia Jin · Yin Tat Lee · Aaron Sidford · Kevin Tian -
2020 Oral: Acceleration with a Ball Optimization Oracle »
Yair Carmon · Arun Jambulapati · Qijia Jiang · Yujia Jin · Yin Tat Lee · Aaron Sidford · Kevin Tian -
2020 Poster: Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication Time »
Jerry Li · Guanghao Ye -
2020 Poster: Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing »
Arun Jambulapati · Jerry Li · Kevin Tian -
2020 Spotlight: Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing »
Arun Jambulapati · Jerry Li · Kevin Tian -
2020 Poster: Robust and Heavy-Tailed Mean Estimation Made Simple, via Regret Minimization »
Sam Hopkins · Jerry Li · Fred Zhang -
2020 Poster: Learning Structured Distributions From Untrusted Batches: Faster and Simpler »
Sitan Chen · Jerry Li · Ankur Moitra -
2019 Poster: Variance Reduction for Matrix Games »
Yair Carmon · Yujia Jin · Aaron Sidford · Kevin Tian -
2019 Poster: Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers »
Hadi Salman · Jerry Li · Ilya Razenshteyn · Pengchuan Zhang · Huan Zhang · Sebastien Bubeck · Greg Yang -
2019 Spotlight: Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers »
Hadi Salman · Jerry Li · Ilya Razenshteyn · Pengchuan Zhang · Huan Zhang · Sebastien Bubeck · Greg Yang -
2019 Oral: Variance Reduction for Matrix Games »
Yair Carmon · Yujia Jin · Aaron Sidford · Kevin Tian -
2019 Poster: A Direct tilde{O}(1/epsilon) Iteration Parallel Algorithm for Optimal Transport »
Arun Jambulapati · Aaron Sidford · Kevin Tian -
2019 Poster: Quantum Entropy Scoring for Fast Robust Mean Estimation and Improved Outlier Detection »
Yihe Dong · Samuel Hopkins · Jerry Li -
2019 Spotlight: Quantum Entropy Scoring for Fast Robust Mean Estimation and Improved Outlier Detection »
Yihe Dong · Samuel Hopkins · Jerry Li -
2017 Poster: Learning Populations of Parameters »
Kevin Tian · Weihao Kong · Gregory Valiant