Poster
Coresets for Regressions with Panel Data
Lingxiao Huang · K Sudhir · Nisheeth Vishnoi
A panel dataset contains features or observations for multiple individuals over multiple time periods, and regression problems with panel data are common in statistics and applied ML. When dealing with massive datasets, coresets have emerged as a valuable tool from a computational, storage and privacy perspective, as one needs to work with and share much smaller datasets. However, results on coresets for regression problems have so far been available only for cross-sectional data ($N$ individuals each observed for a single time unit) or longitudinal data (a single individual observed for $T>1$ time units); there are no results for panel data ($N>1$, $T>1$). This paper introduces the problem of coresets to panel data settings: we first define coresets for several variants of regression problems with panel data and then present efficient algorithms to construct coresets whose sizes are independent of $N$ and $T$ and depend only polynomially on $1/\varepsilon$ (where $\varepsilon$ is the error parameter) and the number of regression parameters. Our approach is based on the Feldman-Langberg framework, in which a key step is to upper bound the “total sensitivity”, which is roughly the sum of the maximum influences of all individual-time pairs taken over all possible choices of regression parameters. Empirically, we assess our approach on a synthetic and a real-world dataset; the coresets constructed using our approach are much smaller than the full dataset, and they indeed reduce the running time of computing the regression objective.
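To make the sensitivity-sampling idea concrete, the following is a minimal sketch and not the paper's algorithm. It assumes the simplest variant, plain least squares with no error correlation across time, so the panel can be flattened into $N \cdot T$ rows, one per individual-time pair. It uses leverage scores of the augmented matrix $[X \mid y]$ as a standard upper bound on each row's sensitivity; their sum equals the rank of that matrix, so the total sensitivity is at most $d+1$. All function names, the synthetic data, and the coreset size `m` are hypothetical choices made for illustration.

```python
import numpy as np

def leverage_scores(A):
    """Squared row norms of Q from a thin QR of A; they sum to rank(A)."""
    Q, _ = np.linalg.qr(A)
    return np.sum(Q ** 2, axis=1)

def sensitivity_coreset(X, y, m, rng):
    """Sample m rows with probability proportional to a sensitivity bound.

    Assumption: leverage scores of [X | y] bound each row's maximum share
    of the squared-error objective over all regression parameters.
    """
    A = np.column_stack([X, y])
    s = leverage_scores(A)
    p = s / s.sum()
    idx = rng.choice(A.shape[0], size=m, replace=True, p=p)
    w = 1.0 / (m * p[idx])  # weights that make the weighted cost unbiased
    return idx, w, s

def cost(X, y, beta, w=None):
    """(Weighted) sum of squared residuals."""
    r2 = (X @ beta - y) ** 2
    return float(r2.sum() if w is None else (w * r2).sum())

# Hypothetical synthetic panel: N individuals, T periods, d features.
rng = np.random.default_rng(0)
N, T, d = 200, 50, 3
X_panel = rng.normal(size=(N, T, d))
beta_true = np.array([1.0, -2.0, 0.5])
y_panel = X_panel @ beta_true + rng.normal(scale=0.1, size=(N, T))

# Flatten so that each individual-time pair becomes one row.
X = X_panel.reshape(N * T, d)
y = y_panel.reshape(N * T)

idx, w, s = sensitivity_coreset(X, y, m=2000, rng=rng)

# Compare the full objective with the coreset objective at an arbitrary beta.
beta = rng.normal(size=d)
full = cost(X, y, beta)
approx = cost(X[idx], y[idx], beta, w)
rel_err = abs(approx - full) / full
print(f"total sensitivity bound: {s.sum():.3f}, relative error: {rel_err:.4f}")
```

The sketch evaluates the objective on 2,000 weighted rows instead of 10,000. The variants with correlated errors over time do not decompose row by row in this way, so they would need a different sensitivity bound than the one assumed here.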
Author Information
Lingxiao Huang (Huawei TCS Lab)
K Sudhir (Yale University)
Nisheeth Vishnoi (Yale University)
More from the Same Authors
- 2021 Spotlight: Coresets for Time Series Clustering
  Lingxiao Huang · K Sudhir · Nisheeth Vishnoi
- 2023 Poster: Sampling from Structured Log-Concave Distributions via a Soft-Threshold Dikin Walk
  Oren Mangoubi · Nisheeth Vishnoi
- 2023 Poster: Bias in Evaluation Processes: An Optimization-Based Model
  L. Elisa Celis · Amit Kumar · Anay Mehrotra · Nisheeth Vishnoi
- 2022 Spotlight: Lightning Talks 2A-2
  Harikrishnan N B · Jianhao Ding · Juha Harviainen · Yizhen Wang · Lue Tao · Oren Mangoubi · Tong Bu · Nisheeth Vishnoi · Mohannad Alhanahnah · Mikko Koivisto · Aditi Kathpalia · Lei Feng · Nithin Nagaraj · Hongxin Wei · Xiaozhu Meng · Petteri Kaski · Zhaofei Yu · Tiejun Huang · Ke Wang · Jinfeng Yi · Jian Liu · Sheng-Jun Huang · Mihai Christodorescu · Songcan Chen · Somesh Jha
- 2022 Spotlight: Re-Analyze Gauss: Bounds for Private Matrix Approximation via Dyson Brownian Motion
  Oren Mangoubi · Nisheeth Vishnoi
- 2022 Spotlight: Sampling from Log-Concave Distributions with Infinity-Distance Guarantees
  Oren Mangoubi · Nisheeth Vishnoi
- 2022 Spotlight: Lightning Talks 2A-1
  Caio Kalil Lauand · Ryan Strauss · Yasong Feng · lingyu gu · Alireza Fathollah Pour · Oren Mangoubi · Jianhao Ma · Binghui Li · Hassan Ashtiani · Yongqi Du · Salar Fattahi · Sean Meyn · Jikai Jin · Nisheeth Vishnoi · zengfeng Huang · Junier B Oliva · yuan zhang · Han Zhong · Tianyu Wang · John Hopcroft · Di Xie · Shiliang Pu · Liwei Wang · Robert Qiu · Zhenyu Liao
- 2022 Poster: Sampling from Log-Concave Distributions with Infinity-Distance Guarantees
  Oren Mangoubi · Nisheeth Vishnoi
- 2022 Poster: Fair Ranking with Noisy Protected Attributes
  Anay Mehrotra · Nisheeth Vishnoi
- 2022 Poster: Re-Analyze Gauss: Bounds for Private Matrix Approximation via Dyson Brownian Motion
  Oren Mangoubi · Nisheeth Vishnoi
- 2022 Poster: Efficient Submodular Optimization under Noise: Local Search is Robust
  Lingxiao Huang · Yuyi Wang · Chunxue Yang · Huanjian Zhou
- 2022 Poster: Coresets for Vertical Federated Learning: Regularized Linear Regression and $K$-Means Clustering
  Lingxiao Huang · Zhize Li · Jialin Sun · Haoyu Zhao
- 2021 Poster: Fair Classification with Adversarial Perturbations
  L. Elisa Celis · Anay Mehrotra · Nisheeth Vishnoi
- 2021 Poster: Coresets for Time Series Clustering
  Lingxiao Huang · K Sudhir · Nisheeth Vishnoi
- 2019 Poster: Online sampling from log-concave distributions
  Holden Lee · Oren Mangoubi · Nisheeth Vishnoi
- 2019 Poster: Coresets for Clustering with Fairness Constraints
  Lingxiao Huang · Shaofeng Jiang · Nisheeth Vishnoi