Timezone: »
Poster
Total Least Squares Regression in Input Sparsity Time
Huaian Diao · Zhao Song · David Woodruff · Xin Yang
Wed Dec 11 10:45 AM  12:45 PM (PST) @ East Exhibition Hall B + C #23
In the total least squares problem, one is given an $m \times n$ matrix $A$, and an $m \times d$ matrix $B$, and one seeks to ``correct'' both $A$ and $B$, obtaining matrices $\hat{A}$ and $\hat{B}$, so that there exists an $X$ satisfying the equation $\hat{A}X = \hat{B}$. Typically the problem is overconstrained, meaning that $m \gg \max(n,d)$. The cost of the solution $\hat{A}, \hat{B}$ is given by $\A\hat{A}\_F^2 + \B  \hat{B}\_F^2$. We give an algorithm for finding a solution $X$ to the linear system $\hat{A}X=\hat{B}$ for which the cost $\A\hat{A}\_F^2 + \B\hat{B}\_F^2$ is at most a multiplicative $(1+\epsilon)$ factor times the optimal cost, up to an additive error $\eta$ that may be an arbitrarily small function of $n$. Importantly, our running time is $\tilde{O}(\nnz(A) + \nnz(B)) + \poly(n/\epsilon) \cdot d$, where for a matrix $C$, $\nnz(C)$ denotes its number of nonzero entries. Importantly, our running time does not directly depend on the large parameter $m$. As total least squares regression is known to be solvable via low rank approximation, a natural approach is to invoke fast algorithms for approximate low rank approximation, obtaining matrices $\hat{A}$ and $\hat{B}$ from this low rank approximation, and then solving for $X$ so that $\hat{A}X = \hat{B}$. However, existing algorithms do not apply since in total least squares the rank of the low rank approximation needs to be $n$, and so the running time of known methods would be at least $mn^2$. In contrast, we are able to achieve a much faster running time for finding $X$ by never explicitly forming the equation $\hat{A} X = \hat{B}$, but instead solving for an $X$ which is a solution to an implicit such equation.
Finally, we generalize our algorithm to the total least squares problem with regularization.
Author Information
Huaian Diao (Northeast Normal University)
Zhao Song (Harvard University & University of Washington)
David Woodruff (Carnegie Mellon University)
Xin Yang (University of Washington)
More from the Same Authors

2022 Poster: Optimal Query Complexities for Dynamic Trace Estimation »
David Woodruff · Fred Zhang · Richard Zhang 
2019 Poster: Tight Dimensionality Reduction for Sketching Low Degree Polynomial Kernels »
Michela Meister · Tamas Sarlos · David Woodruff 
2019 Poster: Average Case Column Subset Selection for Entrywise $\ell_1$Norm Loss »
Zhao Song · David Woodruff · Peilin Zhong 
2019 Poster: Efficient and Thrifty Voting by Any Means Necessary »
Debmalya Mandal · Ariel Procaccia · Nisarg Shah · David Woodruff 
2019 Poster: Regularized Weighted Low Rank Approximation »
Frank Ban · David Woodruff · Richard Zhang 
2019 Oral: Efficient and Thrifty Voting by Any Means Necessary »
Debmalya Mandal · Ariel Procaccia · Nisarg Shah · David Woodruff 
2019 Poster: Optimal Sketching for Kronecker Product Regression and Low Rank Approximation »
Huaian Diao · Rajesh Jayaram · Zhao Song · Wen Sun · David Woodruff 
2019 Poster: Towards a ZeroOne Law for Column Subset Selection »
Zhao Song · David Woodruff · Peilin Zhong 
2018 Poster: Robust Subspace Approximation in a Stream »
Roie Levin · Anish Prasad Sevekari · David Woodruff 
2018 Spotlight: Robust Subspace Approximation in a Stream »
Roie Levin · Anish Prasad Sevekari · David Woodruff 
2018 Poster: On Coresets for Logistic Regression »
Alexander Munteanu · Chris Schwiegelshohn · Christian Sohler · David Woodruff 
2018 Spotlight: On Coresets for Logistic Regression »
Alexander Munteanu · Chris Schwiegelshohn · Christian Sohler · David Woodruff 
2018 Poster: Sublinear Time LowRank Approximation of Distance Matrices »
Ainesh Bakshi · David Woodruff 
2018 Spotlight: Sublinear Time LowRank Approximation of Distance Matrices »
Ainesh Bakshi · David Woodruff