Skip to yearly menu bar Skip to main content


Orthogonal Random Features

Felix Xinnan Yu · Ananda Theertha Suresh · Krzysztof M Choromanski · Daniel Holtmann-Rice · Sanjiv Kumar

Area 5+6+7+8 #184

Keywords: [ Kernel Methods ] [ Large Scale Learning and Big Data ] [ Nonlinear Dimension Reduction and Manifold Learning ]

Abstract: We present an intriguing discovery related to Random Fourier Features: replacing multiplication by a random Gaussian matrix with multiplication by a properly scaled random orthogonal matrix significantly decreases kernel approximation error. We call this technique Orthogonal Random Features (ORF), and provide theoretical and empirical justification for its effectiveness. Motivated by the discovery, we further propose Structured Orthogonal Random Features (SORF), which uses a class of structured discrete orthogonal matrices to speed up the computation. The method reduces the time cost from $\mathcal{O}(d^2)$ to $\mathcal{O}(d \log d)$, where $d$ is the data dimensionality, with almost no compromise in kernel approximation quality compared to ORF. Experiments on several datasets verify the effectiveness of ORF and SORF over the existing methods. We also provide discussions on using the same type of discrete orthogonal structure for a broader range of kernels and applications.

Live content is unavailable. Log in and register to view live content