Poster
Multi-Stage Monte Carlo Approximation for Fast Generalized Data Summations
Michael Holmes · Alexander Gray · Charles Isbell
[
Abstract
]
Abstract:
In machine learning we often encounter computational bottlenecks in the form of generalized summations over datasets. When the computational cost of these methods is $O(n^2)$ or higher, applicability to large datasets is severely limited. We present a multi-stage Monte Carlo method for approximating such summations with probabilistic relative error control. This method differs from many previous scalability techniques (such as tree-based methods) in that its error is stochastic, but we derive conditions for error control and demonstrate that they work. Further, we give a theoretical sample complexity for the method that is independent of dataset size, and show that this appears to hold in experiments, where speedups reach as high as $10^{14}$ on datasets with points numbering in the millions.
Live content is unavailable. Log in and register to view live content