Skip to yearly menu bar Skip to main content


Poster

Multi-Stage Monte Carlo Approximation for Fast Generalized Data Summations

Michael Holmes · Alexander Gray · Charles Isbell


Abstract: In machine learning we often encounter computational bottlenecks in the form of generalized summations over datasets. When the computational cost of these methods is O(n2) or higher, applicability to large datasets is severely limited. We present a multi-stage Monte Carlo method for approximating such summations with probabilistic relative error control. This method differs from many previous scalability techniques (such as tree-based methods) in that its error is stochastic, but we derive conditions for error control and demonstrate that they work. Further, we give a theoretical sample complexity for the method that is independent of dataset size, and show that this appears to hold in experiments, where speedups reach as high as 1014 on datasets with points numbering in the millions.

Live content is unavailable. Log in and register to view live content