
Random Reshuffling is Not Always Better
Christopher De Sa

Thu Dec 10 07:30 AM -- 07:40 AM (PST) @ Orals & Spotlights: Optimization/Theory

Many learning algorithms, such as stochastic gradient descent, are affected by the order in which training examples are used. It is often observed that sampling the training examples without replacement, also known as random reshuffling, causes learning algorithms to converge faster. We give a counterexample to the Operator Inequality of Noncommutative Arithmetic and Geometric Means, a longstanding conjecture that relates to the performance of random reshuffling in learning algorithms (Recht and Ré, "Toward a noncommutative arithmetic-geometric mean inequality: conjectures, case-studies, and consequences," COLT 2012). We use this to give an example of a learning task and algorithm for which with-replacement random sampling actually outperforms random reshuffling.
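
To make the two sampling schemes named in the abstract concrete, here is a minimal sketch of how each one could generate the order in which training examples are visited. The function names, the per-epoch reshuffle, and the toy sizes are illustrative assumptions and are not taken from the paper; the sketch shows only the orderings, not the counterexample or any convergence result.

```python
import random


def with_replacement_order(n, epochs, rng):
    """Draw every index uniformly at random, independently of earlier draws.

    Within one epoch an example may appear several times or not at all.
    (Hypothetical helper name, for illustration only.)
    """
    return [rng.randrange(n) for _ in range(n * epochs)]


def random_reshuffling_order(n, epochs, rng):
    """Visit every example exactly once per epoch, in a fresh random order.

    This is sampling without replacement within each epoch.
    (Hypothetical helper name, for illustration only.)
    """
    order = []
    for _ in range(epochs):
        perm = list(range(n))
        rng.shuffle(perm)  # new permutation for each epoch
        order.extend(perm)
    return order


# Toy sizes, chosen arbitrarily for the illustration.
n, epochs = 5, 3
rng = random.Random(0)
wr = with_replacement_order(n, epochs, rng)
rr = random_reshuffling_order(n, epochs, rng)

# Split the reshuffled order back into epochs to inspect each pass.
rr_epochs = [rr[i * n:(i + 1) * n] for i in range(epochs)]
```

Either list can be fed to a training loop as the sequence of example indices. The only structural difference is that each entry of `rr_epochs` is a permutation of `0..n-1`, while `wr` carries no such guarantee.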

Author Information

Christopher De Sa (Cornell)
