Workshop: OPT 2021: Optimization for Machine Learning

Faking Interpolation Until You Make It

Alasdair Paren · Rudra Poudel · Pawan K Mudigonda


Deep over-parameterized neural networks exhibit the interpolation property on many data sets: they can achieve approximately zero loss on all training samples simultaneously. Recently, this property has been exploited to develop novel optimisation algorithms for this setting. Because the optimal loss value is known in advance under interpolation, these algorithms can employ a variant of the Polyak step size computed on a stochastic batch of data. In this work, we introduce an algorithm that extends this idea to tasks where the interpolation property does not hold. As we no longer have access to the optimal loss values a priori, we instead estimate them for each sample online. To realise this, we introduce a simple but highly effective heuristic for approximating the optimal value based on previous loss evaluations. Through rigorous experimentation we show the effectiveness of our approach, which outperforms adaptive gradient and line search methods.
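
As a rough illustration of the idea described in the abstract, the sketch below runs SGD with a clipped Polyak step size, (loss - target) / ||gradient||^2, on a toy one-parameter regression problem that cannot be interpolated. The names `polyak_step_size`, `fit`, `targets` and `max_lr` are hypothetical, and the rule used to update the per-sample targets (moving each target halfway towards the smallest loss seen so far for that sample) is only a placeholder assumption. It is not the heuristic proposed by the authors, which the abstract does not specify.

```python
def polyak_step_size(loss, grad_sq_norm, target, max_lr=1.0, eps=1e-12):
    """Clipped Polyak step size: (loss - target) / ||grad||^2, kept in [0, max_lr]."""
    gap = max(loss - target, 0.0)  # no step if the loss is already below the target
    return min(max_lr, gap / (grad_sq_norm + eps))


def fit(data, epochs=50, max_lr=1.0):
    """Fit y ~ w * x with per-sample Polyak steps and online target estimates."""
    w = 0.0
    targets = [0.0] * len(data)        # start by assuming interpolation (optimal loss 0)
    best = [float("inf")] * len(data)  # smallest loss observed so far for each sample
    for _ in range(epochs):
        for i, (x, y) in enumerate(data):
            residual = w * x - y
            loss = 0.5 * residual ** 2
            grad = residual * x        # d(loss)/dw for this sample
            best[i] = min(best[i], loss)
            lr = polyak_step_size(loss, grad * grad, targets[i], max_lr)
            w -= lr * grad
        # Placeholder update (an assumption, not the paper's rule):
        # move each target halfway towards the best loss seen for that sample.
        targets = [0.5 * (t + b) for t, b in zip(targets, best)]
    return w, targets


# Two samples that no single w can fit exactly, so interpolation does not hold.
w, targets = fit([(1.0, 1.0), (1.0, 3.0)])
```

With all targets fixed at zero, this reduces to the interpolation-based Polyak step; the per-sample targets are what allow the step size to shrink when a sample's loss cannot be driven to zero.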
