

Learning Energy Networks with Generalized Fenchel-Young Losses

Mathieu Blondel · Felipe Llinares-Lopez · Robert Dadashi · Leonard Hussenot · Matthieu Geist

Hall J (level 1) #433

Keywords: [ Structured Prediction ] [ convex analysis ] [ energy networks ] [ Fenchel conjugates ] [ EBMs ]


Energy-based models, a.k.a. energy networks, perform inference by optimizing an energy function, typically parametrized by a neural network. This allows one to capture potentially complex relationships between inputs and outputs. To learn the parameters of the energy function, the solution to that optimization problem is typically fed into a loss function. The key challenge for training energy networks lies in computing loss gradients, as this typically requires argmin/argmax differentiation. In this paper, building upon a generalized notion of conjugate function, which replaces the usual bilinear pairing with a general energy function, we propose generalized Fenchel-Young losses, a natural loss construction for learning energy networks. Our losses enjoy many desirable properties and their gradients can be computed efficiently without argmin/argmax differentiation. We also prove the calibration of their excess risk in the case of linear-concave energies. We demonstrate our losses on multilabel classification and imitation learning tasks.
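The abstract gives no formulas, so the following is only a minimal sketch of the idea under stated assumptions, not the authors' implementation. It assumes the generalized conjugate takes the form Omega^Phi(v) = max_p [Phi(v, p) - Omega(p)] and the loss the form L(v, y) = Omega^Phi(v) + Omega(y) - Phi(v, y). The toy energy `energy`, the regularizer `omega` (half squared norm), the box constraint [0, 1]^k, and the projected-gradient solver `argmax_p` are all hypothetical choices made for illustration. The point it shows is that the gradient with respect to v only needs the energy gradient evaluated at the inner solution (an envelope-theorem argument), so the solver itself is never differentiated.

```python
# Sketch under assumptions: toy energy, regularizer, and solver are illustrative,
# not taken from the paper.

def energy(v, p, gamma=0.5):
    # Hypothetical energy: bilinear term minus a concave coupling of outputs 0 and 1.
    return sum(vi * pi for vi, pi in zip(v, p)) - 0.5 * gamma * (p[0] - p[1]) ** 2

def energy_grad_p(v, p, gamma=0.5):
    # Gradient of the toy energy with respect to the output p.
    g = list(v)
    d = gamma * (p[0] - p[1])
    g[0] -= d
    g[1] += d
    return g

def energy_grad_v(v, p):
    # The toy energy is linear in v, so its gradient in v is simply p.
    return list(p)

def omega(p):
    # Assumed regularizer: half squared Euclidean norm.
    return 0.5 * sum(pi * pi for pi in p)

def argmax_p(v, gamma=0.5, steps=200):
    # Inner problem: maximize energy(v, p) - omega(p) over the box [0, 1]^k
    # by projected gradient ascent with step 1/L, where L = 1 + 2 * gamma.
    step = 1.0 / (1.0 + 2.0 * gamma)
    p = [0.5] * len(v)
    for _ in range(steps):
        g = energy_grad_p(v, p, gamma)
        p = [min(1.0, max(0.0, pi + step * (gi - pi))) for pi, gi in zip(p, g)]
    return p

def gfy_loss(v, y, gamma=0.5):
    # Assumed loss form: conjugate(v) + omega(y) - energy(v, y).
    p_star = argmax_p(v, gamma)
    conjugate = energy(v, p_star, gamma) - omega(p_star)
    return conjugate + omega(y) - energy(v, y, gamma)

def gfy_grad_v(v, y, gamma=0.5):
    # Envelope-theorem gradient: p_star is treated as a constant,
    # so no differentiation through argmax_p is required.
    p_star = argmax_p(v, gamma)
    at_star = energy_grad_v(v, p_star)
    at_target = energy_grad_v(v, y)
    return [a - b for a, b in zip(at_star, at_target)]

if __name__ == "__main__":
    v = [1.2, -0.3, 0.4]   # hypothetical model output
    y = [1.0, 0.0, 1.0]    # hypothetical multilabel target
    print("loss:", gfy_loss(v, y))
    print("grad wrt v:", gfy_grad_v(v, y))
```

In this sketch the loss is non-negative by construction, because the target y is feasible for the inner maximization, and it vanishes when y coincides with the inner solution. In an actual energy network v would come from a neural network and the gradient above would be backpropagated through it.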
