

Poster

Measures of distortion for machine learning

Leena Chennuru Vankadara · Ulrike von Luxburg

Room 517 AB #124

Keywords: [ Unsupervised Learning ] [ Nonlinear Dimensionality Reduction and Manifold Learning ]


Abstract: Given data from a general metric space, one of the standard machine learning pipelines is to first embed the data into a Euclidean space and subsequently apply out-of-the-box machine learning algorithms to analyze it. The quality of such an embedding is typically described in terms of a distortion measure. In this paper, we show that many of the existing distortion measures behave in an undesirable way when considered from a machine learning point of view. We investigate desirable properties of distortion measures and formally prove that most of the existing measures fail to satisfy these properties. These theoretical findings are supported by simulations, which, for example, demonstrate that existing distortion measures are not robust to noise or outliers and cannot serve as good indicators of classification accuracy. As an alternative, we suggest a new measure of distortion, called $\sigma$-distortion. We show both in theory and in experiments that it satisfies all desirable properties and is a better candidate to evaluate distortion in the context of machine learning.
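To make the notion of embedding distortion concrete, here is a minimal sketch in the spirit of the abstract: it compares pairwise distances before and after an embedding and reports the variance of the normalized distance ratios. This is only an illustration of the idea, not the paper's exact definition of $\sigma$-distortion; for simplicity the original data is assumed to live in a Euclidean space (the paper's setting allows a general metric, in which case a precomputed distance matrix could be used instead), and the function name `sigma_distortion` is ours.

```python
import numpy as np
from scipy.spatial.distance import pdist


def sigma_distortion(X, Y, eps=1e-12):
    """Variance of normalized pairwise distance ratios between the
    original points X and their embedding Y.

    Illustrative sketch only: we take rho_ij = d_Y(i, j) / d_X(i, j),
    rescale the ratios so their mean is one, and return the mean
    squared deviation from one. The paper gives the precise
    definition of sigma-distortion.
    """
    d_orig = pdist(X)               # pairwise distances in the original space
    d_emb = pdist(Y)                # pairwise distances in the embedding
    rho = d_emb / (d_orig + eps)    # per-pair expansion/contraction ratios
    rho = rho / rho.mean()          # normalize out the global scale
    return np.mean((rho - 1.0) ** 2)


# Example: distortion of a random linear projection to 2 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
P = rng.normal(size=(10, 2)) / np.sqrt(10)
Y = X @ P
print(f"distortion of the projection: {sigma_distortion(X, Y):.4f}")
```

A perfect (scaled) isometry would give a value of zero; larger values indicate that pairwise distances are expanded or contracted unevenly by the embedding.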
