Workshop

Learning when test and training inputs have different distributions

Joaquin Quiñonero-Candela ⋅ Masashi Sugiyama ⋅ Anton Schwaighofer ⋅ Neil D Lawrence


Abstract

Many machine learning algorithms assume that the training and test data are drawn from the same distribution. Indeed, many proofs of statistical consistency and related results rely on this assumption. In practice, however, we very often face the situation where the training and test data follow the same conditional distribution, p(y|x), but the input distributions, p(x), differ. For example, principles of experimental design dictate that training data be acquired in a specific manner that may bear little resemblance to the way the test inputs are later generated. The aim of this workshop is to shed light on the kinds of situations in which explicitly addressing the difference in the input distributions is beneficial, and on the most sensible ways of doing so.
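
To make the setting concrete, here is a minimal sketch of one well-known way to address a difference in p(x): reweighting each training example by the density ratio p_test(x) / p_train(x). It is an illustration under stated assumptions, not a method prescribed by the workshop. Both input densities are assumed to be known Gaussians, the shared conditional p(y|x) is assumed to be a noisy quadratic, and the model is a deliberately misspecified straight line. All names (`gaussian_pdf`, `fit_line`, `sample`, the chosen means and standard deviations) are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_line(xs, ys, ws):
    """Weighted least-squares fit of y = slope * x + intercept."""
    total = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / total
    my = sum(w * y for w, y in zip(ws, ys)) / total
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    return slope, my - slope * mx


def mse(xs, ys, slope, intercept):
    """Mean squared error of the line on the given sample."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)

# Assumed input distributions: p_train(x) = N(0, 1), p_test(x) = N(1, 0.5^2).
TRAIN_MEAN, TRAIN_STD = 0.0, 1.0
TEST_MEAN, TEST_STD = 1.0, 0.5


def sample(mean, std, n):
    """Draw inputs from N(mean, std^2); p(y|x) is the same everywhere."""
    xs = [rng.gauss(mean, std) for _ in range(n)]
    ys = [x * x + rng.gauss(0.0, 0.1) for x in xs]  # assumed y = x^2 + noise
    return xs, ys


train_x, train_y = sample(TRAIN_MEAN, TRAIN_STD, 5000)
test_x, test_y = sample(TEST_MEAN, TEST_STD, 5000)

# Importance weights: density ratio p_test(x) / p_train(x) at each training input.
weights = [
    gaussian_pdf(x, TEST_MEAN, TEST_STD) / gaussian_pdf(x, TRAIN_MEAN, TRAIN_STD)
    for x in train_x
]

# Ordinary fit (all weights equal) versus importance-weighted fit.
slope_u, intercept_u = fit_line(train_x, train_y, [1.0] * len(train_x))
slope_w, intercept_w = fit_line(train_x, train_y, weights)

mse_unweighted = mse(test_x, test_y, slope_u, intercept_u)
mse_weighted = mse(test_x, test_y, slope_w, intercept_w)

print("test MSE, unweighted fit:", mse_unweighted)
print("test MSE, importance-weighted fit:", mse_weighted)
```

In this sketch the weighted fit does better on the test inputs only because the line cannot represent the quadratic relationship, so where the training effort is concentrated matters. If the model family contained the true function, the unweighted fit would generally be adequate as well, which is one of the situations the abstract asks about. In practice the density ratio is rarely known and has to be estimated.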
