Poster
AUC optimization and the two-sample problem
Stéphan Clémençon · Nicolas Vayatis · Marine Depecker
This paper explores the connection between multivariate homogeneity tests and AUC optimization, a problem that has recently received much attention in the statistical learning literature. Starting from the elementary observation that, in the two-sample setup, the null hypothesis corresponds to the situation where the area under the optimal ROC curve equals 1/2, we propose a two-stage testing method based on data splitting. A nearly optimal scoring function in the AUC sense is first learnt from one of the two half-samples. Data from the remaining half-sample are then projected onto the real line and ranked according to the scoring function computed at the first stage. The last step amounts to performing a standard Mann-Whitney Wilcoxon test in the one-dimensional framework. We show that the learning step of the procedure affects neither the consistency of the test nor its power, provided the ranking produced is accurate enough in the AUC sense. Finally, the results of a numerical experiment are presented to illustrate the efficiency of the method.
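The two-stage procedure described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the scoring function is a simple mean-difference linear scorer standing in for a genuine AUC-optimization step, the p-value uses the usual normal approximation to the Mann-Whitney statistic without tie correction, and all function names (`fit_mean_difference_scorer`, `two_stage_test`, `gaussian_sample`) are hypothetical.

```python
import math
import random


def gaussian_sample(n, d, shift, rng):
    """Draw n points in R^d with i.i.d. N(shift, 1) coordinates (toy data)."""
    return [[rng.gauss(shift, 1.0) for _ in range(d)] for _ in range(n)]


def fit_mean_difference_scorer(xs, ys):
    """Assumed stand-in for the AUC-optimization stage.

    Returns the linear score s(z) = <z, mean(xs) - mean(ys)>.
    """
    d = len(xs[0])
    mean_x = [sum(x[j] for x in xs) / len(xs) for j in range(d)]
    mean_y = [sum(y[j] for y in ys) / len(ys) for j in range(d)]
    w = [a - b for a, b in zip(mean_x, mean_y)]
    return lambda z: sum(wj * zj for wj, zj in zip(w, z))


def mann_whitney_u(a, b):
    """Number of pairs (a_i, b_j) with a_i > b_j; ties count 1/2."""
    return sum(
        1.0 if ai > bj else 0.5 if ai == bj else 0.0
        for ai in a
        for bj in b
    )


def two_stage_test(xs, ys, seed=0):
    """Split each sample in two, learn a scorer, then test on the held-out halves.

    Returns (empirical AUC, z statistic, one-sided p-value).
    """
    rng = random.Random(seed)
    xs, ys = list(xs), list(ys)
    rng.shuffle(xs)
    rng.shuffle(ys)
    hx, hy = len(xs) // 2, len(ys) // 2

    # Stage 1: learn the scoring function on the first half-samples.
    score = fit_mean_difference_scorer(xs[:hx], ys[:hy])

    # Stage 2: project the remaining half-samples onto the real line.
    sx = [score(x) for x in xs[hx:]]
    sy = [score(y) for y in ys[hy:]]
    n, m = len(sx), len(sy)

    # Mann-Whitney Wilcoxon statistic; U / (n * m) is the empirical AUC.
    u = mann_whitney_u(sx, sy)
    auc = u / (n * m)

    # Normal approximation under the null (AUC = 1/2), no tie correction.
    z = (u - n * m / 2.0) / math.sqrt(n * m * (n + m + 1) / 12.0)
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z >= z), one-sided
    return auc, z, p_value


if __name__ == "__main__":
    rng = random.Random(42)
    xs = gaussian_sample(200, 5, 1.0, rng)  # first sample, shifted mean
    ys = gaussian_sample(200, 5, 0.0, rng)  # second sample
    auc, z, p = two_stage_test(xs, ys)
    print(f"AUC={auc:.3f}  z={z:.2f}  p={p:.3g}")
```

The test is one-sided because the scorer is trained to rank the first sample above the second, so departures from the null are expected to show up as an empirical AUC above 1/2. Because the scorer is learnt on data independent of the held-out halves, the rank statistic keeps its usual null distribution conditionally on the first stage.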
Author Information
Stéphan Clémençon (Telecom ParisTech)
Nicolas Vayatis (Ecole Normale Supérieure de Cachan)
Marine Depecker (Renault SA-Telecom ParisTech)
More from the Same Authors
- 2021: Handling Distribution Shift in Tire Design
  Antoine De mathelin · François Deheeger · Mathilde MOUGEOT · Nicolas Vayatis
- 2022: Assessing Performance and Fairness Metrics in Face Recognition - Bootstrap Methods
  Jean-Rémy Conti · Stéphan Clémençon
- 2017 Poster: Ranking Data with Continuous Labels through Oriented Recursive Partitions
  Stéphan Clémençon · Mastane Achab
- 2015 Poster: SGD Algorithms based on Incomplete U-statistics: Large-Scale Minimization of Empirical Risk
  Guillaume Papa · Stéphan Clémençon · Aurélien Bellet
- 2015 Poster: Extending Gossip Algorithms to Distributed Estimation of U-statistics
  Igor Colin · Aurélien Bellet · Joseph Salmon · Stéphan Clémençon
- 2015 Spotlight: Extending Gossip Algorithms to Distributed Estimation of U-statistics
  Igor Colin · Aurélien Bellet · Joseph Salmon · Stéphan Clémençon
- 2014 Poster: Tight Bounds for Influence in Diffusion Networks and Application to Bond Percolation and Epidemiology
  Remi Lemonnier · Kevin Scaman · Nicolas Vayatis
- 2012 Poster: Link Prediction in Graphs with Autoregressive Features
  Emile Richard · Stephane Gaiffas · Nicolas Vayatis
- 2011 Poster: On U-processes and clustering performance
  Stéphan Clémençon
- 2011 Spotlight: On U-processes and clustering performance
  Stéphan Clémençon
- 2010 Poster: Link Discovery using Graph Feature Tracking
  Emile Richard · Nicolas Baskiotis · Theos Evgeniou · Nicolas Vayatis
- 2009 Demonstration: Demonstration of the TreeRank Software
  Marine Depecker
- 2008 Poster: Empirical performance maximization for linear rank statistics
  Stephan Clémençon · Nicolas Vayatis
- 2008 Poster: On Bootstrapping the ROC Curve
  Patrice Bertail · Stephan Clémençon · Nicolas Vayatis
- 2008 Poster: Overlaying classifiers: a practical approach for optimal ranking
  Stephan Clémençon · Nicolas Vayatis