Finding significant combinations of features in the presence of categorical covariates
Laetitia Papaxanthos · Felipe Llinares-López · Dean Bodenham · Karsten Borgwardt

Mon Dec 05 09:00 AM -- 12:30 PM (PST) @ Area 5+6+7+8 #65

In high-dimensional settings, where the number of features p is typically much larger than the number of samples n, methods which can systematically examine arbitrary combinations of features, a huge 2^p-dimensional space, have recently begun to be explored. However, none of the current methods is able to assess the association between feature combinations and a target variable while conditioning on a categorical covariate, in order to correct for potential confounding effects. We propose the Fast Automatic Conditional Search (FACS) algorithm, a significant discriminative itemset mining method which conditions on categorical covariates and only scales as O(k log k), where k is the number of states of the categorical covariate. Based on the Cochran-Mantel-Haenszel Test, FACS demonstrates superior speed and statistical power on simulated and real-world datasets compared to the state of the art, opening the door to numerous applications in biomedicine.
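
For readers unfamiliar with the Cochran-Mantel-Haenszel test named in the abstract, the following minimal sketch computes the textbook CMH statistic for a single feature combination, given one 2x2 contingency table per state of the categorical covariate. It is not the FACS algorithm itself: it performs no search over feature combinations and no pruning. The function name `cmh_test`, the `(a, b, c, d)` table layout, and the omission of a continuity correction are assumptions made for illustration.

```python
import math


def cmh_test(tables):
    """Textbook Cochran-Mantel-Haenszel test over k strata (illustrative sketch).

    tables: iterable of (a, b, c, d) counts, one 2x2 table per covariate state:
        a = pattern present, target positive
        b = pattern present, target negative
        c = pattern absent,  target positive
        d = pattern absent,  target negative
    Returns (statistic, p_value); no continuity correction is applied.
    """
    observed = expected = variance = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            # A stratum with fewer than two samples contributes no information.
            continue
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        observed += a
        # Mean and variance of a under the hypergeometric null within the stratum.
        expected += row1 * col1 / n
        variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if variance == 0:
        # Degenerate margins in every stratum: the test is uninformative.
        return 0.0, 1.0
    stat = (observed - expected) ** 2 / variance
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Two covariate states with the same strong association in each.
    stat, p = cmh_test([(10, 0, 0, 10), (10, 0, 0, 10)])
    print(f"CMH statistic = {stat:.2f}, p-value = {p:.2e}")
```

Because the statistic pools observed-minus-expected counts within each covariate state, an association that arises only from the covariate (for example, a pattern that is simply more common in one subpopulation) does not inflate it the way it would in a single pooled table.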

Author Information

Laetitia Papaxanthos (ETH Zurich)
Felipe Llinares-López (ETH Zurich)
Dean Bodenham (ETH Zurich)
Karsten Borgwardt (ETH Zurich)

Karsten Borgwardt is Professor of Data Mining at ETH Zürich, at the Department of Biosystems located in Basel. His work has won several awards, including the NIPS 2009 Outstanding Paper Award, the Krupp Award for Young Professors 2013 and a Starting Grant 2014 from the ERC-backup scheme of the Swiss National Science Foundation. Since 2013, he has headed the Marie Curie Initial Training Network for "Machine Learning for Personalized Medicine" with 12 partner labs in 8 countries (http://www.mlpm.eu). The business magazine "Capital" listed him as one of the "Top 40 under 40" in Science in/from Germany in 2014, 2015 and 2016. For more information, visit: https://www.bsse.ethz.ch/mlcb

