Feature selection is one of the most fundamental problems in machine learning. An extensive body of work on information-theoretic feature selection is based on maximizing the mutual information between subsets of features and class labels. Practical methods are forced to rely on approximations due to the difficulty of estimating mutual information. We demonstrate that the approximations made by existing methods are based on unrealistic assumptions. We formulate a more flexible and general class of assumptions based on variational distributions and use them to tractably generate lower bounds for mutual information. These bounds define a novel information-theoretic framework for feature selection, which we prove to be optimal under tree graphical models with a proper choice of variational distributions. Our experiments demonstrate that the proposed method strongly outperforms existing information-theoretic feature selection approaches.
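The variational lower bounds mentioned in the abstract are of the standard Barber–Agakov form; the following sketch illustrates that general form, with the notation (x for the feature subset, y for the class label, q for the variational distribution) chosen for illustration rather than taken from the paper:

I(\mathbf{x}; y) \;=\; H(y) - H(y \mid \mathbf{x}) \;\ge\; H(y) + \mathbb{E}_{p(\mathbf{x}, y)}\big[\log q(y \mid \mathbf{x})\big]

The inequality holds for any variational distribution q(y | x), with equality when q(y | x) = p(y | x), so restricting q to a tractable family yields a computable lower bound on the mutual information that can be maximized over candidate feature subsets.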
Author Information
Shuyang Gao (University of Southern California)
Greg Ver Steeg (USC Information Sciences Institute)
Aram Galstyan (USC Information Sciences Institute)
More from the Same Authors
- 2020 Workshop: Deep Learning through Information Geometry
  Pratik Chaudhari · Alexander Alemi · Varun Jog · Dhagash Mehta · Frank Nielsen · Stefano Soatto · Greg Ver Steeg
- 2019 Poster: Fast structure learning with modular regularization
  Greg Ver Steeg · Hrayr Harutyunyan · Daniel Moyer · Aram Galstyan
- 2019 Spotlight: Fast structure learning with modular regularization
  Greg Ver Steeg · Hrayr Harutyunyan · Daniel Moyer · Aram Galstyan
- 2019 Poster: Exact Rate-Distortion in Autoencoders via Echo Noise
  Rob Brekelmans · Daniel Moyer · Aram Galstyan · Greg Ver Steeg
- 2018 Poster: Invariant Representations without Adversarial Training
  Daniel Moyer · Shuyang Gao · Rob Brekelmans · Aram Galstyan · Greg Ver Steeg
- 2014 Poster: Discovering Structure in High-Dimensional Data Through Correlation Explanation
  Greg Ver Steeg · Aram Galstyan
- 2011 Poster: Comparative Analysis of Viterbi Training and Maximum Likelihood Estimation for HMMs
  Armen Allahverdyan · Aram Galstyan