Poster
Diverse Ensemble Evolution: Curriculum Data-Model Marriage
Tianyi Zhou · Shengjie Wang · Jeff Bilmes

Thu Dec 06 02:00 PM -- 04:00 PM (PST) @ Room 517 AB #110
We study a new method ("Diverse Ensemble Evolution (DivE$^2$)") to train an ensemble of machine learning models that assigns data to models at each training epoch based on each model's current expertise and an intra- and inter-model diversity reward. DivE$^2$ schedules, over the course of training epochs, the relative importance of these characteristics; it starts by selecting easy samples for each model, and then gradually adjusts towards the models having specialized and complementary expertise on subsets of the training data, thereby encouraging high accuracy of the ensemble. We utilize an intra-model diversity term on data assigned to each model, and an inter-model diversity term on data assigned to pairs of models, to penalize both within-model and cross-model redundancy. We formulate the data-model marriage problem as a generalized bipartite matching, represented as submodular maximization subject to two matroid constraints. DivE$^2$ solves a sequence of continuous-combinatorial optimizations with slowly varying objectives and constraints. The combinatorial part handles the data-model marriage while the continuous part updates model parameters based on the assignments. In experiments, DivE$^2$ outperforms other ensemble training methods under a variety of model aggregation techniques, while also maintaining competitive efficiency.
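
To make the combinatorial "data-model marriage" step more concrete, here is a minimal sketch of a greedy assignment under two capacity constraints. It is an illustration under stated assumptions, not the authors' algorithm: the function name `greedy_marriage`, the parameters `k` (maximum models per sample), `p` (maximum samples per model), and `lam`, the reward `loss.max() - loss`, and the facility-location diversity term are all hypothetical choices made for this example. The inter-model diversity term and the continuous model-update step described in the abstract are omitted.

```python
import numpy as np

def greedy_marriage(loss, sim, k=1, p=2, lam=1.0):
    """Greedily pick (sample, model) pairs (toy sketch, all names assumed).

    loss: (n, m) array, loss[i, j] is the loss of model j on sample i.
    sim:  (n, n) nonnegative similarity between samples.
    k:    each sample goes to at most k models (first capacity constraint).
    p:    each model takes at most p samples (second capacity constraint).
    lam:  weight of the intra-model diversity gain.
    """
    n, m = loss.shape
    reward = loss.max() - loss            # low loss -> high reward ("easy" samples)
    assigned = [[] for _ in range(m)]     # samples held by each model
    per_sample = np.zeros(n, dtype=int)   # number of models holding each sample
    cover = np.zeros((m, n))              # facility-location coverage per model
    chosen = set()
    while True:
        best, best_gain = None, -np.inf
        for i in range(n):
            if per_sample[i] >= k:
                continue
            for j in range(m):
                if len(assigned[j]) >= p or (i, j) in chosen:
                    continue
                # Marginal gain of the facility-location diversity for model j.
                div_gain = np.maximum(sim[i], cover[j]).sum() - cover[j].sum()
                gain = reward[i, j] + lam * div_gain
                if gain > best_gain:
                    best, best_gain = (i, j), gain
        if best is None:                  # no feasible pair is left
            break
        i, j = best
        chosen.add(best)
        assigned[j].append(i)
        per_sample[i] += 1
        cover[j] = np.maximum(cover[j], sim[i])
    return assigned

# Toy usage: 4 samples, 2 models, identity similarity (made-up numbers).
loss = np.array([[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2]])
assigned = greedy_marriage(loss, np.eye(4), k=1, p=2)
print(assigned)
```

In this toy setup each model ends up with the samples it currently handles best, subject to the two capacities. A full training loop would alternate such an assignment step with gradient updates of each model on its assigned samples, and would change the weighting between reward and diversity over epochs as the abstract describes.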

#### Author Information

##### Tianyi Zhou (University of Washington, Seattle)

Tianyi Zhou is a sixth-year Ph.D. student in the Paul G. Allen School of Computer Science and Engineering at the University of Washington, Seattle, supervised by Jeff Bilmes and Carlos Guestrin. Before joining UW, he worked with Dacheng Tao at the University of Technology Sydney and Nanyang Technological University for four years. His research covers topics in machine learning, natural language processing, statistics, and data analysis. He has published 30+ papers with 1300+ citations at top conferences and journals, including NeurIPS, ICML, ICLR, AISTATS, NAACL, ACM SIGKDD, IEEE ICDM, AAAI, IJCAI, IEEE ISIT, Machine Learning Journal (Springer), DMKD (Springer), IEEE TIP, and IEEE TNNLS. He is the recipient of the best student paper award at IEEE ICDM 2013.