Random forests are learning algorithms that build large collections of random trees and make predictions by averaging the individual tree predictions.
In this paper, we consider various tree constructions and examine how the choice of parameters affects the generalization error of the resulting random forests as the sample size goes to infinity.
We show that subsampling of data points during the tree construction phase is important: Forests can become inconsistent with either no subsampling or too severe subsampling.
As a consequence, even highly randomized trees can lead to inconsistent forests if no subsampling is used, which implies that some of the commonly used setups for random forests can be inconsistent.
As a second consequence, we show that trees that perform well in nearest-neighbor search can be a poor choice for random forests.
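To make the role of subsampling concrete, here is a minimal sketch of a forest that averages purely random trees, each grown on a subsample drawn without replacement. It is an illustration under stated assumptions, not one of the tree constructions analyzed in the paper: the split rule (uniformly random feature and threshold) and the names `build_tree`, `build_forest`, `subsample_size`, and `min_leaf` are all hypothetical choices made for this example.

```python
import random

def build_tree(points, labels, rng, min_leaf=1):
    """Grow a random tree: pick a random feature and a uniformly random threshold."""
    mean = sum(labels) / len(labels)
    if len(points) <= min_leaf:
        return ("leaf", mean)
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    threshold = rng.uniform(lo, hi)
    left = [i for i, p in enumerate(points) if p[dim] <= threshold]
    right = [i for i, p in enumerate(points) if p[dim] > threshold]
    if not left or not right:
        # Degenerate split (e.g. all values equal on this feature): stop here.
        return ("leaf", mean)
    return (
        "node", dim, threshold,
        build_tree([points[i] for i in left], [labels[i] for i in left], rng, min_leaf),
        build_tree([points[i] for i in right], [labels[i] for i in right], rng, min_leaf),
    )

def predict_tree(tree, x):
    """Follow the splits down to a leaf and return the leaf's mean label."""
    while tree[0] == "node":
        _, dim, threshold, left, right = tree
        tree = left if x[dim] <= threshold else right
    return tree[1]

def build_forest(points, labels, n_trees, subsample_size, seed=0):
    """Grow each tree on its own subsample, drawn without replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = rng.sample(range(len(points)), subsample_size)
        forest.append(build_tree([points[i] for i in idx],
                                 [labels[i] for i in idx], rng))
    return forest

def predict_forest(forest, x):
    """The forest prediction is the average of the individual tree predictions."""
    return sum(predict_tree(tree, x) for tree in forest) / len(forest)

# Toy usage on synthetic data (assumed for illustration only).
data_rng = random.Random(1)
points = [(data_rng.random(), data_rng.random()) for _ in range(200)]
labels = [p[0] + p[1] for p in points]
forest = build_forest(points, labels, n_trees=25, subsample_size=50)
prediction = predict_forest(forest, (0.5, 0.5))
```

Setting `subsample_size` equal to the number of data points corresponds to using no subsampling, while a very small value corresponds to severe subsampling; these are the two regimes in which the abstract says forests can become inconsistent.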
Author Information
Cheng Tang (George Washington University)
Damien Garreau (Max Planck Institute)
Ulrike von Luxburg (University of Tübingen)
More from the Same Authors
- 2022 Poster: Interpolation and Regularization for Causal Learning
  Leena Chennuru Vankadara · Luca Rendsburg · Ulrike Luxburg · Debarghya Ghoshdastidar
- 2019 Poster: Foundations of Comparison-Based Hierarchical Clustering
  Debarghya Ghoshdastidar · Michaël Perrot · Ulrike von Luxburg
- 2018 Poster: Measures of distortion for machine learning
  Leena Chennuru Vankadara · Ulrike von Luxburg
- 2018 Poster: Practical Methods for Graph Two-Sample Testing
  Debarghya Ghoshdastidar · Ulrike von Luxburg
- 2017: Ordinal distance comparisons: from topology to geometry
  Ulrike von Luxburg
- 2017 Poster: Kernel functions based on triplet comparisons
  Matthäus Kleindessner · Ulrike von Luxburg
- 2014 Poster: Metric Learning for Temporal Sequence Alignment
  Rémi Lajugie · Damien Garreau · Francis Bach · Sylvain Arlot
- 2013 Poster: Density estimation from unweighted k-nearest neighbor graphs: a roadmap
  Ulrike von Luxburg · Morteza Alamgir
- 2011 Workshop: Relations between machine learning problems - an approach to unify the field
  Robert Williamson · John Langford · Ulrike von Luxburg · Mark Reid · Jennifer Wortman Vaughan
- 2011 Poster: Phase transition in the family of p-resistances
  Morteza Alamgir · Ulrike von Luxburg
- 2011 Spotlight: Phase transition in the family of p-resistances
  Morteza Alamgir · Ulrike von Luxburg
- 2010 Spotlight: Getting lost in space: Large sample analysis of the resistance distance
  Ulrike von Luxburg · Agnes Radl · Matthias Hein
- 2010 Poster: Getting lost in space: Large sample analysis of the resistance distance
  Ulrike von Luxburg · Agnes Radl · Matthias Hein
- 2009 Workshop: Clustering: Science or art? Towards principled approaches
  Margareta Ackerman · Shai Ben-David · Avrim Blum · Isabelle Guyon · Ulrike von Luxburg · Robert Williamson · Reza Zadeh
- 2008 Poster: Influence of graph construction on graph-based clustering measures
  Markus M Maier · Ulrike von Luxburg · Matthias Hein
- 2008 Oral: Influence of graph construction on graph-based clustering measures
  Markus M Maier · Ulrike von Luxburg · Matthias Hein
- 2007 Session: Spotlights
  Ulrike von Luxburg
- 2007 Session: Spotlights
  Ulrike von Luxburg
- 2007 Spotlight: Consistent Minimization of Clustering Objective Functions
  Ulrike von Luxburg · Sebastien Bubeck · Stefanie S Jegelka · Michael Kaufmann
- 2007 Poster: Consistent Minimization of Clustering Objective Functions
  Ulrike von Luxburg · Sebastien Bubeck · Stefanie S Jegelka · Michael Kaufmann