Modern machine learning-based approaches to computer vision require very large databases of labeled images. Some contemporary vision systems already require on the order of millions of images for training (e.g., the Omron face detector). While the collection of these large databases is becoming a bottleneck, new Internet-based services that allow labelers from around the world to be easily hired and managed provide a promising solution. However, using these services to label large databases brings new theoretical and practical challenges: (1) the labelers may have wide-ranging levels of expertise that are unknown a priori, and in some cases may be adversarial; (2) images may vary in their level of difficulty; and (3) multiple labels for the same image must be combined to provide an estimate of the actual label of the image. Probabilistic approaches provide a principled way to address these problems. In this paper we present a probabilistic model and use it to simultaneously infer the label of each image, the expertise of each labeler, and the difficulty of each image. On both simulated and real data, we demonstrate that the model outperforms the commonly used "Majority Vote" heuristic for inferring image labels, and is robust to both adversarial and noisy labelers.
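For orientation, the "Majority Vote" baseline named in the abstract can be sketched as follows. This is only a minimal illustration of the heuristic the paper compares against, not the paper's probabilistic model. The function name `majority_vote`, the dictionary input layout, and the tie-breaking rule (smallest label wins) are assumptions made for this example.

```python
from collections import Counter


def majority_vote(labels_by_image):
    """Pick the most frequent label for each image.

    labels_by_image maps an image id to the list of labels that different
    labelers assigned to it (hypothetical layout chosen for this sketch).
    Ties are broken by taking the smallest label, which is an arbitrary
    assumption and not something the abstract specifies.
    """
    consensus = {}
    for image_id, labels in labels_by_image.items():
        counts = Counter(labels)
        top = max(counts.values())
        consensus[image_id] = min(
            label for label, count in counts.items() if count == top
        )
    return consensus


# Toy data: three images with binary labels from several labelers.
votes = {"img1": [1, 1, 0], "img2": [0, 0, 1, 0], "img3": [1, 0]}
consensus = majority_vote(votes)
```

Every labeler counts equally here, so a single adversarial or careless labeler carries as much weight as an expert. That limitation is what the abstract's joint inference of labeler expertise and image difficulty is meant to address.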
Author Information
Jacob Whitehill (University of California, San Diego)
Paul L Ruvolo (UC San Diego)
Ting-fan Wu
Jacob Bergsma (University of California San Diego)
Javier R Movellan (University of California, San Diego)
More from the Same Authors
- 2013 Workshop: Crowdsourcing: Theory, Algorithms and Applications
  Jennifer Wortman Vaughan · Greg Stoddard · Chien-Ju Ho · Adish Singla · Michael Bernstein · Devavrat Shah · Arpita Ghosh · Evgeniy Gabrilovich · Denny Zhou · Nikhil Devanur · Xi Chen · Alexander Ihler · Qiang Liu · Genevieve Patterson · Ashwinkumar Badanidiyuru Varadaraja · Hossein Azari Soufiani · Jacob Whitehill
- 2012 Workshop: Personalizing education with machine learning
  Michael Mozer · Javier R Movellan · Robert Lindsey · Jacob Whitehill
- 2010 Poster: An Alternative to Low-level-Sychrony-Based Methods for Speech Detection
  Paul L Ruvolo · Javier R Movellan
- 2008 Demonstration: Machine Perception for Human Machine Interaction
  Paul L Ruvolo · Marian S Bartlett · Nicholas J Butko · Claudia Lainscsek · Gwendolen C Littlewort · Jacob Whitehill · Tingfan Wu · Javier R Movellan
- 2008 Poster: Optimization on a Budget: A Reinforcement Learning Approach
  Paul L Ruvolo · Ian R Fasel · Javier R Movellan
- 2008 Spotlight: Optimization on a Budget: A Reinforcement Learning Approach
  Paul L Ruvolo · Ian R Fasel · Javier R Movellan