Crowdsourcing: Theory, Algorithms and Applications

Workshop

Jennifer Wortman Vaughan · Greg Stoddard · Chien-Ju Ho · Adish Singla · Michael Bernstein · Devavrat Shah · Arpita Ghosh · Evgeniy Gabrilovich · Denny Zhou · Nikhil Devanur · Xi Chen · Alexander Ihler · Qiang Liu · Genevieve Patterson · Ashwinkumar Badanidiyuru Varadaraja · Hossein Azari Soufiani · Jacob Whitehill

Harrah's Tahoe A+B

Mon 9 Dec, 7:30 a.m. PST

Every machine learning system integrates data, which stores human or physical knowledge, with algorithms that discover patterns in that knowledge and make predictions on new instances. Even though most research attention has been focused on developing more efficient learning algorithms, it is the quality and amount of training data that predominantly govern the performance of real-world systems. This is only amplified by the recent popularity of large-scale, complicated learning systems such as deep networks, which require millions to billions of training examples to perform well. Unfortunately, the traditional methods of collecting data from specialized workers are usually expensive and slow. In recent years, however, the situation has changed dramatically with the emergence of crowdsourcing, where huge amounts of labeled data are collected from large groups of (usually online) workers at low or no cost. Many machine learning areas, such as computer vision and natural language processing, increasingly benefit from data collected on crowdsourcing platforms such as Amazon Mechanical Turk and CrowdFlower. Conversely, tools from machine learning, game theory, and mechanism design can help address many challenging problems in crowdsourcing systems, for example by making them more reliable, more efficient, and less expensive.

In this workshop, we call attention back to sources of data, discussing cheap and fast data collection methods based on crowdsourcing and how they could affect subsequent machine learning stages.
Furthermore, we will emphasize how the data sourcing paradigm interacts with the most recent emerging trends in machine learning in the NIPS community.

Examples of topics of potential interest in the workshop include (but are not limited to):

Applications of crowdsourcing to machine learning.

Reliable crowdsourcing, e.g., label aggregation, quality control.

Optimal budget allocation or active learning in crowdsourcing.

Workflow design and answer aggregation for complex tasks (e.g., machine translation, proofreading).

Pricing and incentives in crowdsourcing markets.

Prediction markets / information markets and their connection to learning.

Theoretical analysis of crowdsourcing algorithms, e.g., error rates and sample complexities for label aggregation and budget allocation algorithms.
