NeurIPS 2019 Expo Workshop

Dec. 8, 2019


Real world reinforcement learning with Vowpal Wabbit

Sponsor: Microsoft

Abstract:

Reinforcement learning is increasingly used to solve real-world personalization and optimization problems, with online, sample-efficient algorithms such as Contextual Bandits. Companies such as Netflix (https://medium.com/netflix-techblog/artwork-personalization-c589f074ad76) and The New York Times (https://open.nytimes.com/how-the-new-york-times-is-experimenting-with-recommendation-algorithms-562f78624d26) use Contextual Bandits to personalize content and optimize engagement. Microsoft uses Contextual Bandits across multiple deployments and recently released the Personalizer Azure Cognitive Service (http://aka.ms/personalizer), the world's first real-world reinforcement learning service.

Vowpal Wabbit (https://vowpalwabbit.org) is an open-source machine learning library that is used extensively in industry and is the first public terascale learning system (https://arxiv.org/abs/1110.4198). It provides fast, scalable machine learning and has unique capabilities such as learning to search, active learning, contextual memory, and extreme multiclass learning. It focuses on reinforcement learning and provides production-ready implementations of Contextual Bandit algorithms. As a research-to-production vehicle for Microsoft Research, Vowpal Wabbit sees significant innovation.

Come and learn about reinforcement learning, Vowpal Wabbit, and how to apply contextual bandits to problems with Vowpal Wabbit.