NIPS 2016 Tutorials

Deep Reinforcement Learning Through Policy Optimization

Pieter Abbeel · John Schulman

[ Rooms 211 + 212 ]

Deep Reinforcement Learning (Deep RL) has seen several breakthroughs in recent years. In this tutorial we will focus on recent advances in Deep RL through policy gradient methods and actor-critic methods. These methods have shown significant success in a wide range of domains, including continuous-action domains such as manipulation, locomotion, and flight. They have also achieved the state of the art in discrete-action domains such as Atari. Fundamentally, there are two types of gradient calculations: likelihood ratio gradients (aka score function gradients) and path derivative gradients (aka perturbation analysis gradients). We will teach policy gradient methods of each type, connect them with actor-critic methods (which learn both a value function and a policy), and cover a generalized view of the computation of gradients of expectations through Stochastic Computation Graphs.
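As a rough illustration of the two gradient types named above, the following sketch estimates d/dθ E[x²] for x ~ N(θ, 1), whose exact value is 2θ, once with a likelihood ratio (score function) estimator and once with a path derivative estimator. The toy objective, the function names, and the sample sizes are assumptions made for this example and are not taken from the tutorial.

```python
import random

def score_function_grad(theta, n, rng):
    """Likelihood ratio estimate of d/dtheta E[x^2], x ~ N(theta, 1).

    Uses grad = E[f(x) * d/dtheta log p(x; theta)], where for a unit-variance
    Gaussian d/dtheta log p(x; theta) = (x - theta).
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(theta, 1.0)
        total += (x * x) * (x - theta)
    return total / n

def path_derivative_grad(theta, n, rng):
    """Path derivative estimate of the same gradient.

    Writes x = theta + eps with eps ~ N(0, 1) and differentiates f(x) = x^2
    through the sample: d/dtheta (theta + eps)^2 = 2 * (theta + eps).
    """
    total = 0.0
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        total += 2.0 * (theta + eps)
    return total / n

rng = random.Random(0)
theta = 1.5  # hypothetical parameter value; the exact gradient is 2 * theta
sf_estimate = score_function_grad(theta, 200_000, rng)
pd_estimate = path_derivative_grad(theta, 200_000, rng)
print(sf_estimate, pd_estimate)
```

Both estimates should land near 2θ. On this toy problem the path derivative estimator has much lower variance, but it requires the function inside the expectation to be differentiable, whereas the likelihood ratio estimator only needs a differentiable log-density.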

Learning Objectives:
The objective is to provide attendees with a good understanding of the foundations of, as well as recent advances in, policy gradient methods and actor-critic methods. Approaches that will be taught: Likelihood Ratio Policy Gradient (REINFORCE), Natural Policy Gradient, Trust Region Policy Optimization (TRPO), Generalized Advantage Estimation (GAE), Asynchronous Advantage Actor-Critic (A3C), Path Derivative Policy Gradients, (Deep) Deterministic Policy Gradient (DDPG), Stochastic Value Gradients (SVG), Guided Policy Search …

Crowdsourcing: Beyond Label Generation

Jennifer Wortman Vaughan

[ Area 3 ]

Abstract

This tutorial will showcase some of the most innovative uses of crowdsourcing that have emerged in the past few years. While some have clear and immediate benefits to machine learning, we will also discuss examples in which crowdsourcing has allowed researchers to answer exciting questions in psychology, economics, and other fields.

We will discuss best practices for crowdsourcing (such as how and why to maintain a positive relationship with crowdworkers) and available crowdsourcing tools. We will survey recent research examining the effect of incentives on crowdworker performance. Time permitting, we will also touch on recent ethnographic research studying the community of crowdworkers and/or delve into the ethical implications of crowdsourcing.

Despite the inclusion of best practices and tools, this tutorial should not be viewed as a prescriptive guide for applying existing techniques. The goals of the tutorial are to inspire you to find novel ways of using crowdsourcing in your own research and to provide you with the resources you need to avoid common pitfalls when you do.

Target audience: This tutorial is open to anyone who wants to learn more about cutting edge research in crowdsourcing. No assumptions will be made about the audience's familiarity with either crowdsourcing or …

Variational Inference: Foundations and Modern Methods

David Blei · Shakir Mohamed · Rajesh Ranganath

[ Area 1 + 2 ]

Abstract

One of the core problems of modern statistics and machine learning is to approximate difficult-to-compute probability distributions. This problem is especially important in probabilistic modeling, which frames all inference about unknown quantities as a calculation about a conditional distribution. In this tutorial we review and discuss variational inference (VI), a method that approximates probability distributions through optimization. VI has been used in myriad applications in machine learning and tends to be faster than more traditional methods, such as Markov chain Monte Carlo sampling. VI was brought into machine learning in the 1990s, and recent advances and easier implementation have renewed interest in, and application of, this class of methods. This tutorial aims to provide both an introduction to VI with a modern view of the field, and an overview of the role that probabilistic inference plays in many of the central areas of machine learning.
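To make "approximation through optimization" concrete, here is a minimal sketch that fits a Gaussian q(z) = N(m, s²) to an unnormalized Gaussian target by gradient ascent on the evidence lower bound (ELBO). The target, the variational family, the step size, and all function names are assumptions chosen so that the ELBO has a closed form; practical VI usually relies on Monte Carlo gradient estimates instead.

```python
import math

def elbo(m, s, target_mean, target_std):
    """ELBO (up to an additive constant) for q = N(m, s^2).

    The unnormalized target is exp(-(z - target_mean)^2 / (2 * target_std^2)),
    so E_q[log target] = -((m - target_mean)^2 + s^2) / (2 * target_std^2).
    """
    expected_log_target = -((m - target_mean) ** 2 + s ** 2) / (2.0 * target_std ** 2)
    entropy = math.log(s)  # Gaussian entropy with constants dropped
    return expected_log_target + entropy

def fit_gaussian_vi(target_mean, target_std, steps=500, lr=0.05):
    """Gradient ascent on the closed-form ELBO over (m, s)."""
    m, s = 0.0, 1.0  # arbitrary starting point
    for _ in range(steps):
        grad_m = -(m - target_mean) / target_std ** 2
        grad_s = -s / target_std ** 2 + 1.0 / s
        m += lr * grad_m
        s += lr * grad_s
    return m, s

m_fit, s_fit = fit_gaussian_vi(target_mean=2.0, target_std=0.5)
print(m_fit, s_fit)
```

Because the target here is itself Gaussian, the optimum recovers its mean and standard deviation. The fixed step size is tuned to this example and may need to be reduced for narrower targets.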

The tutorial has three parts. First, we provide a broad review of variational inference from several perspectives. This part serves as an introduction (or review) of its central concepts. Second, we develop and connect some of the pivotal tools for VI that have been developed in the last few years, tools like Monte Carlo gradient estimation, black box …

Theory and Algorithms for Forecasting Non-Stationary Time Series

Vitaly Kuznetsov · Mehryar Mohri

[ Rooms 211 + 212 ]

Abstract

Time series appear in a variety of key real-world applications such as signal processing, including audio and video processing; the analysis of natural phenomena such as local weather, global temperature, and earthquakes; the study of economic variables such as stock values, sales amounts, energy demand; and many other areas. But, while time series forecasting is critical for many applications, it has received little attention in the ML community in recent years, probably due to a lack of familiarity with time series and the fact that standard i.i.d. learning concepts and tools are not readily applicable in that scenario.

This tutorial addresses precisely these challenges and many related questions. It provides theoretical and algorithmic tools for research related to time series and for designing new solutions. We first present a concise introduction to time series, including basic concepts, common challenges and standard models. Next, we discuss important statistical learning tools and results developed in recent years and show how they are useful for deriving guarantees and designing algorithms in both stationary and non-stationary scenarios. Finally, we show how the online learning framework can be leveraged to derive algorithms that tackle important and notoriously difficult problems including model selection and ensemble methods. …

Nuts and Bolts of Building Applications using Deep Learning

Andrew Ng

[ Area 1 + 2 ]

Abstract

How do you get deep learning to work in your business, product, or scientific study? The rise of highly scalable deep learning techniques is changing how you can best approach AI problems. This includes how you define your train/dev/test split, how you organize your data, how you should think through your search among promising model architectures, and even how you might develop new AI-enabled products. In this tutorial, you’ll learn about the emerging best practices in this nascent area. You’ll come away better able to organize your own work and your team’s when developing deep learning applications.

Natural Language Processing for Computational Social Science

Cristian Danescu-Niculescu-Mizil · Lillian Lee

[ Area 3 ]

Abstract

More and more of life is now manifested online, and many of the digital traces that are left by human activity are increasingly recorded in natural-language format. This tutorial will examine the opportunities for natural language processing (NLP) to contribute to computational social science, facilitating our understanding of how humans interact with others at both grand and intimate scales.

Learning Objectives:

Influence and persuasion: Can language choices affect whether a political ad is successful, a social-media post gets more re-shares, or a get-out-the-vote campaign will work?
Language as a reflection of social processes: Can we detect status differences, or more broadly, the roles people take in online communities? How does language define collective identity, or signal imminent departure from a community?
Group success: Can language cues help us predict whether a group will cohere or fracture? Or whether a betrayal is forthcoming? Or whether a team will succeed at its task?

Target Audience:

Unrestricted

Generative Adversarial Networks

Ian Goodfellow

[ Area 1 + 2 ]

Abstract

Generative adversarial networks (GANs) are a recently introduced class of generative models, designed to produce realistic samples. This tutorial is intended to be accessible to an audience that has no experience with GANs, and should prepare the audience to make original research contributions applying GANs or improving the core GAN algorithms. GANs are universal approximators of probability distributions. Such models generally have an intractable log-likelihood gradient, and require approximations such as Markov chain Monte Carlo or variational lower bounds to make learning feasible. GANs avoid using either of these classes of approximations. The learning process consists of a game between two adversaries: a generator network that attempts to produce realistic samples, and a discriminator network that attempts to identify whether samples originated from the training data or from the generative model. At the Nash equilibrium of this game, the generator network reproduces the data distribution exactly, and the discriminator network cannot distinguish samples from the model from training data. Both networks can be trained using stochastic gradient descent with exact gradients computed by backpropagation.
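The game described above is commonly written as the following minimax problem. This is the standard formulation from the GAN literature, given here as a reference point; the symbols G (generator), D (discriminator), p_data (data distribution), and p_z (noise prior) are notation assumed for this sketch, and the tutorial itself may present variants.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}}\left[\log\left(1 - D(G(z))\right)\right]
```

At the equilibrium of this objective, the generator's distribution equals p_data and the optimal discriminator outputs 1/2 for every input, which matches the statement that it cannot tell model samples from training data.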

Topics include:
- An introduction to the basics of GANs.
- A review of work applying GANs to large image generation.
- Extending the GAN …

ML Foundations and Methods for Precision Medicine and Healthcare

Suchi Saria · Peter Schulam

[ Area 3 ]

Abstract

Electronic health records and high throughput measurement technologies are changing the practice of healthcare to become more algorithmic and data-driven. This offers an exciting opportunity for machine learning to impact healthcare. A key challenge, however, is the heterogeneity of disease expression across people; a model that works well for one patient may perform very poorly for another. One solution is to build personalized models that blend information from a population and from the current individual to provide tailored inferences.
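One simple way to see how a model can blend population and individual information is the conjugate normal-normal model, sketched below. The model choice, the function name, and the numbers are assumptions for illustration only; the tutorial's own models may be considerably richer.

```python
def personalized_mean(observations, pop_mean, pop_var, noise_var):
    """Posterior mean and variance of one individual's latent level.

    Assumes the individual's level is drawn from N(pop_mean, pop_var) and
    each observation equals that level plus N(0, noise_var) noise.
    """
    n = len(observations)
    prior_precision = 1.0 / pop_var
    data_precision = n / noise_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (
        prior_precision * pop_mean + sum(observations) / noise_var
    )
    return post_mean, post_var

# With no individual data the estimate is the population mean; as
# observations accumulate it moves toward the individual's own average.
print(personalized_mean([], pop_mean=5.0, pop_var=4.0, noise_var=1.0))
print(personalized_mean([9.0, 9.0, 9.0, 9.0], pop_mean=5.0, pop_var=4.0, noise_var=1.0))
```

The posterior mean is a precision-weighted average, so an individual with few measurements is pulled toward the population while one with many measurements is described mostly by their own data.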

This tutorial will discuss ideas from machine learning that enable personalization (useful for applications in education, retail, medicine and recommender systems more broadly). The tutorial will focus on applications in healthcare and medicine. We will cover:

Bayesian hierarchical models
Transfer learning and multi-resolution sharing
Functional data analysis
Causal inference and individualized treatment effects
   1. Potential outcomes
   2. Strategies for adjusting for confounding
   3. Sequential and time-varying treatments
   4. Bayesian estimation of individualized treatment response
"Causal Risk" and What-if Reasoning
Dynamic treatment regimes
   1. Estimating optimal treatment rules
   2. Connections to reinforcement learning

Ultimately, the goal is to build individual-specific decision support tools that enable a data-driven understanding of alternative interventions by answering "what if?" questions: e.g. what would happen if I gave this patient drug A vs. …

Large-Scale Optimization: Beyond Stochastic Gradient Descent and Convexity

Suvrit Sra · Francis Bach

[ Rooms 211 + 212 ]

Abstract

Stochastic optimization lies at the heart of machine learning, and its cornerstone is stochastic gradient descent (SGD), a staple introduced over 60 years ago! Recent years have, however, brought an exciting new development: variance reduction (VR) for stochastic methods. These VR methods excel in settings where more than one pass through the training data is allowed, achieving convergence faster than SGD, in theory as well as practice. These speedups underlie the huge surge of interest in VR methods; by now a large body of work has emerged, while new results appear regularly! This tutorial brings to the wider machine learning audience the key principles behind VR methods, by positioning them vis-à-vis SGD. Moreover, the tutorial takes a step beyond convexity and covers research-edge results for non-convex problems too, while outlining key points and as yet open challenges.
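As one representative of the VR family, the sketch below applies SVRG (stochastic variance reduced gradient) to a one-dimensional least-squares problem. The choice of SVRG, the synthetic data, the step size, and the function name are assumptions made for this example; the abstract does not say which VR methods the tutorial covers.

```python
import random

def svrg_least_squares(a, b, epochs=20, lr=0.1, seed=0):
    """Minimize (1/n) * sum_i 0.5 * (a[i] * w - b[i])**2 over scalar w."""
    rng = random.Random(seed)
    n = len(a)
    w = 0.0
    for _ in range(epochs):
        # Snapshot the iterate and compute the full gradient there
        # (one extra pass over the data per epoch).
        w_snap = w
        full_grad = sum(ai * (ai * w_snap - bi) for ai, bi in zip(a, b)) / n
        for _ in range(2 * n):
            i = rng.randrange(n)
            grad_i = a[i] * (a[i] * w - b[i])
            grad_i_snap = a[i] * (a[i] * w_snap - b[i])
            # Variance-reduced direction: still unbiased for the full
            # gradient, with noise that shrinks as w nears the snapshot.
            w -= lr * (grad_i - grad_i_snap + full_grad)
    return w

# Synthetic data with a true slope of 3.0 (hypothetical).
data_rng = random.Random(1)
a = [data_rng.uniform(0.5, 1.5) for _ in range(50)]
b = [3.0 * ai + data_rng.gauss(0.0, 0.1) for ai in a]

w_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)
w_svrg = svrg_least_squares(a, b)
print(w_svrg, w_star)
```

Plain SGD with a constant step size would keep fluctuating around the least-squares solution. The correction term removes that residual noise, which is why a constant step size suffices here, at the cost of multiple passes over the data.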

Learning Objectives:

– Introduce fast stochastic methods to the wider ML audience to go beyond a 60-year-old algorithm (SGD)
– Provide a guiding light through this fast moving area, to unify, and simplify its presentation, outline common pitfalls, and to demystify its capabilities
– Raise awareness about open challenges in the area, and thereby spur future research

Target Audience:

– Graduate students (masters …