Workshops
Oren Anava · Marco Cuturi · Azadeh Khaleghi · Vitaly Kuznetsov · Sasha Rakhlin

[ Room 117 ]

Data in the form of time-dependent sequential observations emerge in many key real-world problems, ranging from biological data, financial markets, and weather forecasting to audio/video processing. However, despite the ubiquity of such data, most mainstream machine learning algorithms have been primarily developed for settings in which sample points are drawn i.i.d. from some (usually unknown) fixed distribution. While there exist algorithms designed to handle non-i.i.d. data, these typically assume a specific parametric form for the data-generating distribution. Such assumptions may undermine the complex nature of modern data, which can possess long-range dependency patterns that we now have the computing power to discern. At the other extreme lie on-line learning algorithms that consider a more general framework without any distributional assumptions. However, by being purely agnostic, common on-line algorithms may not fully exploit the stochastic aspect of time-series data.

Our workshop will build on the success of the first NIPS Time Series Workshop that was held at NIPS 2015. The goal of this workshop is to bring together theoretical and applied researchers interested in the analysis of time series and development of new algorithms to process sequential data. This includes algorithms for time series prediction, classification, clustering, anomaly and change point detection, …

Tomas Mikolov · Baroni Marco · Armand Joulin · Germán Kruszewski · Angeliki Lazaridou · Klemen Simonic

[ Room 212 ]

Recent years have seen the success of machine learning systems, in particular deep learning architectures, on specific challenges such as image classification and playing Go. Nevertheless, machines still fail on hallmarks of human intelligence such as the flexibility to quickly switch between a number of different tasks, the ability to creatively combine previously acquired skills in order to achieve a more complex goal, the capacity to learn a new skill from just a few examples, or the use of communication and interaction to extend one's knowledge in order to accomplish new goals. This workshop aims to stimulate theoretical and practical advances in the development of machines endowed with human-like general-purpose intelligence, focusing in particular on benchmarks to train and evaluate progress in machine intelligence. The workshop will feature invited talks by top researchers from machine learning, AI, cognitive science and NLP, who will discuss with the audience which issues they consider most pressing in developing true AI and which methods best measure genuine progress. We are moreover calling for position statements from interested researchers to complement the workshop program. The workshop will also introduce the new Environment for Communication-Based AI to the research community, …

Adam Lerer · Jiajun Wu · Josh Tenenbaum · Emmanuel Dupoux · Rob Fergus

[ Hilton Diag. Mar, Blrm. C ]

Despite recent progress, AI is still far away from achieving common sense reasoning. One area that is gathering a lot of interest is that of intuitive or naive physics. It concerns the ability that humans and, to a certain extent, infants and animals have to predict outcomes of physical interactions involving macroscopic objects. There is extensive experimental evidence that infants can predict the outcome of events based on physical concepts such as gravity, solidity, object permanence and conservation of shape and number, at an early stage of development, although there is also evidence that this capacity develops through time and experience. Recent work has attempted to build neural models that can make predictions about stability, collisions, forces and velocities from images or videos, or interactions with an environment. Such models could be used both to understand the cognitive and neural underpinnings of naive physics in humans and to provide AI applications with better inference and reasoning abilities.

This workshop will bring together researchers in machine learning, computer vision, robotics, computational neuroscience, and cognitive development to discuss artificial systems that capture or model intuitive physics by learning from footage of, or interactions with, a real or simulated environment. There …

Adish Singla · Rafael Frongillo · Matteo Venanzi

[ Room 120 + 121 ]

Building systems that seamlessly integrate machine learning (ML) and human intelligence can greatly push the frontier of our ability to solve challenging real-world problems. While ML research usually focuses on developing more efficient learning algorithms, it is often the quality and amount of training data that predominantly govern the performance of real-world systems. This is only amplified by the recent popularity of large-scale and complex learning methodologies such as Deep Learning, which can require millions to billions of training instances to perform well. The recent rise of human computation and crowdsourcing approaches, made popular by task-solving platforms like Amazon Mechanical Turk and CrowdFlower, enables us to systematically collect and organize human intelligence. Crowdsourcing research itself is interdisciplinary, combining economics, game theory, cognitive science, and human-computer interaction to create robust and effective mechanisms and tools. The goal of this workshop is to bring crowdsourcing and ML experts together to explore how crowdsourcing can contribute to ML and vice versa. Specifically, we will focus on the design of mechanisms for data collection and ML competitions, and conversely, applications of ML to complex crowdsourcing platforms.

CROWDSOURCING FOR DATA COLLECTION

Crowdsourcing is one of the most popular approaches to data collection for ML, …

Nick Foti · Tamara Broderick · Trevor Campbell · Michael Hughes · Jeffrey Miller · Aaron Schein · Sinead Williamson · Yanxun Xu

[ AC Barcelona Hotel - Barcelona Room ]

In theory, Bayesian nonparametric (BNP) methods are well suited to the large data sets that arise in the sciences, technology, politics, and other applied fields. By making use of infinite-dimensional mathematical structures, BNP methods allow the complexity of a learned model to grow as the size of a data set grows, exhibiting desirable Bayesian regularization properties for small data sets and allowing the practitioner to learn ever more from larger data sets. These properties have resulted in the adoption of BNP methods across a diverse set of application areas---including, but not limited to, biology, neuroscience, the humanities, social sciences, economics, and finance.

In practice, BNP methods present a number of computational and modeling challenges. Recent work has brought a wide range of models to bear on applied problems, going beyond the Dirichlet process and Gaussian process. Meanwhile, advances in accelerated inference are making these models tractable in big data problems.

In this workshop, we will explore new BNP methods for diverse applied problems, including cutting-edge models being developed by application domain experts. We will also discuss the limitations of existing methods and the key problems that need to be solved. A major focus of the workshop will be to expose …

David Lopez-Paz · Leon Bottou · Alec Radford

[ Area 3 ]

In adversarial training, a set of machines learn together by pursuing competing goals. For instance, in Generative Adversarial Networks (GANs, Goodfellow et al., 2014) a generator function learns to synthesize samples that best resemble some dataset, while a discriminator function learns to distinguish between samples drawn from the dataset and samples synthesized by the generator. GANs have emerged as a promising framework for unsupervised learning: GAN generators are able to produce images of unprecedented visual quality, while GAN discriminators learn features with rich semantics that lead to state-of-the-art semi-supervised learning (Radford et al., 2016). From a conceptual perspective, adversarial training is fascinating because it bypasses the need for loss functions in learning, and opens the door to new ways of regularizing (as well as fooling or attacking) learning machines. In this one-day workshop, we invite scientists and practitioners interested in adversarial training to gather, discuss, and establish new research collaborations. The workshop will feature invited talks, a hands-on demo, a panel discussion, and contributed spotlights and posters.
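
For readers who want the generator/discriminator game described above in symbols, it is usually written as the two-player minimax problem below, following Goodfellow et al. (2014). The notation is introduced here for illustration and does not appear in the workshop text: G is the generator, D the discriminator, p_data the data distribution, and p_z the noise distribution fed to the generator.

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The discriminator maximizes V by assigning high probability to real samples and low probability to synthesized ones, while the generator minimizes the same quantity.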

Among the research topics to be addressed by the workshop are

* Novel theoretical insights on adversarial training
* New methods and stability improvements for adversarial optimization
* Adversarial training as a proxy …

Mohammad Rastegari · Matthieu Courbariaux

[ Area 7 + 8 ]

Deep Neural Networks have been revolutionizing several application domains in artificial intelligence: Computer Vision, Speech Recognition and Natural Language Processing. Concurrent with the recent progress in deep learning, significant progress has been happening in virtual reality, augmented reality, and smart wearable devices. These advances create unprecedented opportunities for researchers to tackle fundamental challenges in deploying deep learning systems to portable devices with limited resources (e.g. memory, CPU, energy, bandwidth). Efficient methods in deep learning can have crucial impacts in using distributed systems, embedded devices, and FPGAs for several AI tasks. Achieving these goals calls for ground-breaking innovations on many fronts: learning, optimization, computer architecture, data compression, indexing, and hardware design.

This workshop is sponsored by the Allen Institute for Artificial Intelligence (AI2). We offer partial travel grants and registration for a limited number of workshop participants.

The goal of this workshop is to provide a venue for researchers interested in developing efficient techniques for deep neural networks to present new work, exchange ideas, and build connections. The workshop will feature keynotes and invited talks from prominent researchers as well as a poster session that fosters in-depth discussion. Further, in a panel discussion, experts will discuss the possible approaches …

Tarek R. Besold · Antoine Bordes · Gregory Wayne · Artur Garcez

[ Hilton Diag. Mar, Blrm. B ]

While early work on knowledge representation and inference was primarily symbolic, the corresponding approaches subsequently fell out of favor, and were largely supplanted by connectionist methods. In this workshop, we will work to close the gap between the two paradigms, and aim to formulate a new unified approach that is inspired by our current understanding of human cognitive processing. This is important to help improve our understanding of Neural Information Processing and build better Machine Learning systems, including the integration of learning and reasoning in dynamic knowledge-bases, and reuse of knowledge learned in one application domain in analogous domains.

The workshop brings together established leaders and promising young scientists in the fields of neural computation, logic and artificial intelligence, knowledge representation, natural language understanding, machine learning, cognitive science and computational neuroscience. Invited lectures by senior researchers will be complemented with presentations based on contributed papers reporting recent work (following an open call for papers) and a poster session, giving ample opportunity for participants to interact and discuss the complementary perspectives and emerging approaches.

The workshop targets a single broad theme of general interest to the vast majority of the NIPS community, namely translations between connectionist models and symbolic knowledge representation …

Leila Wehbe · Marcel Van Gerven · Moritz Grosse-Wentrup · Irina Rish · Brian Murphy · Georg Langs · Guillermo Cecchi · Anwar O Nunez-Elizalde

[ Room 114 ]

This workshop explores the interface between cognitive neuroscience and recent advances in AI fields that aim to reproduce human performance such as natural language processing and computer vision, and specifically deep learning approaches to such problems.

When studying the cognitive capabilities of the brain, scientists follow a system identification approach in which they present different stimuli to the subjects and try to model the response that different brain areas have to each stimulus. The goal is to understand the brain by trying to find the function that expresses the activity of brain areas in terms of different properties of the stimulus. Experimental stimuli are becoming increasingly complex, with more and more people interested in studying real-life phenomena such as the perception of natural images or natural sentences. There is therefore a need for a rich and adequate vector representation of the properties of the stimulus, which we can obtain using advances in NLP, computer vision or other relevant ML disciplines.

In parallel, new ML approaches, many of them in deep learning, are inspired to a certain extent by human behavior or biological principles. Neural networks, for example, were originally inspired by biological neurons. More recently, processes such as …

Miroslav Karny · David H Wolpert · David Rios Insua · Tatiana V. Guy

[ Room 127 + 128 ]

The prescriptive (normative) Bayesian theory of decision making under uncertainty has reached a high level of maturity. The assumption that the decision maker is rational (i.e. that they optimize expected utility, in Savage’s formulation) is central to this theory. However, empirical research indicates that this central assumption is often violated by real decision makers. This limits the ability of the prescriptive Bayesian theory to provide a descriptive theory of the real world. One of the reasons proposed for why real decision makers might violate the assumption of rationality is their limited cognitive and computational resources [1]-[5]. This workshop intends to inspect this core assumption and to consider possible ways to modify or complement it.

Many of the precise issues related to this theme, some of which will be addressed in the invited talks, can be formulated as questions:

• Does the concept of rationality require Bayesian reasoning?
• Does quantum probability theory (extending classical Kolmogorov probability) provide novel insights into the relation between decision making and cognition?
• Do the extensions of expected utility (which is a linear function of the relevant probabilities) to nonlinear functions of probabilities enhance the …

Isabelle Guyon · Evelyne Viegas · Balázs Kégl · Ben Hamner · Sergio Escalera

[ Room 129 + 130 ]

Challenges in machine learning and data science are competitions running over several weeks or months to resolve problems using provided datasets or simulated environments. The playful nature of challenges naturally attracts students, making challenges a great teaching resource. For this third edition of the CiML workshop at NIPS we want to explore in more depth the opportunities that challenges offer as teaching tools. The workshop will devote a large part of its time to discussions along several axes: (1) benefits and limitations of challenges to give students problem-solving skills and teach them best practices in machine learning; (2) challenges and continuous education and up-skilling in the enterprise; (3) design issues to make challenges more effective teaching aids; (4) curricula involving students in challenge design as a means of educating them about rigorous experimental design, reproducible research, and project leadership.

CiML is a forum that brings together workshop organizers, platform providers, and participants to discuss best practices in challenge organization and new methods and application opportunities to design high-impact challenges. Following the success of last year's workshop (http://ciml.chalearn.org/), in which a fruitful exchange led to many innovations, we propose to reconvene and discuss new opportunities for challenges in education, one of the hottest …

Vitaly Feldman · Aaditya Ramdas · Aaron Roth · Adam Smith

[ Room 122 + 123 ]

Adaptive data analysis is the increasingly common practice by which insights gathered from data are used to inform further analysis of the same data sets. This is common practice both in machine learning and in scientific research, in which data sets are shared and re-used across multiple studies. Unfortunately, most of the statistical inference theory used in empirical sciences to control false discovery rates, and in machine learning to avoid overfitting, assumes a fixed class of hypotheses to test, or family of functions to optimize over, selected independently of the data. If the set of analyses run is itself a function of the data, much of this theory becomes invalid; indeed, this practice has been blamed as one of the causes of the crisis of reproducibility in empirical science.

Recently, there have been several exciting proposals for how to avoid overfitting and guarantee statistical validity even in general adaptive data analysis settings. The problem is important, and ripe for further advances. The goal of this workshop is to bring together members of different communities (from machine learning, statistics, and theoretical computer science) interested in solving this problem, to share recent results, to discuss promising directions for future research, and to foster collaborations.
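
The failure mode described above can be seen in a small simulation. The sketch below is illustrative only and is not taken from the workshop: the function names, sample sizes, and the simple voting classifier are all assumptions made for the example. It draws pure-noise features and labels, picks the features that happen to correlate best with the labels, and then scores the resulting classifier once on the same data and once on fresh data.

```python
import random


def make_noise_data(n_samples, n_features, rng):
    """Return (X, y) where every entry is an independent fair coin flip in {-1, +1}."""
    X = [[rng.choice((-1, 1)) for _ in range(n_features)] for _ in range(n_samples)]
    y = [rng.choice((-1, 1)) for _ in range(n_samples)]
    return X, y


def select_features(X, y, k):
    """Adaptive step: keep the k features whose agreement with y is strongest on this data."""
    n_features = len(X[0])
    scores = [sum(row[j] * label for row, label in zip(X, y)) for j in range(n_features)]
    top = sorted(range(n_features), key=lambda j: abs(scores[j]), reverse=True)[:k]
    # Remember each feature's sign so that it votes in the direction it agreed with y.
    return [(j, 1 if scores[j] >= 0 else -1) for j in top]


def accuracy(X, y, selected):
    """Fraction of labels recovered by a majority vote over the selected features."""
    correct = 0
    for row, label in zip(X, y):
        vote = sum(sign * row[j] for j, sign in selected)
        prediction = 1 if vote >= 0 else -1  # k is odd below, so the vote is never zero
        correct += prediction == label
    return correct / len(y)


def run_demo(seed=0, n_samples=100, n_features=1000, k=21, n_fresh=500):
    """Return (accuracy on the reused data, accuracy on fresh data) for pure noise."""
    rng = random.Random(seed)
    X, y = make_noise_data(n_samples, n_features, rng)
    selected = select_features(X, y, k)  # the analysis is chosen using the data
    X_new, y_new = make_noise_data(n_fresh, n_features, rng)
    return accuracy(X, y, selected), accuracy(X_new, y_new, selected)


if __name__ == "__main__":
    reused, fresh = run_demo()
    print(f"accuracy on reused data: {reused:.2f}")
    print(f"accuracy on fresh data:  {fresh:.2f}")
```

Because the labels are independent of every feature, no classifier can do better than chance on new data; any apparent skill on the reused data comes entirely from having selected the features with that same data.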

Moustapha Cisse · Manik Varma · Samy Bengio

[ Room 111 ]

Extreme classification, where one needs to deal with multi-class and multi-label problems involving a very large number of labels, has opened up a new research frontier in machine learning. Many challenging applications, such as photo or video annotation, web page categorization, gene function prediction, and language modeling, can benefit from being formulated as supervised learning tasks with millions, or even billions, of labels. Extreme classification can also give a fresh perspective on core learning problems such as ranking and recommendation by reformulating them as multi-class/label tasks where each item to be ranked or recommended is a separate label.

Extreme classification raises a number of interesting research questions including those related to:

* Large scale learning and distributed and parallel training
* Log-time and log-space prediction and prediction on a test-time budget
* Label embedding and tree-based approaches
* Crowdsourcing, preference elicitation and other data gathering techniques
* Bandits, semi-supervised learning and other approaches for dealing with training set biases and label noise
* Bandits with an extremely large number of arms
* Fine-grained classification
* Zero-shot learning and extensible output spaces
* Tackling label polysemy, synonymy and correlations
* Structured output prediction and multi-task learning
* Learning from highly …

Hossein Mobahi · Anima Anandkumar · Percy Liang · Stefanie Jegelka · Anna Choromanska

[ Area 5 + 6 ]

Many machine learning problems require solving nonconvex optimization problems. These include deep learning, Bayesian inference, clustering, and so on. The objective functions in all these instances are highly nonconvex, and it is an open question whether there are provable, polynomial-time algorithms for these problems under realistic assumptions.

A diverse set of approaches has been devised to solve nonconvex problems. They range from simple local search approaches such as gradient descent and alternating minimization to more involved frameworks such as simulated annealing, continuation methods, convex hierarchies, Bayesian optimization, branch and bound, and so on. Moreover, for special classes of nonconvex problems there are efficient methods such as quasi-convex optimization, star-convex optimization, submodular optimization, and matrix/tensor decomposition.

There has been a burst of recent research activity in all these areas. This workshop brings together researchers from these vastly different domains and hopes to create a dialogue among them. In addition to the theoretical frameworks, the workshop will also feature practitioners, especially in the area of deep learning, who are developing new methodologies for training large-scale neural networks. The result will be a cross-fertilization of ideas from diverse areas and schools …

David Silver · Satinder Singh · Pieter Abbeel · Peter Chen

[ Area 1 ]

Although the theory of reinforcement learning addresses an extremely general class of learning problems with a common mathematical formulation, its power has been limited by the need to develop task-specific feature representations. A paradigm shift is occurring as researchers figure out how to use deep neural networks as function approximators in reinforcement learning algorithms; this line of work has yielded remarkable empirical results in recent years. This workshop will bring together researchers working at the intersection of deep learning and reinforcement learning, and it will help researchers with expertise in one of these fields to learn about the other.

Thore Graepel · Marc Lanctot · Joel Leibo · Guy Lever · Janusz Marecki · Frans Oliehoek · Karl Tuyls · Vicky Holgate

[ Room 133 + 134 ]

We live in a multi-agent world, and to be successful in that world, agents, in particular artificially intelligent agents, will need to learn to take into account the agency of others. They will need to compete in marketplaces, cooperate in teams, communicate with others, coordinate their plans, and negotiate outcomes. Examples include self-driving cars interacting in traffic, personal assistants acting on behalf of humans and negotiating with other agents, swarms of unmanned aerial vehicles, financial trading systems, robotic teams, and household robots.

Furthermore, the evolution of human intelligence itself presumably depended on interaction among human agents, possibly starting out with confrontational scavenging [1] and culminating in the evolution of culture, societies, and language. Learning from other agents is a key feature of human intelligence and an important field of research in machine learning [2]. It is therefore conceivable that exposing learning AI agents to multi-agent situations is necessary for their development towards intelligence.

We can also think of multi-agent systems as a design philosophy for complex systems. We can analyse complex systems in terms of agents at multiple scales. For example, we can view the system of world politics as an interaction of nation state agents, nation states …

Tamara Broderick · Stephan Mandt · James McInerney · Dustin Tran · David Blei · Kevin Murphy · Andrew Gelman · Michael I Jordan

[ Room 112 ]

Bayesian analysis has seen a resurgence in machine learning, expanding its scope beyond traditional applications. Increasingly complex models have been trained with large and streaming data sets, and they have been applied to a diverse range of domains. Key to this resurgence have been advances in approximate Bayesian inference. Variational and Monte Carlo methods are currently the mainstay techniques, and recent insights have improved their approximation quality, provided black box strategies for fitting many models, and enabled scalable computation.

In this year's workshop, we would like to continue the theme of approximate Bayesian inference with additional emphases. In particular, we encourage submissions not only advancing approximate inference but also regarding (1) unconventional inference techniques, with the aim to bring together diverse communities; (2) software tools for both the applied and methodological researcher; and (3) challenges in applications, both in non-traditional domains and when applying these techniques to advance current domains.

Uri Shalit · Marzyeh Ghassemi · Jason Fries · Rajesh Ranganath · Theofanis Karaletsos · David Kale · Peter Schulam · Madalina Fiterau

[ Room 116 ]

The last decade has seen unprecedented growth in the availability and size of digital health data, including electronic health records, genetics, and wearable sensors. These rich data sources present opportunities to develop and apply machine learning methods to enable precision medicine. The aim of this workshop is to engender discussion between machine learning and clinical researchers about how statistical learning can enhance both the science and the practice of medicine.

Of particular interest to this year’s workshop is a phrase recently coined by the British Medical Journal, "Big Health Data", where the focus is on modeling and improving health outcomes across large numbers of patients with diverse genetic, phenotypic, and environmental characteristics. The majority of clinical informatics research has focused on narrow populations representing, for example, patients from a single institution or sharing a common disease, and on modeling clinical factors, such as lab test results and treatments. Big health considers large and diverse cohorts, often reaching over 100 million patients in size, as well as environmental factors that are known to impact health outcomes, including socioeconomic status, health care delivery and utilization, and pollution. Big Health Data problems pose a variety of challenges for standard statistical learning, many of …

Kory Mathewson · Kaushik Subramanian · Mark Ho · Robert Loftin · Joseph L Austerweil · Anna Harutyunyan · Doina Precup · Layla El Asri · Matthew Gombolay · Jerry Zhu · Sonia Chernova · Charles Isbell · Patrick M Pilarski · Weng-Keen Wong · Manuela Veloso · Julie A Shah · Matthew Taylor · Brenna Argall · Michael Littman

[ Hilton Diag. Mar, Blrm. A ]

Interactive machine learning (IML) explores how intelligent agents solve a task together, often focusing on adaptable collaboration over the course of sequential decision-making tasks. Past research in the field of IML has investigated how autonomous agents can learn to solve problems more effectively by making use of interactions with humans. Designing and engineering fully autonomous agents is a difficult and sometimes intractable challenge. As such, there is a compelling need for IML algorithms that enable artificial and human agents to collaborate and achieve independent or shared goals. Real-world examples of IML range from web applications such as search engines, recommendation systems and social media personalization, to dialog systems and embodied systems such as industrial robots and household robotic assistants, and to medical robotics (e.g. bionic limbs, assistive devices, and exoskeletons). As intelligent systems become more common in industry and in everyday life, the need for these systems to interact with and learn from the people around them will also increase.

This workshop seeks to bring together experts in the fields of IML, reinforcement learning (RL), human-computer interaction (HCI), robotics, cognitive psychology and the social sciences to share recent advances and explore the future of IML. Some …

Nikhil Rao · Prateek Jain · Hsiang-Fu Yu · Ming Yuan · Francis Bach

[ Area 2 ]

Several applications necessitate learning a very large number of parameters from small amounts of data, which can lead to overfitting, statistically unreliable answers, and large training/prediction costs. A common and effective method to avoid these issues is to restrict the parameter space using specific structural constraints such as sparsity or low rank. However, such simple constraints do not fully exploit the richer structure which is available in several applications and is present in the form of correlations, side information or higher order structure. Designing new structural constraints requires close collaboration between domain experts and machine learning practitioners. Similarly, developing efficient and principled algorithms to learn with such constraints requires further collaboration between experts in diverse areas such as statistics, optimization, and approximation algorithms. This interplay has given rise to a vibrant area of "learning with structure in high dimensions". The goal of this workshop is to bring together the diverse set of people who have worked in these areas and encourage discussions that help define the current frontiers of the area and identify meaningful and challenging problems that require attention.

Alyson Fletcher · Eva Dyer · Jascha Sohl-Dickstein · Joshua T Vogelstein · Konrad Koerding · Jakob H Macke

[ Room 211 ]

The goal of this workshop is to bring together researchers from neuroscience, deep learning, machine learning, computer science theory, and statistics for a rich discussion about how computer science and neuroscience can inform one another as these two fields rapidly move forward. We invite high quality submissions and discussion on topics including, but not limited to, the following fundamental questions: a) shared approaches for analyzing biological and artificial neural systems, b) how insights and challenges from neuroscience can inspire progress in machine learning, and c) methods for interpreting the revolutionary large scale datasets produced by new experimental neuroscience techniques.

Experimental methods for measuring neural activity and structure have undergone recent revolutionary advances, including in high-density recording arrays, population calcium imaging, and large-scale reconstructions of anatomical circuitry. These developments promise unprecedented insights into the collective dynamics of neural populations and thereby the underpinnings of brain-like computation. However, these next-generation methods for measuring the brain’s architecture and function produce high-dimensional, large scale, and complex datasets, raising challenges for analysis. What are the machine learning and analysis approaches that will be indispensable for analyzing these next-generation datasets? What are the computational bottlenecks and challenges that must be overcome?

In parallel to experimental progress …

Li Erran Li · Trevor Darrell

[ Room 124 + 125 ]

Our transportation systems are poised for a transformation as we make progress on autonomous vehicles, vehicle-to-vehicle (V2V) and vehicle-to-everything (V2X) communication infrastructures, and smart road infrastructures such as smart traffic lights. There are many challenges in transforming our current transportation systems into this future vision. For example, how do we achieve near-zero fatalities? How do we optimize efficiency through intelligent traffic management and control of fleets? How do we optimize for traffic capacity during rush hours? To meet these requirements in safety, efficiency, control, and capacity, the systems must be automated with intelligent decision making.

Machine learning will be essential to enable intelligent transportation systems. Machine learning has made rapid progress in self-driving, e.g. real-time perception and prediction of traffic scenes, and has started to be applied to ride-sharing platforms such as Uber (e.g. demand forecasting) and crowd-sourced video scene analysis companies such as Nexar (understanding and avoiding accidents). To address the challenges arising in our future transportation system, such as traffic management and safety, we need to consider the transportation systems as a whole rather than solving problems in isolation. New machine learning solutions are needed as transportation places specific requirements such as extremely low tolerance for uncertainty and …

Borja Balle · Aurélien Bellet · David Evans · Adrià Gascón

[ Room 131 + 132 ]

The workshop focuses on the problem of privacy-preserving machine learning in scenarios where sensitive datasets are distributed across multiple data owners. Such distributed scenarios occur quite often in practice, for example when different parties contribute different records to a dataset, or information about each record in the dataset is held by different owners. Different communities have developed approaches to deal with this problem, including differential privacy-like techniques where noisy sketches are exchanged between the parties, homomorphic encryption where operations are performed on encrypted data, and tailored approaches using techniques from the field of secure multi-party computation. The workshop will serve as a forum to unify different perspectives on this problem and explore the relative merits of each approach. The workshop will also serve as a venue for networking among researchers from the machine learning and secure multi-party computation communities interested in private learning, and will foster fruitful long-term collaborations.

The workshop will have a particular emphasis on the decentralization aspect of privacy-preserving machine learning. This includes a large number of realistic scenarios where the classical setup of differential privacy with a "trusted curator" that prepares the data cannot be directly applied. The problem of privacy-preserving computation gains relevance in this model, and …

Dylan Hadfield-Menell · Adrian Weller · David Duvenaud · Jacob Steinhardt · Percy Liang

[ Room 113 ]

When will a system that has performed well in the past continue to do so in the future? How do we design such systems in the presence of novel and potentially adversarial input distributions? What techniques will let us safely build and deploy autonomous systems on a scale where human monitoring becomes difficult or infeasible? Answering these questions is critical to guaranteeing the safety of emerging high stakes applications of AI, such as self-driving cars and automated surgical assistants. This workshop will bring together researchers in areas such as human-robot interaction, security, causal inference, and multi-agent systems in order to strengthen the field of reliability engineering for machine learning systems. We are interested in approaches that have the potential to provide assurances of reliability, especially as systems scale in autonomy and complexity. We will focus on four aspects — robustness (to adversaries, distributional shift, model mis-specification, corrupted data); awareness (of when a change has occurred, when the model might be mis-calibrated, etc.); adaptation (to new situations or objectives); and monitoring (allowing humans to meaningfully track the state of the system). Together, these will aid us in designing and deploying reliable machine learning systems.

Fisher Yu · Joseph Lim · Matthew D Fisher · Qixing Huang · Jianxiong Xiao

[ Room 115 ]

Deep learning has proven to be a powerful tool for building models for language (one-dimensional) and image (two-dimensional) understanding. Tremendous effort has been devoted to these areas; however, applying deep learning to 3D data is still at an early stage, despite its great research value and broad real-world applications. In particular, existing methods poorly serve the three-dimensional data that drives a broad range of critical applications such as augmented reality, autonomous driving, graphics, robotics, medical imaging, neuroscience, and scientific simulations. These problems have drawn the attention of researchers in different fields such as neuroscience, computer vision and graphics.

Unlike text or images, which can be naturally represented as 1D or 2D arrays, 3D data have multiple candidate representations, such as volumes, polygonal meshes, multi-view renderings, depth maps, and point clouds. Coupled with these representations are the myriad 3D learning problems, such as object recognition, scene layout estimation, compositional structure parsing, novel view synthesis, model completion and hallucination, etc. 3D data open a new and vast research space, which naturally calls for interdisciplinary expertise ranging from Computer Vision and Computer Graphics to Machine Learning.

The goal of this workshop is to foster interdisciplinary communication of researchers working on 3D data (Computer …

Andrew Wilson · Been Kim · William Herlands

[ AC Barcelona, Sagrada Familia ]

Complex machine learning models, such as deep neural networks, have recently achieved great predictive successes for visual object recognition, speech perception, language modelling, and information retrieval. These predictive successes are enabled by automatically learning expressive features from the data. Typically, these learned features are a priori unknown, difficult to engineer by hand, and hard to interpret. This workshop is about interpreting the structure and predictions of these complex models.

Interpreting the learned features and the outputs of complex systems allows us to more fundamentally understand our data and predictions, and to build more effective models. For example, we may build a complex model to predict long-range crime activity. But by interpreting the learned structure of the model, we can gain new insights into the processes driving crime events, enabling us to develop more effective public policy. Moreover, if we learn, for example, that the model is making good predictions by discovering how the geometry of clusters of crime events affects future activity, we can use this knowledge to design even more successful predictive models.

This one-day workshop is focused on interpretable methods for machine learning, with an emphasis on the ability to learn structure which provides new fundamental …

Susannah Odell · Peter Donnelly · Jessica Montgomery · Sabine Hauert · Zoubin Ghahramani · Katherine Gorman

[ VIP Room ]

The Royal Society is currently carrying out a major programme of work on machine learning, to assess its potential over the next 5-10 years, barriers to realising that potential, and the legal, ethical, social and scientific questions which arise as machine learning becomes more pervasive.
As part of this work, the Royal Society has carried out a public dialogue exercise to explore public awareness of, and attitudes towards, machine learning and its applications. The results of this work illustrate some of the key questions people have about machine learning; about why it is used, for what purpose, and with what pattern of benefits and disbenefits. It draws attention to the need to enable informed public debate that engages with specific applications.

In addition, as machine learning is put to use in a range of different applications, it reframes existing social and ethical challenges, such as those relating to privacy and stereotyping, and also creates new challenges, such as interpretability, robustness and human-machine interaction. Many of these form the basis of active and stimulating areas of research, which can both move forward the field of machine learning and help address key governance issues.

The UK’s experience with other emerging technologies shows that …

Elmar Rueckert · Martin Riedmiller

[ VIP Room ]

Workshop webpage: http://www.neurorobotic.eu

Modern robots are complex machines with many compliant actuators and various types of sensors including depth and vision cameras, tactile electrodes and dozens of proprioceptive sensors. The obvious challenges are to process these high dimensional input patterns, memorize low dimensional representations of them and to generate the desired motor commands to interact in dynamically changing environments. Similar challenges exist in brain machine interfaces (BMIs) where complex prostheses with perceptional feedback are controlled, or in motor neuroscience where in addition cognitive features need to be considered. Despite this broad research overlap the developments happened mainly in parallel and were not ported or exploited in the related domains. The main bottleneck for collaborative studies has been a lack of interaction between the core robotics, the machine learning and the neuroscience communities.

Why is it now just the right time for interactions?

- Latest developments based on deep neural networks have advanced the capabilities of robotic systems by learning control policies directly from the high dimensional sensor readings.
- Many variants of networks have been recently developed including the integration of feedback through recurrent connections, the projection to different feature spaces, may be trained at different time scales and can …

Ricardo Silva · John Shawe-Taylor · Adith Swaminathan · Thorsten Joachims

[ Room 133 + 134 ]

One of the promises of Big Data is its potential to answer “what if?” questions in digital, natural and social systems. Whether we speak of genetic interactions in a cell, passengers commuting in railways and roads, recommender systems matching users to ads, or understanding contagion in social networks, such systems are composed of many interacting components that suggest that learning to control them or understanding the effect of shocks to a system is not an easy task. What if some railways are closed, what will passengers do? What if we incentivize a member of a social network to propagate an idea, how influential can they be? What if some genes in a cell are knocked-out, which phenotypes can we expect?

Such questions need to be addressed via a combination of experimental and observational data, and require a careful approach to modelling heterogeneous datasets and structural assumptions concerning the causal relations among the components of the system. The workshop is aimed at bringing together research expertise from a variety of communities in machine learning, statistics, engineering, and the social, medical and natural sciences. It is an opportunity for methods for causal inference, reinforcement learning and game theory to be cross-fertilized with …

[ Room 211 ]

The goal of this workshop is to bring together researchers from neuroscience, deep learning, machine learning, computer science theory, and statistics for a rich discussion about how computer science and neuroscience can inform one another as these two fields rapidly move forward. We invite high quality submissions and discussion on topics including, but not limited to, the following fundamental questions: a) shared approaches for analyzing biological and artificial neural systems, b) how insights and challenges from neuroscience can inspire progress in machine learning, and c) methods for interpreting the revolutionary large scale datasets produced by new experimental neuroscience techniques.

Experimental methods for measuring neural activity and structure have undergone recent revolutionary advances, including in high-density recording arrays, population calcium imaging, and large-scale reconstructions of anatomical circuitry. These developments promise unprecedented insights into the collective dynamics of neural populations and thereby the underpinnings of brain-like computation. However, these next-generation methods for measuring the brain’s architecture and function produce high-dimensional, large scale, and complex datasets, raising challenges for analysis. What are the machine learning and analysis approaches that will be indispensable for analyzing these next-generation datasets? What are the computational bottlenecks and challenges that must be overcome?

In parallel to experimental progress …

Hal Daumé III · Paul Mineiro · Amanda Stent · Jason E Weston

[ Hilton Diag. Mar, Blrm. C ]

Humans conversing naturally with machines is a staple of science fiction. Building agents capable of mutually coordinating their states and actions via communication, in conjunction with human agents, would be one of the greatest engineering feats of human history. In addition to the tremendous economic potential of this technology, the ability to converse appears intimately related to the overall goal of AI.

Although dialogue has been an active area within the linguistics and NLP communities for decades, the wave of optimism in the machine learning community has inspired increased interest from researchers, companies, and foundations. The NLP community has enthusiastically embraced and innovated neural information processing systems, resulting in substantial relevant activity published outside of NIPS. A forum for increased interaction (dialogue!) with these communities at NIPS will accelerate creativity and progress.

We plan to focus on the following issues:

1. How to be data-driven
a. What are tractable and useful intermediate tasks on the path to truly conversant machines? How can we leverage existing benchmark tasks and competitions? What design criteria would we like to see for the next set of benchmark tasks and competitions?
b. How do we assess performance? What can and cannot be done with offline …

Florin Popescu · Sergio Escalera · Xavier Baró · Stephane Ayache · Isabelle Guyon

[ Hilton Diag. Mar, Blrm. B ]

A crucial, high impact application of learning systems is forecasting. While machine learning has already been applied to time series analysis and signal processing, the recent big data revolution allows processing and prediction of vast data flows and forecasting of high dimensional, spatiotemporal series using massive multi-modal streams as predictors. Wider data bandwidths allow machine learning techniques such as connectionist and deep learning methods to assist traditional forecasting methods from fields such as engineering and econometrics, while probabilistic methods are uniquely suited to address the stochastic nature of many processes requiring forecasting.

This workshop will bring together multi-disciplinary researchers from signal processing, statistics, machine learning, computer vision, economics and causality looking to widen their application or methodological scope. It will begin by providing a forum to discuss pressing application areas of forecasting: video compression and understanding, energy and smart grid management, economics and finance, environmental and health policy (e.g. epidemiology), as well as introduce challenging new datasets. A large dataset, created for an industry-driven data competition, will be presented - this dataset not only helps develop and compare new methods for forecasting, but also addresses deeper underlying learning theory questions: do effective learning systems truly infer underlying structure or …

Manohar Paluri · Lorenzo Torresani · Gal Chechik · Dario Garcia · Du Tran

[ Room 111 ]

Computer Vision is a mature field with a long history of academic research, but recent advances in deep learning have provided machine learning models with new capabilities to understand visual content. There have been tremendous improvements on problems like classification, detection, and segmentation, which are basic proxies for the ability of a model to understand the visual content. These are accompanied by a steep rise of Computer Vision adoption in industry at scale, and by more complex tasks such as Image Captioning and Visual Q&A. These go well beyond the classical problems and open the doors to a whole new world of possibilities. As industrial applications mature, the challenges slowly shift towards challenges in data, in scale, and in moving from purely visual data to multi-modal data.

The unprecedented adoption of Computer Vision in numerous real-world applications, which process billions of pieces of "live" media content daily, raises a new set of challenges, including:

1. Efficient Data Collection (Smart sampling, weak annotations, ...)
2. Evaluating performance in the wild (long tails, embarrassing mistakes, calibration)
3. Incremental learning: Evolve systems incrementally in complex environments (new data, new categories, federated architectures ...)
4. Handling tradeoffs: Computation vs Accuracy vs Supervision
5. Outputs are various types (Binary predictions, …

Aparna Lakshmiratan · Li Erran Li · Siddhartha Sen · Sarah Bird · Hussein Mehanna

[ Room 116 ]

A new area is emerging at the intersection of machine learning (ML) and systems design. This birth is driven by the explosive growth of diverse applications of ML in production, the continued growth in data volume, and the complexity of large-scale learning systems. Addressing the challenges in this intersection demands a combination of the right abstractions -- for algorithms, data structures, and interfaces -- as well as scalable systems capable of addressing real world learning problems.

Designing systems for machine learning presents new challenges and opportunities over the design of traditional data processing systems. For example, what is the right abstraction for data consistency in the context of parallel, stochastic learning algorithms? What guarantees of fault tolerance are needed during distributed learning? The statistical nature of machine learning offers an opportunity for more efficient systems but requires revisiting many of the challenges addressed by the systems and database communities over the past few decades. Machine learning focused developments in distributed learning platforms, programming languages, data structures, general purpose GPU programming, and a wide variety of other domains have had and will continue to have a large impact in both academia and industry.

As the relationship between the machine learning and …

Sander M Bohte · Thomas Nowotny · Cristina Savin · Davide Zambrano

[ Room 122 + 123 ]

Despite remarkable computational success, artificial neural networks ignore the spiking nature of neural communication that is fundamental for biological neuronal networks. Understanding how spiking neurons process information and learn remains an essential challenge. It concerns not only neuroscientists studying brain function, but also neuromorphic engineers developing low-power computing architectures, or machine learning researchers devising new biologically-inspired learning algorithms. Unfortunately, despite a joint interest in spike-based computation, the interactions between these subfields remain limited. The workshop aims to bring them together and to foster the exchange between them by focusing on recent developments in efficient neural coding and spiking neurons' computation. The discussion will center around critical questions in the field, such as "what are the underlying paradigms?", "what are the fundamental constraints?", and "what are the measures for progress?", that benefit from varied perspectives. The workshop will combine invited talks reviewing the state-of-the-art and short contributed presentations; it will conclude with a panel discussion.

Maren Mahsereci · Alex Davies · Philipp Hennig

[ Area 2 ]

http://www.probabilistic-numerics.org/meetings/NIPS2016/

Optimization problems in machine learning have aspects that make them more challenging than the traditional settings, like stochasticity, and parameters with side-effects (e.g., the batch size and structure). The field has invented many different approaches to deal with these demands. Unfortunately - and intriguingly - this extra functionality seems to invariably necessitate the introduction of tuning parameters: step sizes, decay rates, cycle lengths, batch sampling distributions, and so on. Such parameters are not present, or at least not as prominent, in classic optimization methods. But getting them right is frequently crucial, and necessitates inconvenient human “babysitting”.

Recent work has increasingly tried to eliminate such fiddle factors, typically by statistical estimation. This also includes automatic selection of external parameters like the batch-size or -structure, which have not traditionally been treated as part of the optimization task. Several different strategies have now been proposed, but they are not always compatible with each other, and lack a common framework that would foster both conceptual and algorithmic interoperability. This workshop aims to provide a forum for the nascent community studying automated parameter tuning in optimization routines.

Among the questions to be addressed by the workshop are:

* Is the prominence of tuning parameters a …

Razvan Pascanu · Mark Ring · Tom Schaul

[ Area 7 + 8 ]

Humans have the extraordinary ability to learn continually from experience. Not only can we apply previously learned knowledge and skills to new situations, we can also use these as the foundation for later learning. One of the grand goals of AI is building an artificial "continual learning" agent that constructs a sophisticated understanding of the world from its own experience, through the autonomous incremental development of ever more complex skills and knowledge.

Hallmarks of continual learning include: interactive, incremental, online learning (learning occurs at every moment, with no fixed tasks or data sets); hierarchy or compositionality (previous learning can become the foundation for later learning); "isolaminar" construction (the same algorithm is used at all stages of learning); resistance to catastrophic forgetting (new learning does not destroy old learning); and unlimited temporal abstraction (both knowledge and skills may refer to or span arbitrary periods of time).

Continual learning is an unsolved problem which presents particular difficulties for the deep-architecture approach that is currently the favored workhorse for many applications. Some strides have been made recently, and many diverse research groups have continual learning on their road map. Hence we believe this is an opportune moment for a workshop focusing on this …

Fabrizio Costa · Thomas Gärtner · Andrea Passerini · Francois Pachet

[ Room 127 + 128 ]

In many real-world applications, machine learning algorithms are employed as a tool in a "constructive process". These processes are similar to the general knowledge-discovery process but have a more specific goal: the construction of one or more domain elements with particular properties. In this workshop we want to bring together domain experts employing machine learning tools in constructive processes and machine learners investigating novel approaches or theories concerning constructive processes as a whole. Interesting applications include but are not limited to: image synthesis, drug and protein design, computational cooking, generation of art (paintings, music, poetry). Interesting approaches include but are not limited to: deep generative learning, active approaches to structured output learning, transfer or multi-task learning of generative models, active search or online optimization over relational domains, and learning with constraints.

Many of the applications of constructive machine learning, including the ones mentioned above, are primarily considered in their respective application domain research area but are hardly present at machine learning conferences. By bringing together domain experts and machine learners working on constructive ML, we hope to bridge this gap between the communities.

John Hershey · Philemon Brakel

[ Hilton Diag. Mar, Blrm. A ]

This workshop focuses on recent advances in end-to-end methods for speech and more general audio processing. Deep learning has transformed the state of the art in speech recognition, and audio analysis in general. In recent developments, new deep learning architectures have made it possible to integrate the entire inference process into an end-to-end system. This involves solving problems of an algorithmic nature, such as search over time alignments between different domains, and dynamic tracking of changing input conditions. Topics include automatic speech recognition systems (ASR) and other audio processing systems that subsume front-end adaptive microphone array processing and source separation as well as back-end constructs such as phonetic context dependency, dynamic time alignment, or phoneme to grapheme modeling. Other end-to-end audio applications include speaker diarization, source separation, and music transcription. A variety of architectures have been proposed for such systems, ranging from shift-invariant convolutional pooling to connectionist temporal classification (CTC) and attention based mechanisms, or other novel dynamic components. However, there has been little comparison yet in the literature of the relative merits of the different approaches. This workshop delves into questions about how different approaches handle various trade-offs in terms of modularity and integration, in terms of representation and …

Suvrit Sra · Francis Bach · Sashank J. Reddi · Niao He

[ Room 112 ]

As the ninth in its series, OPT 2016 builds on the remarkable precedent established by the highly successful series of workshops OPT 2008--OPT 2015, which have been instrumental in bringing the OPT and ML communities closer together.

The previous OPT workshops enjoyed packed to overpacked attendance. This huge interest is no surprise: optimization is the 2nd largest topic at NIPS and is indeed foundational for the wider ML community.

Looking back over the past decade, a strong trend is apparent: The intersection of OPT and ML has grown monotonically to the point that now several cutting-edge advances in optimization arise from the ML community. The distinctive feature of optimization within ML is its departure from textbook approaches, in particular, by having a different set of goals driven by “big-data,” where both models and practical implementation are crucial.

This intimate relation between OPT and ML is the core theme of our workshop. We wish to use OPT 2016 as a platform to foster discussion, discovery, and dissemination of the state-of-the-art in optimization as relevant to machine learning. And even beyond that, as a platform to identify new directions and challenges that will drive future research.

How OPT differs from other related workshops:

Compared …

Charles Sutton · James Geddes · Zoubin Ghahramani · Padhraic Smyth · Chris Williams

[ Room 114 ]

Machine learning methods have been applied beyond their origins in artificial intelligence to a wide variety of data analysis problems in fields such as science, health care, technology, and commerce. Previous research in machine learning, perhaps motivated by its roots in AI, has primarily aimed at fully-automated approaches for prediction problems. But predictive analytics is only one step in the larger pipeline of data science, which includes data wrangling, data cleaning, exploratory visualization, data integration, model criticism and revision, and presentation of results to domain experts.


An emerging strand of work aims to address all of these challenges in one stroke by automating a greater portion of the full data science pipeline. This workshop will bring together experts in machine learning, data mining, databases and statistics to discuss the challenges that arise in the full end-to-end process of collecting data, analysing data, and making decisions, and in building new methods that support, whether in an automated or semi-automated way, more of the full process of analysing real data.


Considering the full process of data science raises interesting questions for discussion, such as: What aspects of data analysis might potentially be automated and what aspects seem more difficult? Statistical model building often …

Gerald Quon · Sara Mostafavi · James Y Zou · Barbara Engelhardt · Oliver Stegle · Nicolo Fusi

[ Room 212 ]

The field of computational biology has seen dramatic growth over the past few years. A wide range of high-throughput technologies developed in the last decade now enable us to measure parts of a biological system at various resolutions: at the genome, epigenome, transcriptome, and proteome levels. These technologies are now being used to collect data for an increasingly diverse set of problems, ranging from classical problems such as predicting differentially regulated genes between time points and predicting subcellular localization of RNA and proteins, to models that explore complex mechanistic hypotheses bridging the gap between genetics and disease, population genetics and transcriptional regulation. Fully realizing the scientific and clinical potential of these data requires developing novel supervised and unsupervised learning methods that are scalable, can accommodate heterogeneity, are robust to systematic noise and confounding factors, and provide mechanistic insights.

The goals of this workshop are to i) present emerging problems and innovative machine learning techniques in computational biology, and ii) generate discussion on how to best model the intricacies of biological data and synthesize and interpret results in light of the current work in the field. We will invite several leaders at the intersection of computational biology and machine learning who will …

Viren Jain · Srinivas C Turaga

[ Room 131 + 132 ]

The "wiring diagram" of essentially all nervous systems remains unknown due to the extreme difficulty of measuring detailed patterns of synaptic connectivity of entire neural circuits. At this point, the major bottleneck is in the analysis of tera- or peta-voxel 3D electron microscopy image data in which neuronal processes need to be traced and synapses localized in order for connectivity information to be inferred. This presents an opportunity for machine learning and machine perception to have a fundamental impact on advances in neurobiology. However, it also presents a major challenge, as existing machine learning methods fall short of solving the problem.
The goal of this workshop is to bring together researchers in machine learning and neuroscience to discuss progress and remaining challenges in this exciting and rapidly growing field. We aim to attract machine learning and computer vision specialists interested in learning about a new problem, as well as computational neuroscientists at NIPS who may be interested in modeling connectivity data. We will discuss the release of public datasets and competitions that may facilitate further activity in this area. We expect the workshop to result in a significant increase in the scope of ideas and people engaged in this field.

Aaditya Ramdas · Arthur Gretton · Bharath Sriperumbudur · Han Liu · John Lafferty · Samory Kpotufe · Zoltán Szabó

[ Room 120 + 121 ]

Large amounts of high-dimensional data are routinely acquired in scientific fields ranging from biology, genomics and health sciences to astronomy and economics due to improvements in engineering and data acquisition techniques. Nonparametric methods allow for better modelling of complex systems underlying data generating processes compared to traditionally used linear and parametric models. From a statistical point of view, scientists have enough data to reliably fit nonparametric models. However, from a computational point of view, nonparametric methods often do not scale well to big data problems.

The aim of this workshop is to bring together practitioners, who are interested in developing and applying nonparametric methods in their domains, and theoreticians, who are interested in providing sound methodology. We hope to effectively communicate advances in the development of computational tools for fitting nonparametric models and to discuss the open challenges that currently prevent the application of nonparametric methods to big data problems.

We encourage submissions on a variety of topics, including but not limited to:
- Randomized procedures for fitting nonparametric models. For example, sketching, random projections, core set selection, etc.
- Nonparametric probabilistic graphical models
- Scalable nonparametric methods
- Multiple kernel learning
- Random feature expansion
- Novel applications of nonparametric methods
- Bayesian nonparametric methods …

Roberto Calandra · Bobak Shahriari · Javier Gonzalez · Frank Hutter · Ryan Adams

[ Room 117 ]

Bayesian optimization has emerged as an exciting subfield of machine learning that is concerned with the global optimization of expensive, noisy, black-box functions using probabilistic methods. Systems implementing Bayesian optimization techniques have been successfully used to solve difficult problems in a diverse set of applications. Many recent advances in the methodologies and theory underlying Bayesian optimization have extended the framework to new applications and provided greater insights into the behaviour of these algorithms. Bayesian optimization is now increasingly being used in industrial settings, providing new and interesting challenges that require new algorithms and theoretical insights.
Classically, Bayesian optimization has been used purely for expensive single-objective black-box optimization. However, with the increased complexity of tasks and applications, this paradigm is proving to be too restricted. Hence, this year’s theme for the workshop will be “black-box optimization and beyond”. Among the recent trends that push beyond BO we can briefly enumerate:
- Adapting BO to not-so-expensive evaluations.
- “Opening the black box”: moving away from viewing the model as simply a way of fitting a response surface, and towards modelling for the purpose of discovering and understanding the underlying process. For instance, this so-called grey-box modelling approach could be valuable in robotic …

Chelsea Finn · Raia Hadsell · David Held · Sergey Levine · Percy Liang

[ Area 3 ]

Deep learning systems that act in and interact with an environment must reason about how actions will change the world around them. The natural regime for such real-world decision problems involves supervision that is weak, delayed, or entirely absent, and the outputs are typically in the context of sequential decision processes, where each decision affects the next input. This regime poses a challenge for deep learning algorithms, which typically excel with: (1) large amounts of strongly supervised data and (2) a stationary distribution of independently observed inputs. The algorithmic tools for tackling these challenges have traditionally come from reinforcement learning, optimal control, and planning, and indeed the intersection of reinforcement learning and deep learning is currently an exciting and active research area. At the same time, deep learning methods for interactive decision-making domains have also been proposed in computer vision, robotics, and natural language processing, often using different tools and algorithmic formalisms from classical reinforcement learning, such as direct supervised learning, imitation learning, and model-based control. The aim of this workshop will be to bring together researchers across these disparate fields. The workshop program will focus on both the algorithmic and theoretical foundations of decision making and interaction with deep …

Matko Bošnjak · Nando de Freitas · Tejas Kulkarni · Arvind Neelakantan · Scott E Reed · Sebastian Riedel · Tim Rocktäschel

[ Room 113 ]

Machine intelligence capable of learning complex procedural behavior, inducing (latent) programs, and reasoning with these programs is a key to solving artificial intelligence. The problems of learning procedural behavior and program induction have been studied from different perspectives in many computer science fields such as program synthesis, probabilistic programming, inductive logic programming, reinforcement learning, and recently in deep learning. However, despite the common goal, there seems to be little communication and collaboration between the different fields focused on this problem.

Recently, there have been a lot of success stories in the deep learning community related to learning neural networks capable of using trainable memory abstractions. This has led to the development of neural networks with differentiable data structures such as Neural Turing Machines, Memory Networks, Neural Stacks, and Hierarchical Attentive Memory, among others. Simultaneously, neural program induction models like Neural Program Interpreters and Neural Programmer have created a lot of excitement in the field, promising induction of algorithmic behavior, and enabling inclusion of programming languages in the processes of execution and induction, while staying end-to-end trainable. Trainable program induction models have the potential to make a substantial impact in many problems involving long-term memory, reasoning, and procedural execution, such as …

Yarin Gal · Christos Louizos · Zoubin Ghahramani · Kevin Murphy · Max Welling

[ Area 1 ]

While deep learning has been revolutionary for machine learning, most modern deep learning models can neither represent their uncertainty nor take advantage of the well-studied tools of probability theory. This has started to change following recent developments of tools and techniques combining Bayesian approaches with deep learning. The intersection of the two fields has received great interest from the community over the past few years, with the introduction of new deep learning models that take advantage of Bayesian techniques, as well as Bayesian models that incorporate deep learning elements.

In fact, the use of Bayesian techniques in deep learning can be traced back to the 1990s, in seminal works by Radford Neal, David MacKay, and Dayan et al. These gave us tools to reason about deep models' confidence, and achieved state-of-the-art performance on many tasks. However, earlier tools did not adapt when new needs arose (such as scalability to big data), and were consequently forgotten. Such ideas are now being revisited in light of new advances in the field, yielding many exciting new results.

This workshop will study the advantages and disadvantages of such ideas, and will be a platform to host the recent flourishing of ideas using Bayesian approaches …

Anima Anandkumar · Rong Ge · Yan Liu · Maximilian Nickel · Qi (Rose) Yu

[ Area 5 + 6 ]

Real-world data in many domains, such as healthcare, social media, and climate science, is multimodal and heterogeneous. Tensors, as generalizations of vectors and matrices, provide a natural and scalable framework for handling data with inherent structures and complex dependencies. The recent renaissance of tensor methods in machine learning ranges from academic research on scalable algorithms for tensor operations and novel models built on tensor representations to industry solutions including Google TensorFlow and the Tensor Processing Unit (TPU). In particular, scalable tensor methods have attracted a considerable amount of attention, with successes in a series of learning tasks, such as learning latent variable models [Anandkumar et al., 2014; Huang et al., 2015; Ge et al., 2015], relational learning [Nickel et al., 2011, 2014, 2016], spatio-temporal forecasting [Yu et al., 2014, 2015, 2016] and training deep neural networks [Alexander et al., 2015].

This progress opens up new directions and problems for tensor methods in machine learning. The workshop aims to foster discussion, discovery, and dissemination of research activities and outcomes in this area, and to encourage breakthroughs. We will bring together theoretical and applied researchers who are interested in tensor analysis and the development of tensor-based algorithms. We will also invite researchers from related areas, such as numerical …

Richard Baraniuk · Jiquan Ngiam · Christoph Studer · Phillip Grimaldi · Andrew Lan

[ Room 129 + 130 ]

In recent years, we have seen a rise in the amount of education data available through the digitization of education. Schools are starting to use technology in classrooms to create personalized learning experiences. Massive open online courses (MOOCs) have attracted millions of learners and present an opportunity for us to apply and develop machine learning methods towards improving student learning outcomes, leveraging the data collected.

However, development in student data analysis remains limited, and education largely follows a one-size-fits-all approach today. We have an opportunity to have a significant impact in revolutionizing the way (human) learning can work.

The goal of this workshop is to foster discussion and spur research between machine learning experts and researchers in education fields that can solve fundamental problems in education.

For this year's workshop, we are highlighting the following areas of interest:

-- Assessments and grading
Assessments are central to adaptive learning, formative learning, and summative evaluation. However, the creation and grading of quality assessments remain difficult tasks for instructors. Machine learning methods can be applied to self-, peer-, and auto-grading paradigms to both improve the quality of assessments and reduce the burden on instructors and students. These methods can also leverage the multimodal …

Alex Wiltschko · Zachary DeVito · Frederic Bastien · Pascal Lamblin

[ Room 115 ]

The calculation of gradients and other forms of derivatives is a core part of machine learning, computer vision, and physical simulation. But the manual creation of derivatives is prone to error and imposes a high "mental overhead" on practitioners in these fields. Yet taking derivatives is in fact a highly mechanical application of the chain rule, and can be carried out using formal techniques such as automatic or symbolic differentiation. A family of "autodiff" approaches exists, each with its own particular strengths and tradeoffs.
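
To illustrate how mechanical the chain rule is, here is a minimal sketch of one member of the autodiff family, forward-mode differentiation with dual numbers. It is not taken from any system discussed at the workshop; the names `Dual`, `sin`, `derivative`, and `f` are hypothetical and chosen only for this example, and only addition, multiplication, and sine are supported.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to a single input."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants, i.e. with zero derivative."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def sin(x):
    # Chain rule: sin(u)' = cos(u) * u'
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def derivative(f, x):
    """Evaluate df/dx at x by seeding the input's derivative with 1."""
    return f(Dual(float(x), 1.0)).deriv


def f(x):
    # Example function (an assumption for illustration): 3x^2 + sin(x)
    return x * x * 3 + sin(x)


if __name__ == "__main__":
    # Analytically, f'(x) = 6x + cos(x)
    print(derivative(f, 2.0), 6 * 2.0 + math.cos(2.0))
```

Each operation propagates a derivative alongside its value, so no symbolic expression is ever built. A forward pass of this kind yields the derivative with respect to one input at a time, which is one reason other members of the family (for instance reverse mode) are usually preferred when a function has many inputs and a single scalar output.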

In the ideal case, automatically generated derivatives should be competitive with manually generated ones and run at near-peak performance on modern hardware. However, the most expressive autodiff systems, which can handle arbitrary, Turing-complete programs, are unsuited for performance-critical applications such as large-scale machine learning or physical simulation. Conversely, the most performant systems are not designed for use outside of their designated application space, e.g. graphics or neural networks. This workshop will bring together developers and researchers of state-of-the-art solutions for generating derivatives automatically, and discuss ways in which these solutions can evolve to be both more expressive and higher-performing. Topics for discussion will include:

- Whether it is feasible to create …

[ VIP Room ]

Workshop webpage: http://www.neurorobotic.eu

Modern robots are complex machines with many compliant actuators and various types of sensors, including depth and vision cameras, tactile electrodes and dozens of proprioceptive sensors. The obvious challenges are to process these high-dimensional input patterns, to memorize low-dimensional representations of them, and to generate the desired motor commands for interacting in dynamically changing environments. Similar challenges exist in brain-machine interfaces (BMIs), where complex prostheses with perceptual feedback are controlled, and in motor neuroscience, where cognitive features need to be considered in addition. Despite this broad research overlap, developments have happened mainly in parallel and have not been ported to or exploited in the related domains. The main bottleneck for collaborative studies has been a lack of interaction between the core robotics, machine learning and neuroscience communities.

Why is now the right time for interaction?

- The latest developments based on deep neural networks have advanced the capabilities of robotic systems by learning control policies directly from high-dimensional sensor readings.
- Many network variants have recently been developed, including networks that integrate feedback through recurrent connections, project to different feature spaces, may be trained at different time scales and can …