Workshops
Vitaly Kuznetsov · Oren Anava · Scott Yang · Azadeh Khaleghi

[ Grand Ballroom A ]

Data in the form of time-dependent sequential observations emerge in many key real-world problems, ranging from biological data, financial markets, and weather forecasting to audio/video processing. However, despite the ubiquity of such data, most mainstream machine learning algorithms have been primarily developed for settings in which sample points are drawn i.i.d. from some (usually unknown) fixed distribution. While there exist algorithms designed to handle non-i.i.d. data, these typically assume a specific parametric form for the data-generating distribution. Such assumptions may undermine the complex nature of modern data, which can possess long-range dependency patterns that we now have the computing power to discern. At the other extreme lie on-line learning algorithms that consider a more general framework without any distributional assumptions. However, by being purely agnostic, common on-line algorithms may not fully exploit the stochastic aspect of time-series data.

This is the third instalment of the time series workshop at NIPS and will build on the success of the previous events: NIPS 2015 Time Series Workshop and NIPS 2016 Time Series Workshop.

The goal of this workshop is to bring together theoretical and applied researchers interested in the analysis of time series and development of new algorithms to process sequential data. This includes …

Florence d'Alché-Buc · Krikamol Muandet · Bharath Sriperumbudur · Zoltán Szabó

[ Hyatt Hotel, Regency Ballroom D+E+F+H ]

The increased variability of acquired data has recently pushed the field of machine learning to extend its scope to non-standard data including, for example, functional (Ferraty & Vieu, 2006; Wang et al., 2015), distributional (Póczos et al., 2013), graph, or topological data (Carlsson, 2009; Vitaliy). Successful applications span a wide range of disciplines such as healthcare (Zhou et al., 2013), action recognition from iPod/iPhone accelerometer data (Sun et al., 2013), causal inference (Lopez-Paz et al., 2015), bioinformatics (Kondor & Pan, 2016; Kusano et al., 2016), cosmology (Ravanbakhsh et al., 2016; Law et al., 2017), acoustic-to-articulatory speech inversion (Kadri et al., 2016), network inference (Brouard et al., 2016), climate research (Szabó et al., 2016), and ecological inference (Flaxman et al., 2015).

Leveraging the underlying structure of these non-standard data types often leads to a significant boost in prediction accuracy and inference performance. In order to achieve these compelling improvements, however, numerous challenges and questions have to be addressed: (i) choosing an adequate representation of the data, (ii) constructing appropriate similarity measures (inner product, norm or metric) on these representations, (iii) efficiently exploiting their intrinsic structure such as multi-scale nature or invariances, (iv) designing affordable computational schemes (relying, e.g., on surrogate losses), …

Nika Haghtalab · Yishay Mansour · Tim Roughgarden · Vasilis Syrgkanis · Jennifer Wortman Vaughan

[ 101 A ]

Machine learning is primarily concerned with the design and analysis of algorithms that learn about an entity. Increasingly, machine learning is being used to design policies that affect the entity it once learned about. This can cause the entity to react and present a different behavior. Ignoring such interactions could lead to solutions that are ultimately ineffective in practice. For example, to design an effective ad display one has to take into account how a viewer would react to the displayed advertisements, for instance by choosing to scroll through or click on them. Additionally, in many environments, multiple learners learn concurrently about one or more related entities. This can bring about a range of interactions between individual learners. For example, multiple firms may compete or collaborate on performing market research. How do the learners and entities interact? How do these interactions change the task at hand? What are some desirable interactions in a learning environment? And what are the mechanisms for bringing about such desirable interactions? These are some of the questions we would like to explore in this workshop.

Traditionally, learning theory has adopted two extreme views in this respect: First, when learning occurs in isolation from …

Atilim Gunes Baydin · Mr. Prabhat · Kyle Cranmer · Frank Wood

[ 104 C ]

Physical sciences span problems and challenges at all scales in the universe: from finding exoplanets and asteroids in trillions of sky-survey pixels, to automatic tracking of extreme weather phenomena in climate datasets, to detecting anomalies in event streams from the Large Hadron Collider at CERN. Tackling a number of associated data-intensive tasks, including, but not limited to, regression, classification, clustering, dimensionality reduction, likelihood-free inference, generative models, and experimental design, is critical for furthering scientific discovery. The Deep Learning for Physical Sciences (DLPS) workshop invites researchers to contribute papers that demonstrate progress in the application of machine and deep learning techniques to real-world problems in physical sciences (including the fields and subfields of astronomy, chemistry, Earth science, and physics).

We will discuss research questions, practical implementation challenges, performance / scaling, and unique aspects of processing and analyzing scientific datasets. The target audience comprises members of the machine learning community who are interested in scientific applications and researchers in the physical sciences. By bringing together these two communities, we expect to strengthen dialogue, introduce exciting new open problems to the wider NIPS community, and stimulate production of new approaches to solving science problems. Invited talks from leading individuals from both communities will …

Alessandra Tosi · Alfredo Vellido · Mauricio Álvarez

[ 204 ]

The use of machine learning has become pervasive in our society, from specialized scientific data analysis to industry intelligence and practical applications with a direct impact in the public domain. This impact involves different social issues including privacy, ethics, liability and accountability. This workshop aims to discuss the use of machine learning in safety-critical environments, with special emphasis on three main application domains:
- Healthcare
- Autonomous systems
- Complainants and liability in data-driven industries
We aim to answer some of these questions: How do we make our models more comprehensible and transparent? Shall we always trust our decision-making process? How do we involve field experts in the process of making machine learning pipelines more practically interpretable from the viewpoint of the application domain?

Yaron Singer · Jeff A Bilmes · Andreas Krause · Stefanie Jegelka · Amin Karbasi

[ 203 ]

Traditionally, machine learning has been focused on methods where objects reside in continuous domains. The goal of this workshop is to advance state-of-the-art methods in machine learning that involve discrete structures.

Models with ultimately discrete solutions play an important role in machine learning. At its core, statistical machine learning is concerned with making inferences from data, and when the underlying variables of the data are discrete, both model inference and prediction using the inferred model are inherently discrete algorithmic problems. Many of these problems are notoriously hard, and even those that are theoretically tractable become intractable in practice with abundant and steadily increasing amounts of data. As a result, standard theoretical models and off-the-shelf algorithms become either impractical or intractable (and in some cases both).

While many problems are hard in the worst case, the problems of practical interest are often much more well-behaved, and have the potential to be modeled in ways that make them tractable. Indeed, many discrete problems in machine learning can possess beneficial structure; such structure has been an important ingredient in many successful (approximate) solution strategies. Examples include submodularity, marginal polytopes, symmetries and exchangeability.

Machine learning, algorithms, discrete mathematics and …

Gautam Dasarathy · Mladen Kolar · Richard Baraniuk

[ 102 A+B ]

Whether it is biological networks of proteins and genes or technological ones like sensor networks and the Internet, we are surrounded today by complex systems composed of entities interacting with and affecting each other. An urgent need has therefore emerged for developing novel techniques for modeling, learning, and conducting inference in such networked systems. Consequently, we have seen progress from a variety of disciplines in both fundamental methodology and in applications of such methods to practical problems. However, much work remains to be done, and a unifying and principled framework for dealing with these problems remains elusive. This workshop aims to bring together theoreticians and practitioners in order both to chart out recent advances and to discuss new directions in understanding interactions in large and complex systems. NIPS, with its attendance by a broad and cross-disciplinary set of researchers, offers the ideal venue for this exchange of ideas.

The workshop will feature a mix of contributed talks, contributed posters, and invited talks by leading researchers from diverse backgrounds working in these areas. We will also have a specific segment of the schedule reserved for the presentation of open problems, and will have plenty of time for discussions where we will …

Jay Pujara · Dor Arad · Bhavana Dalvi Mishra · Tim Rocktäschel

[ 102 C ]

Extracting knowledge from text, images, audio, and video and translating these extractions into a coherent, structured knowledge base (KB) is a task that spans the areas of machine learning, natural language processing, computer vision, databases, search, data mining and artificial intelligence. Over the past two decades, machine learning techniques used for information extraction, graph construction, and automated knowledge base construction have evolved from simple rule learning to end-to-end neural architectures, with papers on the topic consistently appearing at NIPS. Hence, we believe this workshop will appeal to NIPS attendees and be a valuable contribution.

Furthermore, there has been significant interest and investment in knowledge base construction in both academia and industry in recent years. Most major internet companies and many startups have developed knowledge bases that power digital assistants (e.g. Siri, Alexa, Google Now) or provide the foundations for search and discovery applications. A similarly abundant set of knowledge systems has been developed at top universities such as Stanford (DeepDive), Carnegie Mellon (NELL), the University of Washington (OpenIE), the University of Mannheim (DBpedia), and the Max Planck Institut Informatik (YAGO, WebChild), among others. Our workshop serves as a forum for researchers working on knowledge base construction in both academia and …

Hendrik Purwins · Bob L. Sturm · Mark Plumbley

[ 201 A ]

Abstracts and full papers: http://media.aau.dk/smc/ml4audio/

Audio signal processing is currently undergoing a paradigm change, where data-driven machine learning is replacing hand-crafted feature design. This has led some to ask whether audio signal processing is still useful in the "era of machine learning." There are many challenges, new and old, including the interpretation of learned models in high dimensional spaces, problems associated with data-poor domains, adversarial examples, high computational requirements, and research driven by companies using large in-house datasets that is ultimately not reproducible.

ML4Audio aims to promote progress, systematization, understanding, and convergence of applying machine learning in the area of audio signal processing. Specifically, we are interested in work that demonstrates novel applications of machine learning techniques to audio data, as well as methodological considerations of merging machine learning with audio signal processing. We seek contributions in, but not limited to, the following topics:
- audio information retrieval using machine learning;
- audio synthesis with given contextual or musical constraints using machine learning;
- audio source separation using machine learning;
- audio transformations (e.g., sound morphing, style transfer) using machine learning;
- unsupervised learning, online learning, one-shot learning, reinforcement learning, and incremental learning for audio;
- applications/optimization of generative adversarial …

Ramin Hasani · Manuel Zimmer · Stephen Larson · Tomas Kazmar · Radu Grosu

[ S5 ]

A fundamental challenge in neuroscience is to understand the elemental computations and algorithms by which brains perform information processing. This is of great significance to biologists, as well as to engineers and computer scientists who aim at developing energy-efficient and intelligent solutions for the next generation of computers and autonomous devices. The benefits of collaborations between these fields are reciprocal, as brain-inspired computational algorithms and devices not only advance engineering, but also assist neuroscientists by conforming their models and making novel predictions. A large impediment toward such an efficient interaction is still the complexity of brains. We thus propose that the study of small model organisms should pioneer these efforts.

The nematode worm C. elegans provides a ready experimental system for reverse-engineering the nervous system, being one of the best studied animals in the life sciences. The neural connectome of C. elegans has been known for 30 years, providing the structural basis for building models of its neural information processing. Despite its small size, C. elegans exhibits complex behaviors, such as locating food and mating partners and navigating its environment by integrating a plethora of environmental cues. Over the past years, the field has made enormous progress in understanding …

Francisco Ruiz · Stephan Mandt · Cheng Zhang · James McInerney · Dustin Tran · David Blei · Max Welling · Tamara Broderick · Michalis Titsias

[ Seaside Ballroom ]

Approximate inference is key to modern probabilistic modeling. Thanks to the availability of big data, significant computational power, and sophisticated models, machine learning has achieved many breakthroughs in multiple application domains. At the same time, approximate inference becomes critical since exact inference is intractable for most models of interest. Within the field of approximate Bayesian inference, variational and Monte Carlo methods are currently the mainstay techniques. For both methods, there has been considerable progress in both efficiency and performance.

In this workshop, we encourage submissions advancing approximate inference methods. We are open to a broad scope of methods within the field of Bayesian inference. In addition, we also encourage applications of approximate inference in many domains, such as computational biology, recommender systems, differential privacy, and industry applications.

Marina Meila · Frederic Chazal · Yu-Chia Chen

[ 102 C ]

This two-day workshop will bring together researchers from the various subdisciplines of Geometric Data Analysis, such as manifold learning, topological data analysis, and shape analysis; it will showcase recent progress in this field and establish directions for future research. The focus will be on high-dimensional and big data, and on mathematically founded methodology.


Specific aims
=============
One aim of this workshop is to build connections between Topological Data Analysis on one side and Manifold Learning on the other. This is starting to happen, after years of more or less separate evolution of the two fields. The moment has been reached when the mathematical, statistical and algorithmic foundations of both areas are mature enough -- it is now time to lay the foundations for joint topological and differential geometric understanding of data, and this workshop will explicitly focus on this process.

The second aim is to bring GDA closer to real applications. We see the challenge of real problems and real data as a motivator for researchers to explore new research questions, to reframe and expand the existing theory, and to step out of their own sub-area. In particular, for people in GDA to see TDA and ML as one. …

Suvrit Sra · Sashank J. Reddi · Alekh Agarwal · Benjamin Recht

[ Hall A ]

Dear NIPS Workshop Chairs,

We propose to organize the workshop:

OPT 2017: Optimization for Machine Learning.

This year marks a major milestone in the history of OPT, as it will be the 10th anniversary edition of this long running NIPS workshop.

The previous OPT workshops enjoyed packed to overpacked attendance. This huge interest is no surprise: optimization is the 2nd largest topic at NIPS and is indeed foundational for the wider ML community.

Looking back over the past decade, a strong trend is apparent: The intersection of OPT and ML has grown monotonically to the point that now several cutting-edge advances in optimization arise from the ML community. The distinctive feature of optimization within ML is its departure from textbook approaches, in particular, by having a different set of goals driven by “big-data,” where both models and practical implementation are crucial.

This intimate relation between OPT and ML is the core theme of our workshop. OPT workshops have previously covered a variety of topics, such as frameworks for convex programs (D. Bertsekas), the intersection of ML and optimization, especially SVM training (S. Wright), large-scale learning via stochastic gradient methods and its tradeoffs (L. Bottou, N. Srebro), exploitation of structured sparsity …

Douglas Eck · David Ha · S. M. Ali Eslami · Sander Dieleman · Rebecca Fiebrink · Luba Elliott

[ Hyatt Hotel, Seaview Ballroom ]

In the last year, generative machine learning and machine creativity have gotten a lot of attention in the non-research world. At the same time there have been significant advances in generative models for media creation and for design. This one-day workshop explores several issues in the domain of generative models for creativity and design. First, we will look at algorithms for generation and creation of new media and new designs, engaging researchers building the next generation of generative models (GANs, RL, etc.) as well as those taking a more information-theoretic view of creativity (compression, entropy, etc.). Second, we will investigate the social and cultural impact of these new models, engaging researchers from HCI/UX communities. Finally, we’ll hear from some of the artists and musicians who are adopting machine learning approaches like deep learning and reinforcement learning as part of their artistic process. We’ll leave ample time for discussing both the important technical challenges of generative models for creativity and design and the philosophical and cultural issues that surround this area of research.

Background
In 2016, DeepMind’s AlphaGo made two moves against Lee Sedol that were described by the Go community as “brilliant,” “surprising,” “beautiful,” and so forth. Moreover, there was …

Jacob Steinhardt · Nicolas Papernot · Bo Li · Chang Liu · Percy Liang · Dawn Song

[ Hyatt Hotel, Shoreline ]

While traditional computer security relies on well-defined attack models and proofs of security, a science of security for machine learning systems has proven more elusive. This is due to a number of obstacles, including (1) the highly varied angles of attack against ML systems, (2) the lack of a clearly defined attack surface (because the source of the data analyzed by ML systems is not easily traced), and (3) the lack of clear formal definitions of security that are appropriate for ML systems. At the same time, security of ML systems is of great import due to the recent trend of using ML systems as a line of defense against malicious behavior (e.g., network intrusion, malware, and ransomware), as well as the prevalence of ML systems as parts of sensitive and valuable software systems (e.g., sentiment analyzers for predicting stock prices). This workshop will bring together experts from the computer security and machine learning communities in an attempt to highlight recent work in this area, as well as to clarify the foundations of secure ML and chart out important directions for future work and cross-community collaborations.

Aparna Lakshmiratan · Sarah Bird · Siddhartha Sen · Christopher Ré · Li Erran Li · Joseph Gonzalez · Daniel Crankshaw

[ S1 ]

A new area is emerging at the intersection of artificial intelligence, machine learning, and systems design. This birth is driven by the explosive growth of diverse applications of ML in production, the continued growth in data volume, and the complexity of large-scale learning systems. The goal of this workshop is to bring together experts working at the crossroads of machine learning, system design and software engineering to explore the challenges faced when building practical large-scale ML systems. In particular, we aim to elicit new connections among these diverse fields, and identify tools, best practices and design principles. We also want to think about how to do research in this area and properly evaluate it. The workshop will cover ML and AI platforms and algorithm toolkits, as well as dive into machine learning-focused developments in distributed learning platforms, programming languages, data structures, GPU processing, and other topics.

This workshop will follow the successful model we have previously run at ICML, NIPS and SOSP 2017.

Our plan is to run this workshop annually at one ML venue and one Systems venue, and eventually merge these communities into a full conference venue. We believe this dual approach will help to create a low …

George H Chen · Devavrat Shah · Christina Lee

[ 201 B ]

Many modern methods for prediction leverage nearest neighbor (NN) search to find past training examples most similar to a test example, an idea that dates back in text to at least the 11th century in the “Book of Optics” by Alhazen. Today, NN methods remain popular, often as a cog in a bigger prediction machine, used for instance in recommendation systems, forecasting baseball player performance and election outcomes, survival analysis in healthcare, image in-painting, crowdsourcing, graphon estimation, and more. The popularity of NN methods is due in no small part to the proliferation of high-quality fast approximate NN search methods that scale to high-dimensional massive datasets typical of contemporary applications. Moreover, NN prediction readily pairs with methods that learn similarities, such as metric learning methods or Siamese networks. In fact, some well-known pairings that result in nearest neighbor predictors that learn similarities include random forests and many boosting methods.
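
The core idea described above, predicting from the training examples most similar to a test example, can be sketched in a few lines. This is only an illustrative brute-force k-nearest-neighbor classifier, not a method taken from the workshop: the function name `knn_predict`, the Euclidean distance, the majority vote, and the toy data are all assumptions made for the example, and practical systems would typically replace the exhaustive scan with an approximate NN index.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to `query`.

    Hypothetical helper for illustration: exhaustive search with Euclidean distance.
    """
    # Distance from the query to every training point (brute-force search).
    distances = [math.dist(point, query) for point in train_points]
    # Indices of the k training points with the smallest distances.
    nearest = sorted(range(len(train_points)), key=lambda i: distances[i])[:k]
    # Majority vote over the labels of those neighbors.
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data, made up for this sketch: two well-separated clusters.
train_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (0.15, 0.15)))
print(knn_predict(train_points, train_labels, (0.95, 1.0)))
```

Swapping `math.dist` for a learned similarity is where pairings such as metric learning would plug in; the voting step stays the same.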

Despite the popularity, success, and age of nearest neighbor methods, our theoretical understanding of them is still surprisingly incomplete (perhaps much to the chagrin of the initial efforts of analysis by Fix, Hodges, Cover, and Hart) and can also be disconnected from what practitioners actually want or care about. Many successful …

Ingmar Posner · Raia Hadsell · Martin Riedmiller · Markus Wulfmeier · Rohan Paul

[ 104 B ]

In recent years robotics has made significant strides towards applications of real value to the public domain. Robots are now increasingly expected to work for and alongside us in complex, dynamic environments. Machine learning has been a key enabler of this success, particularly in the realm of robot perception where, due to substantial overlap with the machine vision community, methods and training data can be readily leveraged.

Recent advances in reinforcement learning and learning from demonstration, geared towards teaching agents how to act, provide a tantalising glimpse at a promising future trajectory for robot learning. Mastery of challenges such as the Atari suite and AlphaGo builds significant excitement as to what our robots may be able to do for us in the future. However, this success relies on the ability to learn cheaply, often within the confines of a virtual environment, by trial and error over as many episodes as required. This presents a significant challenge for embodied systems acting and interacting in the real world. Not only is there a cost (either monetary or in terms of execution time) associated with a particular trial, thus limiting the amount of training data obtainable, but there also exist safety …

Sergio Escalera · Markus Weimer

[ 103 A+B ]

This is the first NIPS edition on "NIPS Competitions". We received 23 competition proposals related to data-driven and live competitions on different aspects of NIPS. Proposals were reviewed by several qualified researchers and experts in challenge organization. The five top-scoring competitions were accepted to be run and to present their results during the NIPS 2017 Competition track day. Evaluation was based on the quality of data, problem interest and impact, promotion of the design of new models, and a proper schedule and management procedure. Below, you can find the five accepted competitions. Organizers and participants in these competitions will be invited to present their work at this workshop, to be held on December 8th.

Accepted competitions:
- The Conversational Intelligence Challenge. Webpage: http://convai.io
- Classifying Clinically Actionable Genetic Mutations. Webpage: https://www.kaggle.com/c/msk-redefining-cancer-treatment
- Learning to Run. Webpage: https://www.crowdai.org/challenges/nips-2017-learning-to-run
- Human-Computer Question Answering Competition. Webpage: http://sites.google.com/view/hcqa/
- Adversarial Attacks and Defences. Webpage: https://www.kaggle.com/nips-2017-adversarial-learning-competition

William Herlands · Maria De-Arteaga

[ S7 ]

Six billion people live in developing world countries. The unique development challenges faced by these regions have long been studied by researchers in fields ranging from sociology to statistics and ecology to economics. With the emergence of mature machine learning methods in the past decades, researchers from many fields - including core machine learning - are increasingly turning to machine learning to study and address challenges in the developing world. This workshop is about delving into the intersection of machine learning and development research.

Machine learning presents tremendous potential for development research and practice. Supervised methods can provide expert telemedicine decision support in regions with few resources; deep learning techniques can analyze satellite imagery to create novel economic indicators; NLP algorithms can preserve and translate obscure languages, some of which are only spoken. Yet, there are notable challenges with machine learning in the developing world. Data cleanliness, computational capacity, power availability, and internet accessibility are more limited than in developed countries. Additionally, the specific applications differ from what many machine learning researchers normally encounter. The confluence of machine learning's immense potential with the practical challenges posed by developing world settings has inspired a growing body of research at the intersection of machine …

Manik Varma · Marius Kloft · Krzysztof Dembczynski

[ Hyatt Hotel, Regency Ballroom A+B+C ]

Extreme classification is a rapidly growing research area focussing on multi-class and multi-label problems involving an extremely large number of labels. Many applications have been found in diverse areas ranging from language modelling to document tagging in NLP, face recognition to learning universal feature representations in computer vision, gene function prediction in bioinformatics, etc. Extreme classification has also opened up a new paradigm for ranking and recommendation by reformulating them as multi-label learning tasks where each item to be ranked or recommended is treated as a separate label. Such reformulations have led to significant gains over traditional collaborative filtering and content-based recommendation techniques. Consequently, extreme classifiers have been deployed in many real-world applications in industry.

Extreme classification raises a number of interesting research questions including those related to:

* Large scale learning and distributed and parallel training
* Log-time and log-space prediction and prediction on a test-time budget
* Label embedding and tree based approaches
* Crowd sourcing, preference elicitation and other data gathering techniques
* Bandits, semi-supervised learning and other approaches for dealing with training set biases and label noise
* Bandits with an extremely large number of arms
* Fine-grained classification
* Zero shot learning and …

Ian Goodfellow · Tim Hwang · Bryce Goodman · Mikel Rodriguez

[ 202 ]

Machine deception refers to the capacity for machine learning systems to manipulate human and machine agents into believing, acting upon or otherwise accepting false information. The development of machine deception has had a long, foundational and under-appreciated impact on shaping research in the field of artificial intelligence. Thought experiments such as Alan Turing’s eponymous “Turing test” - where an automated system attempts to deceive a human judge into believing it is a human interlocutor, or Searle’s “Chinese room” - in which a human operator attempts to imbue the false impression of consciousness in a machine, are simultaneously exemplars of machine deception and some of the most famous and influential concepts in the field of AI.

As the field of machine learning advances, so too does machine deception seem poised to give rise to a host of practical opportunities and concerns. Machine deception can have many benign and beneficial applications. Chatbots designed to mimic human agents offer technical support and even provide therapy at a cost and scale that may not be otherwise achievable. On the other hand, the rise of techniques that leverage bots and other autonomous agents to manipulate and shape political speech online has put machine deception in …

Florian Strub · Harm de Vries · Abhishek Das · Satwik Kottur · Stefan Lee · Mateusz Malinowski · Olivier Pietquin · Devi Parikh · Dhruv Batra · Aaron Courville · Jeremie Mary

[ 101 B ]

Everyday interactions require a common understanding of language, i.e. for people to communicate effectively, words (for example ‘cat’) should invoke similar beliefs over physical concepts (what cats look like, the sounds they make, how they behave, what their skin feels like etc.). However, how this ‘common understanding’ emerges is still unclear.

One appealing hypothesis is that language is tied to how we interact with the environment. As a result, meaning emerges by ‘grounding’ language in modalities in our environment (images, sounds, actions, etc.).

Recent concurrent works in machine learning have focused on bridging visual and natural language understanding through visually-grounded language learning tasks, e.g. through natural images (Visual Question Answering, Visual Dialog), or through interactions with virtual physical environments. In cognitive science, progress in fMRI enables creating a semantic atlas of the cerebral cortex or decoding semantic information from visual input. And in psychology, recent studies show that a baby’s most likely first words are based on their visual experience, laying the foundation for a new theory of infant language acquisition and learning.

As the grounding problem requires an interdisciplinary attitude, this workshop aims to gather researchers with broad expertise in various fields — machine learning, computer vision, natural …

Alborz Geramifard · Jason Williams · Larry Heck · Jim Glass · Antoine Bordes · Steve Young · Gerald Tesauro

[ Grand Ballroom B ]

In the span of only a few years, conversational systems have become commonplace. Every day, millions of people use natural-language interfaces such as Siri, Google Now, Cortana, Alexa, Facebook M and others via in-home devices, phones, or messaging channels such as Messenger, Slack, and Skype. At the same time, interest among the research community in conversational systems has blossomed: for supervised and reinforcement learning, conversational systems often serve as both a benchmark task and an inspiration for new ML methods at conferences which don't focus on speech and language per se, such as NIPS, ICML, IJCAI, and others. Research community challenge tasks are proliferating, including the sixth Dialog Systems Technology Challenge (DSTC6), the Amazon Alexa prize, and the Conversational Intelligence Challenge live competition at NIPS 2017.

Now more than ever, it is crucial to promote cross-pollination of ideas between academic research centers and industry. The goal of this workshop is to bring together researchers and practitioners in this area, to clarify impactful research problems, share findings from large-scale real-world deployments, and generate new ideas for future lines of research.   

This workshop will include invited talks from academia and industry, contributed work, and open discussion.  In these talks, senior technical …

Kristof Schütt · Klaus-Robert Müller · Anatole von Lilienfeld · José Miguel Hernández-Lobato · Alan Aspuru-Guzik · Bharath Ramsundar · Matt Kusner · Brooks Paige · Stefan Chmiela · Alexandre Tkatchenko · Koji Tsuda

[ S4 ]

The success of machine learning has been demonstrated time and time again in classification, generative modelling, and reinforcement learning. In particular, we have recently seen interesting developments where ML has been applied to the natural sciences (chemistry, physics, materials science, neuroscience and biology). In these fields, data is often scarce and very costly to obtain. This workshop will focus on the unique challenges of applying machine learning to molecules and materials.

Accurate prediction of chemical and physical properties is a crucial ingredient toward rational compound design in chemical and pharmaceutical industries. Many discoveries in chemistry can be guided by screening large databases of computational molecular structures and properties, but high-level quantum-chemical calculations can take up to several days per molecule or material at the required accuracy, placing the ultimate achievement of in silico design out of reach for the foreseeable future. In large part the current state of the art for such problems is the expertise of individual researchers or at best highly-specific rule-based heuristic systems. Efficient methods in machine learning, applied to property and structure prediction, can therefore have a pivotal impact in enabling chemical discovery and fostering fundamental insights.

Because of this, in the past few years there has been …

Jason Fries · Alex Wiltschko · Andrew Beam · Isaac S Kohane · Jasper Snoek · Peter Schulam · Madalina Fiterau · David Kale · Rajesh Ranganath · Bruno Jedynak · Michael Hughes · Tristan Naumann · Natalia Antropova · Adrian Dalca · Shubhi Asthana · Prateek Tandon · Jaz Kandola · Uri Shalit · Marzyeh Ghassemi · Tim Althoff · Alexander Ratner · Jumana Dakka

[ 104 A ]

The goal of the NIPS 2017 Machine Learning for Health Workshop (ML4H) is to foster collaborations that meaningfully impact medicine by bringing together clinicians, health data experts, and machine learning researchers. We aim to build on the success of the last two NIPS ML4H workshops which were widely attended and helped form the foundations of a new research community.

This year’s program emphasizes identifying previously unidentified problems in healthcare that the machine learning community hasn't addressed, or seeing old challenges through a new lens. While healthcare and medicine are often touted as prime examples for disruption by AI and machine learning, there has been vanishingly little evidence of this disruption to date. To interested parties who are outside of the medical establishment (e.g. machine learning researchers), the healthcare system can appear byzantine and impenetrable, which results in a high barrier to entry. In this workshop, we hope to reduce this activation energy by bringing together leaders at the forefront of both machine learning and healthcare for a dialog on areas of medicine that have immediate opportunities for machine learning. Attendees at this workshop will quickly gain an understanding of the key problems that are unique to healthcare and how machine …

Ricardo Silva · Panagiotis Toulis · John Shawe-Taylor · Alexander Volfovsky · Thorsten Joachims · Lihong Li · Nathan Kallus · Adith Swaminathan

[ Hall C ]

In recent years machine learning and causal inference have both seen important advances, especially through a dramatic expansion of their theoretical and practical domains. Machine learning has focused on ultra high-dimensional models and scalable stochastic algorithms, whereas causal inference has been guiding policy in complex domains involving economics, social and health sciences, and business. Through such advances a powerful cross-pollination has emerged as a new set of methodologies promising to deliver more robust data analysis than either field could individually. Examples include doubly-robust methods, targeted learning, double machine learning, and causal trees, all of which have recently been introduced.

This workshop is aimed at facilitating more interactions between researchers in machine learning and causal inference. In particular, it is an opportunity to bring together highly technical individuals who are strongly motivated by the practical importance and real-world impact of their work. Cultivating such interactions will lead to the development of theory, methodology, and - most importantly - practical tools, that better target causal questions across different domains.

In particular, we will highlight theory, algorithms and applications on automatic decision making systems, such as recommendation engines, medical decision systems and self-driving cars, as both producers and users of …

Eva Dyer · Gregory Kiar · William Gray Roncal · Konrad P Koerding · Joshua T Vogelstein

[ 204 ]

Datasets in neuroscience are increasing in size at alarming rates relative to our ability to analyze them. This workshop aims at discussing new frameworks for processing and making sense of large neural datasets.

The morning session will focus on approaches for processing large neuroscience datasets. Examples include: distributed + high-performance computing, GPU and other hardware accelerations, spatial databases and other compression schemes used for large neuroimaging datasets, online machine learning approaches for handling large data sizes, randomization and stochastic optimization.

The afternoon session will focus on abstractions for modelling large neuroscience datasets. Examples include graphs, graphical models, manifolds, mixture models, latent variable models, spatial models, and factor learning.

In addition to talks and discussions, we plan to have papers submitted and peer reviewed. Workshop “proceedings” will consist of links to unpublished arXiv or bioRxiv papers that are of exceptional quality and are well aligned with the workshop scope. Some accepted papers will also be invited for an oral presentation; the remaining authors will be invited to present a poster.

Ben Glocker · Ender Konukoglu · Hervé Lombaert · Kanwal Bhatia

[ 103 A+B ]

Scope

'Medical Imaging meets NIPS' is a satellite workshop at NIPS 2017. The workshop aims to bring researchers together from the medical image computing and machine learning community. The objective is to discuss the major challenges in the field and opportunities for joining forces. The event will feature a series of high-profile invited speakers from industry, academia, engineering and medical sciences who aim to give an overview of recent advances, challenges, latest technology and efforts for sharing clinical data.

Motivation

Medical imaging is facing a major crisis with an ever increasing complexity and volume of data and immense economic pressure. The interpretation of medical images pushes human abilities to the limit with the risk that critical patterns of disease go undetected. Machine learning has emerged as a key technology for developing novel tools in computer aided diagnosis, therapy and intervention. Still, progress is slow compared to other fields of visual recognition, mainly due to the domain complexity and the constraints of clinical applications, which require highly robust, accurate, and reliable solutions.

Call for Abstracts

We invite submissions of extended abstracts for poster presentation during the workshop. Submitting an abstract is an ideal way of engaging with the workshop and …

Yarin Gal · José Miguel Hernández-Lobato · Christos Louizos · Andrew Wilson · Diederik Kingma · Zoubin Ghahramani · Kevin Murphy · Max Welling

[ Hall C ]

While deep learning has been revolutionary for machine learning, most modern deep learning models cannot represent their uncertainty nor take advantage of the well studied tools of probability theory. This has started to change following recent developments of tools and techniques combining Bayesian approaches with deep learning. The intersection of the two fields has received great interest from the community over the past few years, with the introduction of new deep learning models that take advantage of Bayesian techniques, as well as Bayesian models that incorporate deep learning elements [1-11]. In fact, the use of Bayesian techniques in deep learning can be traced back to the 1990s, in seminal works by Radford Neal [12], David MacKay [13], and Dayan et al. [14]. These gave us tools to reason about deep models’ confidence, and achieved state-of-the-art performance on many tasks. However, earlier tools did not adapt when new needs arose (such as scalability to big data), and were consequently forgotten. Such ideas are now being revisited in light of new advances in the field, yielding many exciting new results.

Building on the success of last year’s workshop, this workshop will again study the advantages and disadvantages of such ideas, and will be a …

Michael Mozer · Brenden Lake · Angela Yu

[ 104 A ]

The goal of this workshop is to bring together cognitive scientists, neuroscientists, and AI researchers to discuss opportunities for improving machine learning by leveraging our scientific understanding of human perception and cognition. There is a history of making these connections: artificial neural networks were originally motivated by the massively parallel, deep architecture of the brain; considerations of biological plausibility have driven the development of learning procedures; and architectures for computer vision draw parallels to the connectivity and physiology of mammalian visual cortex. However, beyond these celebrated examples, cognitive science and neuroscience have fallen short of their potential to influence the next generation of AI systems. Areas such as memory, attention, and development have rich theoretical and experimental histories, yet these concepts, as applied to AI systems so far, only bear a superficial resemblance to their biological counterparts.

The premise of this workshop is that there are valuable data and models from cognitive science that can inform the development of intelligent adaptive machines, and can endow learning architectures with the strength and flexibility of the human cognitive architecture. The structures and mechanisms of the mind and brain can provide the sort of strong inductive bias needed for machine-learning systems to attain …

Jakob Foerster · Igor Mordatch · Angeliki Lazaridou · Kyunghyun Cho · Douwe Kiela · Pieter Abbeel

[ S4 ]

Communication is one of the most impressive human abilities. The question of how communication arises has been studied for many decades, if not centuries. However, due to computational and representational limitations, problem settings in the past had to be restricted to low-dimensional, simple observation spaces. With the rise of deep reinforcement learning methods, this question can now be studied in complex multi-agent settings, which has led to flourishing activity in the area over the last two years. In these settings agents can learn to communicate in grounded multi-modal environments and rich communication protocols emerge.

However, the recent research has been largely disconnected from the study of emergent communication in other fields and even from work done on this topic in previous decades. This workshop will provide a forum for a variety of researchers from different fields (machine learning, game theory, linguistics, cognitive science, and programming languages) interested in the question of communication and emergent language to exchange ideas.

https://sites.google.com/site/emecom2017/

Li Erran Li · Anca Dragan · Juan Carlos Niebles · Silvio Savarese

[ 201 A ]

Our transportation systems are poised for a transformation as we make progress on autonomous vehicles, vehicle-to-vehicle (V2V) and vehicle-to-everything (V2X) communication infrastructures, and smart road infrastructures such as smart traffic lights.
There are many challenges in transforming our current transportation systems into this future vision. For example, how do we make perception accurate and robust enough to accomplish safe autonomous driving? How do we learn long-term driving strategies (known as driving policies) so that autonomous vehicles can be equipped with adaptive human negotiation skills when merging, overtaking, giving way, etc.? How do we achieve near-zero fatality? How do we optimize efficiency through intelligent traffic management and control of fleets? How do we optimize for traffic capacity during rush hours? To meet these requirements in safety, efficiency, control, and capacity, the systems must be automated with intelligent decision making.

Machine learning will be essential to enable intelligent transportation systems. Machine learning has made rapid progress in self-driving, e.g. real-time perception and prediction of traffic scenes, and has started to be applied to ride-sharing platforms such as Uber (e.g. demand forecasting) and crowd-sourced video scene analysis companies such as Nexar (understanding and avoiding accidents). To address the challenges arising in our future transportation …

Sanjeev Arora · Maithra Raghu · Russ Salakhutdinov · Ludwig Schmidt · Oriol Vinyals

[ Hall A ]

The past five years have seen a huge increase in the capabilities of deep neural networks. Maintaining this rate of progress, however, faces some steep challenges and awaits fundamental insights. As our models become more complex, and venture into areas such as unsupervised learning or reinforcement learning, designing improvements becomes more laborious, and success can be brittle and hard to transfer to new settings.

This workshop seeks to highlight recent works that use theory as well as systematic experiments to isolate the fundamental questions that need to be addressed in deep learning. These have helped flesh out core questions on topics such as generalization, adversarial robustness, large batch training, generative adversarial nets, and optimization, and point towards elements of the theory of deep learning that is expected to emerge in the future.

The workshop aims to enhance this confluence of theory and practice, highlighting influential work with these methods, future open directions, and core fundamental problems. There will be an emphasis on discussion, via panels and round tables, to identify future research directions that are promising and tractable.

Isabelle Guyon · Evelyne Viegas · Sergio Escalera · Jacob D Abernethy

[ S1 ]

Challenges in machine learning and data science are competitions running over several weeks or months to resolve problems using provided datasets or simulated environments. The playful nature of challenges naturally attracts students, making challenges a great teaching resource. For this fourth edition of the CiML workshop at NIPS we want to explore the impact of machine learning challenges as a research tool. The workshop will devote a large part to discussions around several axes: (1) benefits and limitations of challenges as a research tool; (2) methods to induce and train young researchers; (3) experimental design to foster contributions that will push the state of the art.
CiML is a forum that brings together workshop organizers, platform providers, and participants to discuss best practices in challenge organization and new methods and application opportunities to design high impact challenges. Following the success of last year's workshop, in which a fruitful exchange led to many innovations, we propose to reconvene and discuss new opportunities for challenges as a research tool, one of the hottest topics identified in last year's discussions. We have invited prominent speakers in this field.
We will also reserve time for an open discussion to dig into other topics including …

Dylan Hadfield-Menell · Jacob Steinhardt · David Duvenaud · David Krueger · Anca Dragan

[ 201 B ]

In order to be helpful to users and to society at large, an autonomous agent needs to be aligned with the objectives of its stakeholders. Misaligned incentives are a common and crucial problem with human agents, and we should expect similar challenges to arise from misaligned incentives with artificial agents. For example, it is not uncommon to see reinforcement learning agents ‘hack’ their specified reward function. How do we build learning systems that will reliably achieve a user's intended objective? How can we ensure that autonomous agents behave reliably in unforeseen situations? How do we design systems whose behavior will be aligned with the values and goals of society at large? As AI capabilities develop, it is crucial for the AI community to come to satisfying and trustworthy answers to these questions. This workshop will focus on three central challenges in value alignment: learning complex rewards that reflect human preferences (e.g. meaningful oversight, preference elicitation, inverse reinforcement learning, learning from demonstration or feedback), engineering reliable AI systems (e.g. robustness to distributional shift, model misspecification, or adversarial data, via methods such as adversarial training, KWIK-style learning, or transparency to human inspection), and dealing with bounded rationality and incomplete information in both …

Erich Elsen · Danijar Hafner · Zak Stone · Brennan Saeta

[ 101 B ]

Five years ago, it took more than a month to train a state-of-the-art image recognition model on the ImageNet dataset. Earlier this year, Facebook demonstrated that such a model could be trained in an hour. However, if we could parallelize this training problem across the world’s fastest supercomputers (~100 PFlops), it would be possible to train the same model in under a minute. This workshop is about closing that gap: how can we turn months into minutes and increase the productivity of machine learning researchers everywhere?

This one-day workshop will facilitate active debate and interaction across many different disciplines. The conversation will range from algorithms to infrastructure to silicon, with invited speakers from Cerebras, DeepMind, Facebook, Google, OpenAI, and other organizations. When should synchronous training be preferred over asynchronous training? Are large batch sizes the key to reaching supercomputer scale, or is it possible to fully utilize a supercomputer at batch size one? How important is sparsity in enabling us to scale? Should sparsity patterns be structured or unstructured? To what extent do we expect to customize model architectures for particular problem domains, and to what extent can a “single model architecture” deliver state-of-the-art results across many different domains? How …

Katherine Gorman

[ Hyatt Hotel, Shoreline ]

For many in the sciences, collaboration is a given, or at least a given assumption. The field of AIML is no different, and collaboration across fields and disciplines has long been a source of data and funding. But for many, effective collaboration can be confounding, and for those who have never worked with someone from a different field, it can be confusing and daunting.

Good collaboration requires good communication, but more fundamentally, clear communication is a core skillset for anyone. It takes practice, and in highly specialized fields, it is often subject to an all-too-common malady: the curse of knowledge. The curse of knowledge happens when experts in a field, communicating within their field, begin to make assumptions about the knowledge and understanding of their audience and begin to overlook the fundamentals of clear communication. They do this because, for an audience of their peers, those fundamentals seem less necessary, while shortcuts like jargon seem to make communication faster and more efficient. But today, clear communication around issues and techniques in machine intelligence work is crucial not only within the community, but also to foster collaboration across disciplines, and between the community and the lay public.

In this …

Roberto Calandra · Frank Hutter · Hugo Larochelle · Sergey Levine

[ Hyatt Beacon Ballroom D+E+F+H ]

Recent years have seen rapid progress in meta-learning methods, which learn (and optimize) the performance of learning methods based on data, generate new learning methods from scratch, and learn to transfer knowledge across tasks and domains. Meta-learning can be seen as the logical conclusion of the arc that machine learning has undergone in the last decade, from learning classifiers, to learning representations, and finally to learning algorithms that themselves acquire representations and classifiers. The ability to improve one’s own learning capabilities through experience can also be viewed as a hallmark of intelligent beings, and there are strong connections with work on human learning in neuroscience.

Meta-learning methods are also of substantial practical interest, since they have, e.g., been shown to yield new state-of-the-art automated machine learning methods, novel deep learning architectures, and substantially improved one-shot learning systems.

Some of the fundamental questions that this workshop aims to address are:
- What are the fundamental differences in the learning “task” compared to traditional “non-meta” learners?
- Is there a practical limit to the number of meta-learning layers (e.g., would a meta-meta-meta-learning algorithm be of practical use)?
- How can we design more sample-efficient meta-learning methods?
- How can we exploit our …

James Zou · Anshul Kundaje · Gerald Quon · Nicolo Fusi · Sara Mostafavi

[ 104 B ]

The field of computational biology has seen dramatic growth over the past few years. A wide range of high-throughput technologies developed in the last decade now enable us to measure parts of a biological system at various resolutions: at the genome, epigenome, transcriptome, and proteome levels. These technologies are now being used to collect data for an increasingly diverse set of problems, ranging from classical problems such as predicting differentially regulated genes between time points and predicting subcellular localization of RNA and proteins, to models that explore complex mechanistic hypotheses bridging the gap between genetics and disease, population genetics and transcriptional regulation. Fully realizing the scientific and clinical potential of these data requires developing novel supervised and unsupervised learning methods that are scalable, can accommodate heterogeneity, are robust to systematic noise and confounding factors, and provide mechanistic insights.

The goals of this workshop are to i) present emerging problems and innovative machine learning techniques in computational biology, and ii) generate discussion on how to best model the intricacies of biological data and synthesize and interpret results in light of the current work in the field. We will invite several leaders at the intersection of computational biology and machine learning who will …

Hrishikesh Aradhye · Joaquin Quinonero Candela · Rohit Prasad

[ 102 A+B ]

Deep Machine Learning has changed the computing paradigm. Products of today are built with machine intelligence as a central attribute, and consumers are beginning to expect near-human interaction with the appliances they use. However, much of the Deep Learning revolution has been limited to the cloud, enabled by popular toolkits such as Caffe, TensorFlow, and MXNet, and by specialized hardware such as TPUs. In comparison, mobile devices until recently were just not fast enough, there were limited developer tools, and there were limited use cases that required on-device machine learning. That has recently started to change, with the advances in real-time computer vision and spoken language understanding driving real innovation in intelligent mobile applications. Several mobile-optimized neural network libraries were recently announced (CoreML [1], Caffe2 for mobile [2], TensorFlow Lite [3]), which aim to dramatically reduce the barrier to entry for mobile machine learning. Innovation and competition at the silicon layer have enabled new possibilities for hardware acceleration. To make things even better, mobile-optimized versions of several state-of-the-art benchmark models were recently open sourced [4]. The widespread increase in the availability of connected “smart” appliances for consumers and IoT platforms for industrial use cases means that there is an ever-expanding surface area …

Klaus-Robert Müller · Andrea Vedaldi · Lars K Hansen · Wojciech Samek · Grégoire Montavon

[ Hyatt Hotel, Regency Ballroom A+B+C ]

Machine learning has become an indispensable tool for a number of tasks ranging from the detection of objects in images to the understanding of natural languages. While these models reach impressively high predictive accuracy, they are often perceived as black boxes, and it is not clear what information in the input data is used for predicting. In sensitive applications such as medical diagnosis or self-driving cars, where a single incorrect prediction can be very costly, the reliance of the model on the right features must be guaranteed. This lowers the risk that the model behaves erroneously in the presence of novel factors of variation in the test data. Furthermore, interpretability is instrumental when applying machine learning to the sciences, as the detailed understanding of the trained model (e.g., what features it uses to capture the complex relations between physical or biological variables) is a prerequisite for building meaningful new scientific hypotheses. Without such understanding and the possibility of verifying that the model has learned something meaningful (e.g. obeying the known physical or biological laws), even the best predictor is of no use for scientific purposes. Finally, also from the perspective of a deep learning engineer, being able to visualize what the …

Emily Denton · Siddharth Narayanaswamy · Tejas Kulkarni · Honglak Lee · Diane Bouchacourt · Josh Tenenbaum · David Pfau

[ 203 ]

An important facet of human experience is our ability to break down what we observe and interact with, along characteristic lines. Visual scenes consist of separate objects, which may have different poses and identities within their category. In natural language, the syntax and semantics of a sentence can often be separated from one another. In planning and cognition, plans can be broken down into immediate and long-term goals. Inspired by this, much research in deep representation learning has gone into finding disentangled factors of variation. However, this research often lacks a clear definition of what disentangling is, or much relation to work in other branches of machine learning, neuroscience or cognitive science. In this workshop we intend to bring a wide swathe of scientists studying disentangled representations under one roof to try to come to a unified view of the problem of disentangling.

The workshop will address these issues through 3 focuses:
What is disentangling: Are disentangled representations just the same as statistically independent representations, or is there something more? How does disentangling relate to interpretability? Can we define what it means to separate style and content, or is human judgement the final arbiter? Are disentangled representations the same …

Isabelle Augenstein · Stephen Bach · Eugene Belilovsky · Matthew Blaschko · Christoph Lampert · Edouard Oyallon · Emmanouil Antonios Platanios · Alexander Ratner · Christopher Ré

[ Grand Ballroom B ]

Modern representation learning techniques like deep neural networks have had a major impact both within and beyond the field of machine learning, achieving new state-of-the-art performances with little or no feature engineering on a vast array of tasks. However, these gains are often difficult to translate into real-world settings as they require massive hand-labeled training sets. And in the vast majority of real-world settings, collecting such training sets by hand is infeasible due to the cost of labeling data or the paucity of data in a given domain (e.g. rare diseases in medical applications). In this workshop we focus on techniques for few-sample learning and for using weaker supervision when large unlabeled datasets are available, as well as the theory associated with both.

One increasingly popular approach is to use weaker forms of supervision—i.e. supervision that is potentially noisier, biased, and/or less precise. An overarching goal of such approaches is to use domain knowledge and resources from subject matter experts, but to solicit it in higher-level, lower-fidelity, or more opportunistic ways. Examples include higher-level abstractions such as heuristic labeling rules, feature annotations, constraints, expected distributions, and generalized expectation criteria; noisier or biased labels from distant supervision, crowd workers, and weak classifiers; …

Alex Wiltschko · Bart van Merriënboer · Pascal Lamblin

[ 104 C ]

Many algorithms in machine learning, computer vision, physical simulation, and other fields require the calculation of gradients and other derivatives. Manual derivation of gradients can be time consuming and error-prone. Automatic differentiation comprises a set of techniques to calculate the derivative of a numerical computation expressed as a computer program. These techniques are commonly used in atmospheric sciences and computational fluid dynamics, and have more recently also been adopted by machine learning researchers.
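
As a concrete illustration of differentiating "a numerical computation expressed as a computer program", the following is a minimal sketch of forward-mode automatic differentiation using dual numbers. It is not taken from any tool discussed at the workshop; the names `Dual`, `sin`, `derivative`, and the example function `f` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Hypothetical dual number: a value paired with its derivative."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u * v' + u' * v
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sin(x):
    """Sine with the chain rule applied: (sin u)' = cos(u) * u'."""
    x = _lift(x)
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def derivative(f, x):
    """Evaluate df/dx at x by seeding the input derivative with 1."""
    return f(Dual(float(x), 1.0)).deriv


def f(x):
    # An ordinary program; no symbolic expression of f is ever built.
    return x * x * x + 2 * x + sin(x)


if __name__ == "__main__":
    # Analytically, f'(x) = 3x^2 + 2 + cos(x).
    print(derivative(f, 2.0), 3 * 2.0 ** 2 + 2 + math.cos(2.0))
```

Forward mode is used here only because it fits in a few lines. Machine learning frameworks typically rely on reverse mode, which is more efficient when a scalar loss depends on many parameters.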

Practitioners across many fields have built a wide set of automatic differentiation tools, using different programming languages, computational primitives and intermediate compiler representations. Each of these choices comes with positive and negative trade-offs, in terms of their usability, flexibility and performance in specific domains.

This workshop will bring together researchers in the fields of automatic differentiation and machine learning to discuss ways in which advanced automatic differentiation frameworks and techniques can enable more advanced machine learning models, run large-scale machine learning on accelerators with better performance, and increase the usability of machine learning frameworks for practitioners. Topics for discussion will include:

* What abstractions (languages, kernels, interfaces, instruction sets) do we need to develop advanced automatic differentiation frameworks for the machine learning ecosystem?
* What different use …

Olivier Bousquet · Marco Cuturi · Gabriel Peyré · Fei Sha · Justin Solomon

[ Hyatt Hotel, Seaview Ballroom ]

Optimal transport (OT) is gradually establishing itself as a powerful and essential tool to compare probability measures, which in machine learning take the form of point clouds, histograms, bags-of-features, or more generally datasets to be compared with probability densities and generative models. OT can be traced back to early work by Monge, and later to Kantorovich and Dantzig during the birth of linear programming. The mathematical theory of OT has produced several important developments since the 1990s, crowned by Cédric Villani's Fields Medal in 2010. OT is now transitioning into more applied spheres, including recent applications to machine learning, because it can tackle challenging learning scenarios including dimensionality reduction, structured prediction problems that involve histograms, and estimation of generative models in highly degenerate, high-dimensional problems. This workshop follows the one organized three years ago (NIPS 2014) and will seek to amplify that trend. We will provide the audience with an update on the very recent successes brought forward by efficient solvers and innovative applications through a long list of invited talks. We will add to that a few contributed presentations (oral and, if needed, posters) and, finally, a panel for all invited speakers to take questions from the …

Maya Cakmak · Anna Rafferty · Adish Singla · Jerry Zhu · Sandra Zilles

[ Seaside Ballroom ]

This workshop focuses on “machine teaching”, the inverse problem of machine learning, in which the goal is to find an optimal training set given a machine learning algorithm and a target model. The study of machine teaching began in the early 1990s, primarily coming out of computational learning theory. Recently, there has been a surge of interest in machine teaching as several different communities within machine learning have found connections to this problem; these connections have included the following:

* machine teaching has close connections to newly introduced models of interaction in the machine learning community, such as curriculum learning, self-paced learning, and knowledge distillation. [Hinton et al. 2015; Bengio et al. 2009]

* there are strong theoretical connections between the Teaching-dimension (the sample complexity of teaching) and the VC-dimension (the sample complexity of learning from randomly chosen examples). [Doliwa et al. 2014]

* the machine teaching problem formulation has recently been studied in the context of diverse applications including personalized educational systems, cyber-security problems, robotics, program synthesis, human-in-the-loop systems, and crowdsourcing. [Jha et al. 2016; Zhu 2015; Mei & Zhu 2015; Ba & Caruana 2014; Patil et al. 2014; Singla et al. 2014; Cakmak & Thomaz 2014]
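To make the inverse-problem framing concrete, here is a toy sketch that assumes a one-dimensional threshold classifier. The learner, the candidate pool, and the function names (`learn_threshold`, `teach_threshold`) are all hypothetical and chosen only for illustration: the teacher knows the target threshold and the learner's rule, and picks the two pool points that bracket the target most tightly.

```python
def learn_threshold(examples):
    """Hypothetical learner: place the threshold midway between the
    largest negative example and the smallest positive example."""
    negatives = [x for x, label in examples if label == 0]
    positives = [x for x, label in examples if label == 1]
    return (max(negatives) + min(positives)) / 2.0


def teach_threshold(target, pool):
    """Hypothetical teacher: knowing the target and the learner's rule,
    choose the closest pool point on each side of the target."""
    left = max(x for x in pool if x < target)     # labeled 0
    right = min(x for x in pool if x >= target)   # labeled 1
    return [(left, 0), (right, 1)]


# Assumed setup: 101 evenly spaced candidate points in [0, 1].
pool = [i / 100 for i in range(101)]
target = 0.374

teaching_set = teach_threshold(target, pool)
learned = learn_threshold(teaching_set)
```

Two well-chosen examples pin the learned threshold to within half the pool spacing of the target, whereas a learner receiving randomly drawn examples would typically need many more to reach the same precision.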

In this workshop, we …

[ 102 C ]

This two-day workshop will bring together researchers from the various subdisciplines of Geometric Data Analysis (GDA), such as manifold learning, topological data analysis, and shape analysis. It will showcase recent progress in the field and establish directions for future research. The focus will be on high-dimensional and big data, and on mathematically founded methodology.

Specific aims

One aim of this workshop is to build connections between Topological Data Analysis on one side and Manifold Learning on the other. This is starting to happen, after years of more or less separate evolution of the two fields. The mathematical, statistical and algorithmic foundations of both areas are now mature enough that it is time to lay the foundations for a joint topological and differential geometric understanding of data, and this workshop will explicitly focus on this process.

The second aim is to bring GDA closer to real applications. We see the challenge of real problems and real data as a motivator for researchers to explore new research questions, to reframe and expand the existing theory, and to step out of their own sub-area. In particular, we want people in GDA to see Topological Data Analysis (TDA) and Manifold Learning (ML) as one.

The …

Alex Dimakis · Nikolaos Vasiloglou · Guy Van den Broeck · Alexander Ihler · Assaf Araki

[ 202 ]

Every year hundreds of papers are published at NIPS. Although the authors provide sound scientific descriptions and proofs of their ideas, there is no space for explaining all the tricks and details that make an implementation of the paper work. The effect and importance of tuning parameters is also rarely discussed, due to lack of space. The goal of this workshop is to help authors evangelize their paper to the industry and expose the participants to all the Machine Learning/Artificial Intelligence know-how that cannot be found in the papers.
Submissions
We encourage you to prepare a poster of your favorite paper that explains graphically and at a higher level the concepts and the ideas discussed in it. You should also submit a Jupyter notebook that explains in detail how equations in the paper translate to code. You are welcome to use any of the popular platforms like TensorFlow, Keras, MXNet, CNTK, etc.
For more information, visit https://www.mltrain.cc/

John Shawe-Taylor · Massimiliano Pontil · Nicolò Cesa-Bianchi · Emine Yilmaz · Chris Watkins · Sebastian Riedel · Marko Grobelnik

[ 103 C ]

Social Media and other online media sources play a critical role in distributing news and informing public opinion. Initially it seemed that democratising the dissemination of information and news with online media might be wholly good – but during the last year we have witnessed other perhaps less positive effects.
The algorithms that prioritise content for users aim to provide information that will be ‘liked’ by each user in order to retain their attention and interest. These algorithms are now well-tuned and are indeed able to match content to different users’ preferences. This has meant that users increasingly see content that aligns with their world view, confirms their beliefs, supports their opinions, and in short maintains their ‘information bubble’, creating the so-called echo chambers. As a result, views have often become more polarised rather than less, with people expressing genuine disbelief that fellow citizens could possibly countenance alternative opinions, be they pro- or anti-Brexit, pro- or anti-Trump. Perhaps the most extreme example is that of fake news, in which news is created in order to satisfy and reinforce certain beliefs.
This polarisation of views cannot be beneficial for society. As the success of Computer Science and more specifically Machine Learning have …

Benjamin Guedj · Pascal Germain · Francis Bach

[ 101 A ]

Industry-wide successes of machine learning at the dawn of the (so-called) big data era have led to an increasing gap between practitioners and theoreticians. The former are using off-the-shelf statistical and machine learning methods, while the latter are designing and studying the mathematical properties of such algorithms. The tradeoff between those two movements is somewhat addressed by Bayesian researchers, where sound mathematical guarantees often meet efficient implementation and provide model selection criteria. In the late 90s, a new paradigm emerged in the statistical learning community, used to derive probably approximately correct (PAC) bounds on Bayesian-flavored estimators. This PAC-Bayesian theory was pioneered by Shawe-Taylor and Williamson (1997), and McAllester (1998, 1999). It has been extensively formalized by Catoni (2004, 2007) and has triggered, slowly but surely, increasing research efforts over the last decades.

We believe it is time to situate the current PAC-Bayesian trends relative to other modern approaches in the (statistical) machine learning community. Indeed, we observe that, while the field grows on its own, it has drifted undesirably far from some related areas. Firstly, it seems to us that the relation to Bayesian methods has been forsaken in numerous works, despite the potential of PAC-Bayesian theory to bring …

Andrew G Barto · Doina Precup · Shie Mannor · Tom Schaul · Roy Fox · Carlos Florensa

[ Grand Ballroom A ]

Reinforcement Learning (RL) has become a powerful tool for tackling complex sequential decision-making problems. It has been shown to train agents to reach super-human capabilities in game-playing domains such as Go and Atari. RL can also learn advanced control policies in high-dimensional robotic systems. Nevertheless, current RL agents have considerable difficulties when facing sparse rewards, long planning horizons, and more generally a scarcity of useful supervision signals. Unfortunately, the most valuable control tasks are specified in terms of high-level instructions, implying sparse rewards when formulated as an RL problem. Internal spatio-temporal abstractions and memory structures can constrain the decision space, improving data efficiency in the face of scarcity, but are likewise challenging for a supervisor to teach.

Hierarchical Reinforcement Learning (HRL) is emerging as a key component for finding spatio-temporal abstractions and behavioral patterns that can guide the discovery of useful large-scale control architectures, both for deep-network representations and for analytic and optimal-control methods. HRL has the potential to accelerate planning and exploration by identifying skills that can reliably reach desirable future states. It can abstract away the details of low-level controllers to facilitate long-horizon planning and meta-learning in a high-level feature space. Hierarchical structures are modular and amenable to …

Ruben Martinez-Cantin · José Miguel Hernández-Lobato · Javier Gonzalez

[ S7 ]

Bayesian optimization (BO) is a recent subfield of machine learning comprising a collection of methodologies for the efficient optimization of expensive black-box functions. BO techniques work by fitting a model to black-box function data and then using the model's predictions to decide where to collect data next, so that the optimization problem can be solved using only a small number of function evaluations. The resulting methods are characterized by their high sample-efficiency when compared to alternative black-box optimization algorithms, enabling the solution of new challenging problems. For example, in recent years, BO has become a popular tool in the machine learning community for the excellent performance attained in the problem of hyperparameter tuning, with important results both in academia and industry. This success has made BO a crucial player in the current trend of “automatic machine learning”.
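The fit-then-decide loop described above can be sketched in a few lines. Everything specific here is an assumption made for illustration and not something the workshop prescribes: a zero-mean Gaussian process surrogate with an RBF kernel, a lower-confidence-bound rule for choosing the next point, a fixed candidate grid on [0, 1], and a cheap quadratic standing in for the expensive black-box function.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP surrogate."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    v = np.linalg.solve(k_tt, k_qt.T)
    var = 1.0 - np.sum(k_qt * v.T, axis=1)  # prior variance is 1 for RBF
    return mean, np.sqrt(np.clip(var, 0.0, None))


def bayes_opt(objective, n_init=3, n_iter=12, kappa=2.0, seed=0):
    """Minimize `objective` on [0, 1] using few function evaluations."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)          # assumed candidate set
    x = rng.uniform(0.0, 1.0, size=n_init)     # random initial design
    y = np.array([objective(v) for v in x])
    for _ in range(n_iter):
        # 1. Fit the surrogate model to the data collected so far.
        offset = y.mean()
        mean, std = gp_posterior(x, y - offset, grid)
        # 2. Use its predictions to decide where to evaluate next:
        #    low predicted value (exploit) or high uncertainty (explore).
        lcb = mean + offset - kappa * std
        x_next = grid[np.argmin(lcb)]
        # 3. Evaluate the black-box function there and add the result.
        x = np.append(x, x_next)
        y = np.append(y, objective(x_next))
    return x, y


def objective(v):
    # Hypothetical stand-in for an expensive black-box function.
    return (v - 0.3) ** 2


xs, ys = bayes_opt(objective)
best_x, best_y = xs[np.argmin(ys)], ys.min()
```

The total budget is `n_init + n_iter` evaluations of the objective, which is the quantity BO tries to keep small. Practical libraries use richer kernels, hyperparameter fitting, and acquisition functions such as expected improvement in place of the simple choices made here.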

As new BO methods have been developed, the area of applicability has been continuously expanding. While the problem of hyperparameter tuning permeates all disciplines, the field has moved towards more specific problems in science and engineering that require new advanced methodology. Today, Bayesian optimization is the most promising approach for accelerating and automating science and engineering. Therefore, we have chosen this year's theme …