Interacting with increasingly sophisticated decision-making systems is becoming more and more a part of our daily lives. This places an immense responsibility on the designers of these systems to build them in a way that guarantees safe interaction with their users and good performance in the presence of noise, changes in the environment, model misspecification, and uncertainty. Any progress in this area will be a major step toward deploying decision-making algorithms in emerging high-stakes applications such as autonomous driving, robotics, power systems, health care, recommendation systems, and finance.
This workshop aims to bring together researchers from academia and industry to discuss the main challenges, describe recent advances, and highlight future research directions pertaining to the development of safe and robust decision-making systems. We aim to highlight new and emerging theoretical and applied research opportunities for the community that arise from the evolving needs for decision-making systems and algorithms that guarantee safe interaction and good performance under a wide range of uncertainties in the environment.
Fri 8:00 a.m. - 8:15 a.m. | Opening Remarks (Opening Presentation)
Fri 8:15 a.m. - 8:55 a.m. | Aviv Tamar: Visual Plan Imagination - An Interpretable Robot Learning Framework (Invited Talk)

How can we build autonomous robots that operate in unstructured and dynamic environments such as homes or hospitals? This problem has been investigated in several disciplines, including planning (motion planning, task planning, etc.) and reinforcement learning. While both of these fields have witnessed tremendous progress, each has fundamental drawbacks: planning approaches require substantial manual engineering to map perception to a formal planning problem, while RL, which can operate directly on raw percepts, is data hungry, cannot generalize to new tasks, and is ‘black box’ in nature. Motivated by humans’ remarkable capability to imagine and plan complex manipulations of objects, and by recent advances in image generation such as GANs, we present Visual Plan Imagination (VPI) — a new computational problem that combines image imagination and planning. In VPI, given off-policy image data from a dynamical system, the task is to ‘imagine’ image sequences that transition the system from start to goal. Thus, VPI focuses on the essence of planning with high-dimensional perception, and abstracts away low-level control and reward engineering. More importantly, VPI provides a safe and interpretable basis for robotic control — before the robot acts, a human inspects the imagined plan the robot will act upon, and can intervene if necessary. I will describe our approach to VPI based on Causal InfoGAN, a deep generative model that learns features that are compatible with strong planning algorithms. We show that Causal InfoGAN can generate convincing visual plans, and we demonstrate learning to imagine and execute real robot rope manipulation from image data. I will also discuss our VPI simulation benchmarks, and recent efforts in novelty detection, an important component of VPI and of safe decision making in general.
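As a loose illustration of the VPI setup (not the talk's actual Causal InfoGAN architecture), here is a toy sketch of the encode-plan-decode loop. Everything below, including the random linear "encoder" and the interpolation "planner", is a stand-in assumption purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for a trained generative model: Causal InfoGAN would *learn*
# an encoder/decoder whose latent space supports planning; here they are
# fixed random linear maps purely for illustration.
IMG_SHAPE, LATENT_DIM = (64, 64), 8
W = rng.standard_normal((LATENT_DIM, IMG_SHAPE[0] * IMG_SHAPE[1]))

def encode(image):
    """Image -> low-dimensional latent state."""
    return W @ image.ravel()

def decode(z):
    """Latent state -> image (least-squares pseudo-inverse of the toy encoder)."""
    return (np.linalg.pinv(W) @ z).reshape(IMG_SHAPE)

def imagine_plan(start_img, goal_img, n_steps=8):
    """'Imagine' an image sequence from start to goal by planning in latent
    space. Planning here is plain linear interpolation; a real VPI system
    searches for a path consistent with the learned latent dynamics."""
    z0, zg = encode(start_img), encode(goal_img)
    return [decode((1 - a) * z0 + a * zg) for a in np.linspace(0.0, 1.0, n_steps)]

plan = imagine_plan(rng.random(IMG_SHAPE), rng.random(IMG_SHAPE))
print(len(plan), plan[0].shape)  # 8 (64, 64): a visual plan a human can inspect
```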
Fri 8:55 a.m. - 9:35 a.m. | Daniel Kuhn: From Data to Decisions: Distributionally Robust Optimization is Optimal (Invited Talk)

We study stochastic optimization problems where the decision-maker cannot observe the distribution of the exogenous uncertainties but has access to a finite set of independent training samples. In this setting, the goal is to find a procedure that transforms the data to an estimate of the expected cost function under the unknown data-generating distribution, i.e., a predictor, and an optimizer of the estimated cost function that serves as a near-optimal candidate decision, i.e., a prescriptor. As functions of the data, predictors and prescriptors constitute statistical estimators. We propose a meta-optimization problem to find the least conservative predictors and prescriptors subject to constraints on their out-of-sample disappointment. The out-of-sample disappointment quantifies the probability that the actual expected cost of the candidate decision under the unknown true distribution exceeds its predicted cost. Leveraging tools from large deviations theory, we prove that this meta-optimization problem admits a unique solution: The best predictor-prescriptor pair is obtained by solving a distributionally robust optimization problem over all distributions within a given relative entropy distance from the empirical distribution of the data.
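As a hedged reconstruction of the headline result (the notation is mine, and the abstract does not fix the direction of the relative entropy, so take that as an assumption), the optimal predictor-prescriptor pair takes the form

$$
\widehat{c}_N(x) \;=\; \sup_{Q} \Big\{\, \mathbb{E}_Q\big[\ell(x,\xi)\big] \;:\; \mathrm{D}\big(\widehat{P}_N \,\Vert\, Q\big) \le r \,\Big\},
\qquad
\widehat{x}_N \in \operatorname*{arg\,min}_{x \in X} \widehat{c}_N(x),
$$

where $\widehat{P}_N$ is the empirical distribution of the $N$ samples and the radius $r$ is calibrated, via large deviations theory, to the admissible exponential decay rate of the out-of-sample disappointment.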
Fri 9:35 a.m. - 10:30 a.m. | Poster Session (Posters)

Ahana Ghosh · Javad Shafiee · Akhilan Boopathy · Alex Tamkin · Theodoros Vasiloudis · Vedant Nanda · Ali Baheri · Paul Fieguth · Andrew Bennett · Guanya Shi · Hao Liu · Arushi Jain · Jacob Tyo · Benjie Wang · Boxiao Chen · Carroll Wainwright · Chandramouli Shama Sastry · Chao Tang · Daniel S. Brown · David Inouye · David Venuto · Dhruv Ramani · Dimitrios Diochnos · Divyam Madaan · Dmitrii Krashenikov · Joel Oren · Doyup Lee · Eleanor Quint · elmira amirloo · Matteo Pirotta · Gavin Hartnett · Geoffroy Dubourg-Felonneau · Gokul Swamy · Pin-Yu Chen · Ilija Bogunovic · Jason Carter · Javier Garcia-Barcos · Jeet Mohapatra · Jesse Zhang · Jian Qian · John Martin · Oliver Richter · Federico Zaiter · Tsui-Wei Weng · Karthik Abinav Sankararaman · Kyriakos Polymenakos · Lan Hoang · mahdieh abbasi · Marco Gallieri · Mathieu Seurin · Matteo Papini · Matteo Turchetta · Matthew Sotoudeh · Mehrdad Hosseinzadeh · Nathan Fulton · Masatoshi Uehara · Niranjani Prasad · Oana-Maria Camburu · Patrik Kolaric · Philipp Renz · Prateek Jaiswal · Reazul Hasan Russel · Riashat Islam · Rishabh Agarwal · Alexander Aldrick · Sachin Vernekar · Sahin Lale · Sai Kiran Narayanaswami · Samuel Daulton · Sanjam Garg · Sebastian East · Shun Zhang · Soheil Dsidbari · Justin Goodwin · Victoria Krakovna · Wenhao Luo · Wesley Chung · Yuanyuan Shi · Yuh-Shyang Wang · Hongwei Jin · Ziping Xu
Fri 10:30 a.m. - 11:10 a.m. | Finale Doshi-Velez: Combining Statistical Methods with Human Input for Evaluation and Optimization in Batch Settings (Invited Talk)

Statistical methods for off-policy evaluation and counterfactual reasoning will have fundamental limitations based on what assumptions can be made and what kind of exploration is present in the data (some of which is being presented here by other speakers!). In this talk, I'll discuss some recent directions in our lab regarding ways to integrate human experts into the process of policy evaluation and selection in batch settings. The first deals with statistical limitations by seeking a diverse collection of statistically-indistinguishable (with respect to outcome) policies for humans to eventually decide from. The second involves directly integrating human feedback to eliminate or validate specific sources of sensitivity in an off-policy evaluation to get more robust estimates (or at least better understand the source of their non-robustness). More broadly, I will discuss open directions for moving from purely-statistical (e.g. off-policy evaluation) or purely-human (e.g. interpretability-based) approaches for robust/safe decision-making toward combining the advantages of both.
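For readers unfamiliar with the purely statistical baseline the talk starts from, here is a minimal sketch of per-trajectory importance-sampling OPE; the toy policies and logged data are illustrative assumptions, not material from the talk:

```python
import numpy as np

def importance_sampling_ope(trajectories, pi_e, pi_b):
    """Per-trajectory importance-sampling estimate of pi_e's value from data
    logged under pi_b. Each trajectory is a list of (state, action, reward);
    pi_e(a, s) and pi_b(a, s) return action probabilities."""
    estimates = []
    for traj in trajectories:
        weight = np.prod([pi_e(a, s) / pi_b(a, s) for s, a, _ in traj])
        estimates.append(weight * sum(r for _, _, r in traj))
    estimates = np.asarray(estimates)
    return estimates.mean(), estimates.std(ddof=1) / np.sqrt(len(estimates))

# Toy one-step problem: behavior policy is uniform over {0, 1}, reward = action.
rng = np.random.default_rng(1)
pi_b = lambda a, s: 0.5
pi_e = lambda a, s: 0.8 if a == 1 else 0.2
trajs = []
for _ in range(2000):
    a = int(rng.random() < 0.5)        # uniform behavior action
    trajs.append([(0, a, float(a))])   # single step, reward equals action
mean, se = importance_sampling_ope(trajs, pi_e, pi_b)
print(round(mean, 2), "+/-", round(se, 2))  # approx 0.8 = E_{pi_e}[reward]
```

The per-trajectory weights multiply across time steps, which is exactly why variance explodes with horizon: the "curse of horizon" discussed in Nathan Kallus's talk below.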
Fri 11:10 a.m. - 11:50 a.m. | Marco Pavone: On Safe and Efficient Human-robot Interactions via Multi-modal Intent Modeling and Reachability-based Safety Assurance (Invited Talk)

In this talk I will present a decision-making and control stack for human-robot interactions, using autonomous driving as a motivating example. Specifically, I will first discuss a data-driven approach for learning multimodal interaction dynamics between robot-driven and human-driven vehicles based on recent advances in deep generative modeling. Then, I will discuss how to incorporate such a learned interaction model into a real-time, interaction-aware decision-making framework. The framework is designed to be minimally interventional; in particular, by leveraging backward reachability analysis, it ensures safety even when other cars defy the robot's expectations, without unduly sacrificing performance. I will present recent results from experiments on a full-scale steer-by-wire platform, validating the framework and providing practical insights. I will conclude the talk by providing an overview of related efforts from my group on infusing safety assurances in robot autonomy stacks equipped with learning-based components, with an emphasis on adding structure within robot learning via control-theoretic and formal methods.
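The "minimally interventional" idea can be sketched as a safety filter: defer to the learned policy unless the predicted next state enters a precomputed backward reachable set of the unsafe states, in which case a fallback controller takes over. The dynamics, sets, and controllers below are toy assumptions, not the talk's actual stack:

```python
def safety_filter(state, nominal_action, step, in_unsafe_brs, safe_action):
    """Minimally interventional safety filter (sketch).

    step(state, action)   -> predicted next state (assumed dynamics model)
    in_unsafe_brs(state)  -> True if state lies in the backward reachable set
                             of the unsafe states (assumed precomputed offline)
    safe_action(state)    -> fallback control with a safety guarantee
    """
    if in_unsafe_brs(step(state, nominal_action)):
        return safe_action(state)   # intervene only when necessary
    return nominal_action           # otherwise defer to the learned policy

# Toy 1-D example: position must stay below 1.0.
step = lambda x, u: x + 0.1 * u
in_unsafe_brs = lambda x: x >= 0.9          # one step away from the unsafe set x >= 1.0
safe_action = lambda x: -1.0                # brake
print(safety_filter(0.85, +1.0, step, in_unsafe_brs, safe_action))  # -1.0: overridden
print(safety_filter(0.20, +1.0, step, in_unsafe_brs, safe_action))  # +1.0: nominal kept
```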
Fri 11:50 a.m. - 12:30 p.m. | Dimitar Filev: Practical Approaches to Driving Policy Design for Autonomous Vehicles (Invited Talk)

The presentation deals with practical facets of applying AI methods to designing driving policies for autonomous vehicles. The relationship between reinforcement learning (RL) based solutions and the use of rule-based and model-based techniques for improving their robustness and safety is discussed. An approach to obtaining explainable RL models by learning alternative rule-based representations is proposed. The presentation also elaborates on opportunities for extending AI driving policy approaches by applying game-theory-inspired methodology to address diverse and unforeseen scenarios and to represent the negotiation aspects of decision making in autonomous driving.
Fri 12:30 p.m. - 2:00 p.m. | Lunch Break (Lunch)
Fri 2:00 p.m. - 2:40 p.m. | Nathan Kallus: Efficiently Breaking the Curse of Horizon with Double Reinforcement Learning (Invited Talk)

Off-policy evaluation (OPE) is crucial for reinforcement learning in domains like medicine with limited exploration, but OPE is also notoriously difficult because the similarity between trajectories generated by any proposed policy and the observed data diminishes exponentially as horizons grow, known as the curse of horizon. To understand precisely when this curse bites, we consider for the first time the semi-parametric efficiency limits of OPE in Markov decision processes (MDPs), establishing the best-possible estimation errors and characterizing the curse as a problem-dependent phenomenon rather than a method-dependent one. Efficiency in OPE is crucial because, without exploration, we must use the available data to its fullest. In finite horizons, this shows that standard doubly robust (DR) estimators are in fact inefficient for MDPs. In infinite horizons, while the curse renders certain problems fundamentally intractable, OPE may be feasible in ergodic time-invariant MDPs. We develop the first OPE estimator that achieves the efficiency limits in both settings, termed Double Reinforcement Learning (DRL). In both finite and infinite horizons, DRL improves upon existing estimators, which we show are inefficient, and leverages problem structure to its fullest in the face of the curse of horizon. We establish many favorable characteristics of DRL, including efficiency even when nuisances are estimated slowly by black-box models, finite-sample guarantees, and model double robustness.
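For reference, the standard finite-horizon doubly robust estimator that the talk identifies as inefficient for MDPs combines cumulative importance ratios with estimated $q$-functions; the notation below follows the common form in the OPE literature rather than the talk's own slides:

$$
\widehat{V}_{\mathrm{DR}}
= \frac{1}{n} \sum_{i=1}^{n} \sum_{t=0}^{T} \gamma^{t}
\Big[ \rho^{i}_{0:t}\big(r^{i}_{t} - \widehat{q}_{t}(s^{i}_{t}, a^{i}_{t})\big)
+ \rho^{i}_{0:t-1}\, \widehat{v}_{t}(s^{i}_{t}) \Big],
\qquad
\rho^{i}_{0:t} = \prod_{k=0}^{t} \frac{\pi_e(a^{i}_{k} \mid s^{i}_{k})}{\pi_b(a^{i}_{k} \mid s^{i}_{k})},
$$

with $\rho^{i}_{0:-1} = 1$ and $\widehat{v}_{t}(s) = \mathbb{E}_{a \sim \pi_e(\cdot \mid s)}[\widehat{q}_{t}(s, a)]$. The products $\rho_{0:t}$ are what grow exponentially in the horizon; roughly speaking, DRL avoids them by working with marginal state-action density ratios in the Markov setting.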
Fri 2:40 p.m. - 3:20 p.m. | Scott Niekum: Scaling Probabilistically Safe Learning to Robotics (Invited Talk)

In recent years, high-confidence reinforcement learning algorithms have enjoyed success in application areas with high-quality models and plentiful data, but robotics remains a challenging domain for scaling up such approaches. Furthermore, very little work has been done on the even more difficult problem of safe imitation learning, in which the demonstrator's reward function is not known. This talk focuses on three recent developments in this emerging area of research: (1) a theory of safe imitation learning; (2) scalable reward inference in the absence of models; (3) efficient off-policy policy evaluation. The proposed algorithms offer a blend of safety and practicality, making a significant step towards safe robot learning with modest amounts of real-world data.
Fri 3:20 p.m. - 4:30 p.m. | Poster Session and Coffee Break
Fri 4:30 p.m. - 5:10 p.m. | Andy Sun: Recent Advances in Multistage Decision-making under Uncertainty: New Algorithms and Complexity Analysis (Invited Talk)

In this talk, we will review some recent advances in the area of multistage decision making under uncertainty, especially in the domains of stochastic and robust optimization. We will present new algorithmic developments that allow for exactly solving huge-scale stochastic programs with integer recourse decisions, as well as algorithms in a dual perspective that can deal with infeasibility. This significantly extends the scope of stochastic dual dynamic programming (SDDP) algorithms from convex or binary-state-variable cases to general nonconvex problems. We will also present a new analysis of the iteration complexity of the proposed algorithms, which settles some open questions regarding the complexity of SDDP.
Fri 5:10 p.m. - 5:50 p.m. | Thorsten Joachims: Fair Ranking with Biased Data (Invited Talk)

Search engines and recommender systems have become the dominant matchmakers for a wide range of human endeavors -- from online retail to finding romantic partners. Consequently, they carry immense power in shaping markets and allocating opportunity to the participants. In this talk, I will discuss how the machine learning algorithms underlying these systems can produce unfair ranking policies for both exogenous and endogenous reasons. Exogenous reasons often manifest themselves as biases in the training data, which then get reflected in the learned ranking policy and lead to rich-get-richer dynamics. But even when trained with unbiased data, reasons endogenous to the algorithms can lead to unfair allocation of opportunity. To overcome these challenges, I will present new machine learning algorithms that directly address both endogenous and exogenous unfairness.
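One standard tool on the exogenous side is inverse propensity scoring of click logs: clicks at low-ranked positions are up-weighted by the probability that the position was examined at all, giving unbiased relevance estimates under the position-bias model. A minimal sketch follows, with made-up propensities and log data (the abstract does not commit to this particular estimator):

```python
def ips_relevance_estimates(click_log, propensity):
    """Unbiased relevance estimates from position-biased clicks under the
    position-bias model.

    click_log: list of (doc_id, position, clicked) from logged rankings.
    propensity[k]: assumed-known probability that position k is examined.
    """
    totals, counts = {}, {}
    for doc, pos, clicked in click_log:
        counts[doc] = counts.get(doc, 0) + 1
        totals[doc] = totals.get(doc, 0.0) + clicked / propensity[pos]  # IPS weight
    return {doc: totals[doc] / counts[doc] for doc in totals}

# Made-up example: examination probability decays with rank.
propensity = {0: 1.0, 1: 0.5, 2: 0.25}
log = [("a", 0, 1), ("a", 0, 0), ("b", 2, 1), ("b", 2, 0), ("b", 2, 0)]
print(ips_relevance_estimates(log, propensity))
# Doc "b"'s rare clicks at position 2 are up-weighted by 1/0.25, countering
# the rich-get-richer dynamics of raw click counts.
```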
Fri 5:50 p.m. - 6:00 p.m. | Concluding Remarks (Remarks)
Author Information
Mohammad Ghavamzadeh (Facebook AI Research)
Shie Mannor (Technion)
Yisong Yue (Caltech)
Marek Petrik (University of New Hampshire)
Yinlam Chow (Google Research)