In this work, we propose a multi-objective decision-making framework that accommodates different user preferences over objectives, where preferences are learned via policy comparisons. Our model consists of a known Markov decision process with a vector-valued reward function, with each user having an unknown preference vector that expresses the relative importance of each objective. The goal is to efficiently compute a near-optimal policy for a given user. We consider two user feedback models. We first address the case where a user is provided with two policies and returns their preferred policy as feedback. We then move to a different user feedback model, where a user is instead provided with two small weighted sets of representative trajectories and selects the preferred one. In both cases, we suggest an algorithm that finds a near-optimal policy for the user using a number of comparison queries that scales quasilinearly in the number of objectives.
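The policy-comparison feedback model can be illustrated with a small sketch. It assumes, for illustration only, that each candidate policy is summarized by a vector of expected per-objective returns and that the user prefers the policy with the larger inner product against a hidden preference vector. The names `compare` and `best_policy_by_comparisons` are hypothetical, and the naive linear scan over a finite candidate list shown here is not the algorithm from the paper, which is stated to need a number of queries quasilinear in the number of objectives.

```python
import numpy as np

def compare(w, value_a, value_b):
    """Simulated user: return 0 if policy A is preferred, 1 if policy B is.

    w is the hidden preference vector (an assumption of this sketch);
    value_a and value_b are the policies' vector-valued expected returns.
    """
    return 0 if float(np.dot(w, value_a)) >= float(np.dot(w, value_b)) else 1

def best_policy_by_comparisons(values, oracle):
    """Naive baseline: scan a finite candidate list, keeping the current winner.

    Uses len(values) - 1 comparison queries and never reads w directly.
    """
    best = 0
    for i in range(1, len(values)):
        if oracle(values[best], values[i]) == 1:
            best = i
    return best

# Hypothetical example with two objectives and three candidate policies.
hidden_w = np.array([0.7, 0.3])
candidates = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.6, 0.6])]
oracle = lambda a, b: compare(hidden_w, a, b)
chosen = best_policy_by_comparisons(candidates, oracle)
print("index of preferred policy:", chosen)
```

The learner only ever sees the oracle's answers, which mirrors the setting where the preference vector is unknown and must be inferred from comparisons.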
Author Information
Han Shao (Toyota Technological Institute at Chicago)
Lee Cohen (Toyota Technological Institute at Chicago)
Avrim Blum (Toyota Technological Institute at Chicago)
Yishay Mansour (Tel Aviv University / Google)
Aadirupa Saha (Apple)
Aadirupa Saha is a PhD student at the department of Computer Science and Automation (CSA), Indian Institute of Science (IISc), Bangalore and was a research intern at Google, Mountain View, CA (June-Sept, 2019). Her research interests broadly lie in the areas of Machine Learning, Statistical Learning Theory and Optimization. Her current research specifically focuses on decision making under uncertainty from sequential data, reinforcement learning, and preference based rank aggregation problems.
Matthew Walter (TTI-Chicago)
More from the Same Authors
-
2021 Spotlight: Excess Capacity and Backdoor Poisoning »
Naren Manoj · Avrim Blum -
2021 : One for One, or All for All: Equilibria and Optimality of Collaboration in Federated Learning »
Richard Phillips · Han Shao · Avrim Blum · Nika Haghtalab -
2021 : On classification of strategic agents who can both game and improve »
Saba Ahmadi · Hedyeh Beyhaghi · Avrim Blum · Keziah Naggita -
2021 : The Strategic Perceptron »
Saba Ahmadi · Hedyeh Beyhaghi · Avrim Blum · Keziah Naggita -
2022 : Distributed Online and Bandit Convex Optimization »
Kumar Kshitij Patel · Aadirupa Saha · Nati Srebro · Lingxiao Wang -
2022 : On Convexity and Linear Mode Connectivity in Neural Networks »
David Yunis · Kumar Kshitij Patel · Pedro Savarese · Gal Vardi · Jonathan Frankle · Matthew Walter · Karen Livescu · Michael Maire -
2022 : Finding Safe Zones of Markov Decision Processes Policies »
Lee Cohen · Yishay Mansour · Michal Moshkovitz -
2022 : Finding Safe Zones of Markov Decision Processes Policies »
Michal Moshkovitz · Lee Cohen · Yishay Mansour -
2022 : Certifiable Robustness Against Patch Attacks Using an ERM Oracle »
Kevin Stangl · Avrim Blum · Omar Montasser · Saba Ahmadi -
2023 : Dueling Optimization with a Monotone Adversary »
Avrim Blum · Meghal Gupta · Gene Li · Naren Manoj · Aadirupa Saha · Yuanyuan Yang -
2023 : On The Vulnerability of Fairness Constrained Learning to Malicious Noise »
Avrim Blum · Princewill Okoroafor · Aadirupa Saha · Kevin Stangl -
2023 : Subwords as Skills: Tokenization for Sparse-Reward Reinforcement Learning »
David Yunis · Justin Jung · Falcon Dai · Matthew Walter -
2023 Poster: Multiclass Boosting: Simple and Intuitive Weak Learning Criteria »
Nataly Brukhim · Amit Daniely · Yishay Mansour · Shay Moran -
2023 Poster: Finding Safe Zones of Markov Decision Processes Policies »
Lee Cohen · Yishay Mansour · Michal Moshkovitz -
2023 Poster: Black-Box Differential Privacy for Interactive ML »
Haim Kaplan · Yishay Mansour · Shay Moran · Kobbi Nissim · Uri Stemmer -
2023 Poster: Strategic Classification under Unknown Personalized Manipulation »
Han Shao · Avrim Blum · Omar Montasser -
2022 Panel: Panel 2C-2: Agreement-on-the-line: Predicting the… & A Theory of… »
Han Shao · Christina Baek -
2022 : Panel »
Meena Jagadeesan · Avrim Blum · Jon Kleinberg · Celestine Mendler-Dünner · Jennifer Wortman Vaughan · Chara Podimata -
2022 Poster: Boosting Barely Robust Learners: A New Perspective on Adversarial Robustness »
Avrim Blum · Omar Montasser · Greg Shakhnarovich · Hongyang Zhang -
2022 Poster: A Theory of PAC Learnability under Transformation Invariances »
Han Shao · Omar Montasser · Avrim Blum -
2021 Workshop: Learning in Presence of Strategic Behavior »
Omer Ben-Porat · Nika Haghtalab · Annie Liang · Yishay Mansour · David Parkes -
2021 : AI Driving Olympics + Q&A »
Andrea Censi · Liam Paull · Jacopo Tani · Emilio Frazzoli · Holger Caesar · Matthew Walter · Andrea Daniele · Sahika Genc · Sharada Mohanty -
2021 Poster: Excess Capacity and Backdoor Poisoning »
Naren Manoj · Avrim Blum -
2021 Poster: Dueling Bandits with Adversarial Sleeping »
Aadirupa Saha · Pierre Gaillard -
2021 Poster: Dueling Bandits with Team Comparisons »
Lee Cohen · Ulrike Schmidt-Kraepelin · Yishay Mansour -
2021 Poster: Optimal Algorithms for Stochastic Contextual Preference Bandits »
Aadirupa Saha -
2020 Workshop: Workshop on Dataset Curation and Security »
Nathalie Baracaldo · Yonatan Bisk · Avrim Blum · Michael Curry · John Dickerson · Micah Goldblum · Tom Goldstein · Bo Li · Avi Schwarzschild -
2020 Poster: Sample Complexity of Uniform Convergence for Multicalibration »
Eliran Shabat · Lee Cohen · Yishay Mansour -
2020 Session: Orals & Spotlights Track 24: Learning Theory »
Avrim Blum · Steve Hanneke -
2020 Poster: Prediction with Corrupted Expert Advice »
Idan Amir · Idan Attias · Tomer Koren · Yishay Mansour · Roi Livni -
2020 Poster: Online Learning with Primary and Secondary Losses »
Avrim Blum · Han Shao -
2020 Spotlight: Prediction with Corrupted Expert Advice »
Idan Amir · Idan Attias · Tomer Koren · Yishay Mansour · Roi Livni -
2020 Poster: Adversarially Robust Streaming Algorithms via Differential Privacy »
Avinatan Hassidim · Haim Kaplan · Yishay Mansour · Yossi Matias · Uri Stemmer -
2020 Poster: Private Learning of Halfspaces: Simplifying the Construction and Reducing the Sample Complexity »
Haim Kaplan · Yishay Mansour · Uri Stemmer · Eliad Tsfadia -
2020 Oral: Adversarially Robust Streaming Algorithms via Differential Privacy »
Avinatan Hassidim · Haim Kaplan · Yishay Mansour · Yossi Matias · Uri Stemmer -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 : Poster Spotlight 1 »
David Brandfonbrener · Joan Bruna · Tom Zahavy · Haim Kaplan · Yishay Mansour · Nikos Karampatziakis · John Langford · Paul Mineiro · Donghwan Lee · Niao He -
2019 : The AI Driving Olympics: An Accessible Robot Learning Benchmark »
Matthew Walter -
2019 Poster: Online Stochastic Shortest Path with Bandit Feedback and Unknown Transition Function »
Aviv Rosenberg · Yishay Mansour -
2019 Poster: Graph-based Discriminators: Sample Complexity and Expressiveness »
Roi Livni · Yishay Mansour -
2019 Spotlight: Graph-based Discriminators: Sample Complexity and Expressiveness »
Roi Livni · Yishay Mansour -
2019 Poster: Learning to Screen »
Alon Cohen · Avinatan Hassidim · Haim Kaplan · Yishay Mansour · Shay Moran -
2019 Poster: Individual Regret in Cooperative Nonstochastic Multi-Armed Bandits »
Yogev Bar-On · Yishay Mansour -
2019 Poster: Combinatorial Bandits with Relative Feedback »
Aadirupa Saha · Aditya Gopalan -
2019 Poster: Maximum Expected Hitting Cost of a Markov Decision Process and Informativeness of Rewards »
Falcon Dai · Matthew Walter -
2018 Poster: On preserving non-discrimination when combining expert advice »
Avrim Blum · Suriya Gunasekar · Thodoris Lykouris · Nati Srebro -
2017 Poster: Collaborative PAC Learning »
Avrim Blum · Nika Haghtalab · Ariel Procaccia · Mingda Qiao -
2016 : Robust Learning and Inference »
Yishay Mansour -
2016 Poster: Online Pricing with Strategic and Patient Buyers »
Michal Feldman · Tomer Koren · Roi Livni · Yishay Mansour · Aviv Zohar -
2015 : Listen, Attend and Walk: Neural Mapping of Navigational Instructions to Action Sequences »
Matthew Walter -
2014 Poster: Learning Optimal Commitment to Overcome Insecurity »
Avrim Blum · Nika Haghtalab · Ariel Procaccia -
2014 Poster: Learning Mixtures of Ranking Models »
Pranjal Awasthi · Avrim Blum · Or Sheffet · Aravindan Vijayaraghavan -
2014 Poster: Active Learning and Best-Response Dynamics »
Maria-Florina F Balcan · Christopher Berlind · Avrim Blum · Emma Cohen · Kaushik Patnaik · Le Song -
2014 Spotlight: Learning Mixtures of Ranking Models »
Pranjal Awasthi · Avrim Blum · Or Sheffet · Aravindan Vijayaraghavan -
2013 Poster: From Bandits to Experts: A Tale of Domination and Independence »
Noga Alon · Nicolò Cesa-Bianchi · Claudio Gentile · Yishay Mansour -
2013 Oral: From Bandits to Experts: A Tale of Domination and Independence »
Noga Alon · Nicolò Cesa-Bianchi · Claudio Gentile · Yishay Mansour -
2010 Spotlight: Trading off Mistakes and Don't-Know Predictions »
Amin Sayedi · Avrim Blum · Morteza Zadimoghaddam -
2010 Poster: Trading off Mistakes and Don't-Know Predictions »
Amin Sayedi · Morteza Zadimoghaddam · Avrim Blum -
2009 Workshop: Clustering: Science or art? Towards principled approaches »
Margareta Ackerman · Shai Ben-David · Avrim Blum · Isabelle Guyon · Ulrike von Luxburg · Robert Williamson · Reza Zadeh -
2009 Poster: Tracking Dynamic Sources of Malicious Activity at Internet Scale »
Shobha Venkataraman · Avrim Blum · Dawn Song · Subhabrata Sen · Oliver Spatscheck -
2009 Spotlight: Tracking Dynamic Sources of Malicious Activity at Internet Scale »
Shobha Venkataraman · Avrim Blum · Dawn Song · Subhabrata Sen · Oliver Spatscheck -
2008 Workshop: New Challenges in Theoretical Machine Learning: Data Dependent Concept Spaces »
Maria-Florina F Balcan · Shai Ben-David · Avrim Blum · Kristiaan Pelckmans · John Shawe-Taylor