Many reinforcement learning (RL) applications have combinatorial action spaces, where each action is a composition of sub-actions. A standard RL approach ignores this inherent factorization structure, resulting in a potential failure to make meaningful inferences about rarely observed sub-action combinations; this is particularly problematic for offline settings, where data may be limited. In this work, we propose a form of linear Q-function decomposition induced by factored action spaces. We study the theoretical properties of our approach, identifying scenarios where it is guaranteed to lead to zero bias when used to approximate the Q-function. Outside the regimes with theoretical guarantees, we show that our approach can still be useful because it leads to better sample efficiency without necessarily sacrificing policy optimality, allowing us to achieve a better bias-variance trade-off. Across several offline RL problems using simulators and real-world datasets motivated by healthcare, we demonstrate that incorporating factored action spaces into value-based RL can result in better-performing policies. Our approach can help an agent make more accurate inferences within underexplored regions of the state-action space when applying RL to observational datasets.
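The linear decomposition described above can be illustrated with a small sketch. It is not the authors' implementation. It assumes a single fixed state, two binary sub-actions, and a hypothetical Q-function that is exactly additive across sub-actions, chosen here as one simple case in which an additive model has zero bias. The helper names (`one_hot_features`, `fit_factored_q`, `predict_q`) and all numeric values are made up for illustration.

```python
import itertools
import numpy as np


def one_hot_features(action, sizes):
    """Concatenate one-hot encodings of each sub-action in `action`."""
    feats = []
    for a_d, n_d in zip(action, sizes):
        v = np.zeros(n_d)
        v[a_d] = 1.0
        feats.append(v)
    return np.concatenate(feats)


def fit_factored_q(actions, targets, sizes):
    """Least-squares fit of Q(a) ~ sum_d q_d(a_d) for one fixed state."""
    X = np.stack([one_hot_features(a, sizes) for a in actions])
    y = np.asarray(targets, dtype=float)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def predict_q(w, action, sizes):
    """Evaluate the fitted additive Q-function on a sub-action combination."""
    return float(one_hot_features(action, sizes) @ w)


# Assumption: two binary sub-actions and a hypothetical, exactly additive Q.
sizes = (2, 2)
q1 = [0.0, 1.0]  # made-up contribution of sub-action 1
q2 = [0.0, 2.0]  # made-up contribution of sub-action 2
true_q = {a: q1[a[0]] + q2[a[1]] for a in itertools.product(range(2), range(2))}

# The combination (1, 1) never appears in the "offline" data.
observed = [(0, 0), (0, 1), (1, 0)]
w = fit_factored_q(observed, [true_q[a] for a in observed], sizes)

# Close to true_q[(1, 1)] = 3.0, up to floating-point error.
estimate = predict_q(w, (1, 1), sizes)
print(estimate, true_q[(1, 1)])
```

A model that treats each of the four combinations as an unrelated action would have no data from which to estimate the value of `(1, 1)`. The additive model shares parameters across combinations, which is the source of the sample-efficiency gain the abstract describes. When the true Q-function is not additive, the same fit would generally be biased.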
Author Information
Shengpu Tang (University of Michigan)
Maggie Makar (University of Michigan)
Michael Sjoding (University of Michigan - Ann Arbor)
Finale Doshi-Velez (Harvard)
Jenna Wiens (University of Michigan)
More from the Same Authors
2021 Spotlight: Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Making by Reinforcement Learning »
Kai Wang · Sanket Shah · Haipeng Chen · Andrew Perrault · Finale Doshi-Velez · Milind Tambe -
2021 : Identification of Subgroups With Similar Benefits in Off-Policy Policy Evaluation »
Ramtin Keramati · Omer Gottesman · Leo Celi · Finale Doshi-Velez · Emma Brunskill -
2021 : Sequential Decision Making with Limited Resources »
Hallee Wong · Maggie Makar · Aniruddh Raghu · John Guttag -
2022 : An Empirical Analysis of the Advantages of Finite vs. Infinite Width Bayesian Neural Networks »
Jiayu Yao · Yaniv Yacoby · Beau Coker · Weiwei Pan · Finale Doshi-Velez -
2022 : Feature-Level Synthesis of Human and ML Insights »
Isaac Lage · Sonali Parbhoo · Finale Doshi-Velez -
2022 : What Makes a Good Explanation?: A Unified View of Properties of Interpretable ML »
Varshini Subhash · Zixi Chen · Marton Havasi · Weiwei Pan · Finale Doshi-Velez -
2022 : Towards Data-Driven Offline Simulations for Online Reinforcement Learning »
Shengpu Tang · Felipe Vieira Frujeri · Dipendra Misra · Alex Lamb · John Langford · Paul Mineiro · Sebastian Kochman -
2022 : What Makes a Good Explanation?: A Unified View of Properties of Interpretable ML »
Zixi Chen · Varshini Subhash · Marton Havasi · Weiwei Pan · Finale Doshi-Velez -
2022 : Conditional differential measurement error: partial identifiability and estimation »
Pengrun Huang · Maggie Makar -
2022 : (When) Are Contrastive Explanations of Reinforcement Learning Helpful? »
Sanjana Narayanan · Isaac Lage · Finale Doshi-Velez -
2022 : Leveraging Human Features at Test-Time »
Isaac Lage · Sonali Parbhoo · Finale Doshi-Velez -
2022 : An Empirical Analysis of the Advantages of Finite vs. Infinite Width Bayesian Neural Networks »
Jiayu Yao · Yaniv Yacoby · Beau Coker · Weiwei Pan · Finale Doshi-Velez -
2022 Panel: Panel 5B-3: Leveraging Factored Action… & Skills Regularized Task… »
Minjong Yoo · Shengpu Tang -
2022 : What Makes a Good Explanation?: A Unified View of Properties of Interpretable ML »
Varshini Subhash · Zixi Chen · Marton Havasi · Weiwei Pan · Finale Doshi-Velez -
2022 Poster: Learning Concept Credible Models for Mitigating Shortcuts »
Jiaxuan Wang · Sarah Jabbour · Maggie Makar · Michael Sjoding · Jenna Wiens -
2022 Poster: Addressing Leakage in Concept Bottleneck Models »
Marton Havasi · Sonali Parbhoo · Finale Doshi-Velez -
2022 Poster: Causally motivated multi-shortcut identification and removal »
Jiayun Zheng · Maggie Makar -
2021 : Retrospective Panel »
Sergey Levine · Nando de Freitas · Emma Brunskill · Finale Doshi-Velez · Nan Jiang · Rishabh Agarwal -
2021 : LAF | Panel discussion »
Aaron Snoswell · Jake Goldenfein · Finale Doshi-Velez · Evi Micha · Ivana Dusparic · Jonathan Stray -
2021 : LAF | The Role of Explanation in RL Legitimacy, Accountability, and Feedback »
Finale Doshi-Velez -
2021 : Invited talk #2: Finale Doshi-Velez »
Finale Doshi-Velez -
2021 Poster: Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Making by Reinforcement Learning »
Kai Wang · Sanket Shah · Haipeng Chen · Andrew Perrault · Finale Doshi-Velez · Milind Tambe -
2020 : Batch RL Models Built for Validation »
Finale Doshi-Velez -
2020 : Panel »
Emma Brunskill · Nan Jiang · Nando de Freitas · Finale Doshi-Velez · Sergey Levine · John Langford · Lihong Li · George Tucker · Rishabh Agarwal · Aviral Kumar -
2020 : Q & A and Panel Session with Tom Mitchell, Jenn Wortman Vaughan, Sanjoy Dasgupta, and Finale Doshi-Velez »
Tom Mitchell · Jennifer Wortman Vaughan · Sanjoy Dasgupta · Finale Doshi-Velez · Zachary Lipton -
2020 Workshop: I Can’t Believe It’s Not Better! Bridging the gap between theory and empiricism in probabilistic machine learning »
Jessica Forde · Francisco Ruiz · Melanie Fernandez Pradier · Aaron Schein · Finale Doshi-Velez · Isabel Valera · David Blei · Hanna Wallach -
2020 Poster: Incorporating Interpretable Output Constraints in Bayesian Neural Networks »
Wanqian Yang · Lars Lorch · Moritz Graule · Himabindu Lakkaraju · Finale Doshi-Velez -
2020 Spotlight: Incorporating Interpretable Output Constraints in Bayesian Neural Networks »
Wanqian Yang · Lars Lorch · Moritz Graule · Himabindu Lakkaraju · Finale Doshi-Velez -
2020 Poster: Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs »
Jianzhun Du · Joseph Futoma · Finale Doshi-Velez -
2020 : Discussion Panel: Hugo Larochelle, Finale Doshi-Velez, Devi Parikh, Marc Deisenroth, Julien Mairal, Katja Hofmann, Phillip Isola, and Michael Bowling »
Hugo Larochelle · Finale Doshi-Velez · Marc Deisenroth · Devi Parikh · Julien Mairal · Katja Hofmann · Phillip Isola · Michael Bowling -
2019 : Panel - The Role of Communication at Large: Aparna Lakshmiratan, Jason Yosinski, Been Kim, Surya Ganguli, Finale Doshi-Velez »
Aparna Lakshmiratan · Finale Doshi-Velez · Surya Ganguli · Zachary Lipton · Michela Paganini · Anima Anandkumar · Jason Yosinski -
2019 : Invited talk #4 »
Finale Doshi-Velez -
2019 : Finale Doshi-Velez: Combining Statistical methods with Human Input for Evaluation and Optimization in Batch Settings »
Finale Doshi-Velez -
2018 : Finale Doshi-Velez »
Finale Doshi-Velez -
2018 : Panel on research process »
Zachary Lipton · Charles Sutton · Finale Doshi-Velez · Hanna Wallach · Suchi Saria · Rich Caruana · Thomas Rainforth -
2018 : Finale Doshi-Velez »
Finale Doshi-Velez -
2018 Poster: Human-in-the-Loop Interpretability Prior »
Isaac Lage · Andrew Ross · Samuel J Gershman · Been Kim · Finale Doshi-Velez -
2018 Spotlight: Human-in-the-Loop Interpretability Prior »
Isaac Lage · Andrew Ross · Samuel J Gershman · Been Kim · Finale Doshi-Velez -
2018 Poster: Representation Balancing MDPs for Off-policy Policy Evaluation »
Yao Liu · Omer Gottesman · Aniruddh Raghu · Matthieu Komorowski · Aldo Faisal · Finale Doshi-Velez · Emma Brunskill -
2017 : Panel Session »
Neil Lawrence · Finale Doshi-Velez · Zoubin Ghahramani · Yann LeCun · Max Welling · Yee Whye Teh · Ole Winther -
2017 : Finale Doshi-Velez »
Finale Doshi-Velez -
2017 : Automatic Model Selection in BNNs with Horseshoe Priors »
Finale Doshi-Velez -
2017 : Coffee break and Poster Session I »
Nishith Khandwala · Steve Gallant · Gregory Way · Aniruddh Raghu · Li Shen · Aydan Gasimova · Alican Bozkurt · William Boag · Daniel Lopez-Martinez · Ulrich Bodenhofer · Samaneh Nasiri GhoshehBolagh · Michelle Guo · Christoph Kurz · Kirubin Pillay · Kimis Perros · George H Chen · Alexandre Yahi · Madhumita Sushil · Sanjay Purushotham · Elena Tutubalina · Tejpal Virdi · Marc-Andre Schulz · Samuel Weisenthal · Bharat Srikishan · Petar Veličković · Kartik Ahuja · Andrew Miller · Erin Craig · Disi Ji · Filip Dabek · Chloé Pou-Prom · Hejia Zhang · Janani Kalyanam · Wei-Hung Weng · Harish Bhat · Hugh Chen · Simon Kohl · Mingwu Gao · Tingting Zhu · Ming-Zher Poh · Iñigo Urteaga · Antoine Honoré · Alessandro De Palma · Maruan Al-Shedivat · Pranav Rajpurkar · Matthew McDermott · Vincent Chen · Yanan Sui · Yun-Geun Lee · Li-Fang Cheng · Chen Fang · Sibt ul Hussain · Cesare Furlanello · Zeev Waks · Hiba Chougrad · Hedvig Kjellstrom · Finale Doshi-Velez · Wolfgang Fruehwirt · Yanqing Zhang · Lily Hu · Junfang Chen · Sunho Park · Gatis Mikelsons · Jumana Dakka · Stephanie Hyland · yann chevaleyre · Hyunwoo Lee · Xavier Giro-i-Nieto · David Kale · Michael Hughes · Gabriel Erion · Rishab Mehra · William Zame · Stojan Trajanovski · Prithwish Chakraborty · Kelly Peterson · Muktabh Mayank Srivastava · Amy Jin · Heliodoro Tejeda Lemus · Priyadip Ray · Tamas Madl · Joseph Futoma · Enhao Gong · Syed Rameel Ahmad · Eric Lei · Ferdinand Legros -
2017 : Contributed talk: Beyond Sparsity: Tree-based Regularization of Deep Models for Interpretability »
Mike Wu · Sonali Parbhoo · Finale Doshi-Velez -
2017 : Invited talk: The Role of Explanation in Holding AIs Accountable »
Finale Doshi-Velez -
2017 Poster: Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes »
Taylor Killian · Samuel Daulton · Finale Doshi-Velez · George Konidaris -
2017 Oral: Robust and Efficient Transfer Learning with Hidden Parameter Markov Decision Processes »
Taylor Killian · Samuel Daulton · Finale Doshi-Velez · George Konidaris -
2016 : BNNs for RL: A Success Story and Open Questions »
Finale Doshi-Velez -
2015 Workshop: Machine Learning From and For Adaptive User Technologies: From Active Learning & Experimentation to Optimization & Personalization »
Joseph Jay Williams · Yasin Abbasi Yadkori · Finale Doshi-Velez -
2015 : Data Driven Phenotyping for Diseases »
Finale Doshi-Velez -
2015 Poster: Mind the Gap: A Generative Approach to Interpretable Feature Selection and Extraction »
Been Kim · Julie A Shah · Finale Doshi-Velez