Timezone: »
Spotlight
Provable General Function Class Representation Learning in Multitask Bandits and MDP
Rui Lu · Andrew Zhao · Simon Du · Gao Huang
While multitask representation learning has become a popular approach in reinforcement learning (RL) to boost the sample efficiency, the theoretical understanding of why and how it works is still limited. Most previous analytical works could only assume that the representation function is already known to the agent or from linear function class, since analyzing general function class representation encounters non-trivial technical obstacles such as generalization guarantee, formulation of confidence bound in abstract function space, etc. However, linear-case analysis heavily relies on the particularity of linear function class, while real-world practice usually adopts general non-linear representation functions like neural networks. This significantly reduces its applicability. In this work, we extend the analysis to general function class representations. Specifically, we consider an agent playing $M$ contextual bandits (or MDPs) concurrently and extracting a shared representation function $\phi$ from a specific function class $\Phi$ using our proposed Generalized Functional Upper Confidence Bound algorithm (GFUCB). We theoretically validate the benefit of multitask representation learning within general function class for bandits and linear MDP for the first time. Lastly, we conduct experiments to demonstrate the effectiveness of our algorithm with neural net representation.
Author Information
Rui Lu (Tsinghua University)
Andrew Zhao (Tsinghua University)
Andrew Zhao is a PhD student at Tsinghua University. He obtained his masters degree from USC in 2020, and undergrad degree from UBC in 2017. His research interests are in machine learning and reinforcement learning.
Simon Du (University of Washington)
Gao Huang (Cornell University)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 Poster: Provable General Function Class Representation Learning in Multitask Bandits and MDP »
Dates n/a. Room
More from the Same Authors
-
2022 Poster: Contrastive Language-Image Pre-Training with Knowledge Graphs »
Xuran Pan · Tianzhu Ye · Dongchen Han · Shiji Song · Gao Huang -
2022 Poster: A Mixture Of Surprises for Unsupervised Reinforcement Learning »
Andrew Zhao · Matthieu Lin · Yangguang Li · Yong-jin Liu · Gao Huang -
2022 Poster: Efficient Knowledge Distillation from Model Checkpoints »
Chaofei Wang · Qisen Yang · Rui Huang · Shiji Song · Gao Huang -
2022 : Understanding Curriculum Learning in Policy Optimization for Online Combinatorial Optimization »
Runlong Zhou · Yuandong Tian · YI WU · Simon Du -
2022 : Boosting Offline Reinforcement Learning via Data Resampling »
Yang Yue · Bingyi Kang · Xiao Ma · Zhongwen Xu · Gao Huang · Shuicheng Yan -
2022 Spotlight: Lightning Talks 4A-4 »
Yunhao Tang · LING LIANG · Thomas Chau · Daeha Kim · Junbiao Cui · Rui Lu · Lei Song · Byung Cheol Song · Andrew Zhao · Remi Munos · Łukasz Dudziak · Jiye Liang · Ke Xue · Kaidi Xu · Mark Rowland · Hongkai Wen · Xing Hu · Xiaobin Huang · Simon Du · Nicholas Lane · Chao Qian · Lei Deng · Bernardo Avila Pires · Gao Huang · Will Dabney · Mohamed Abdelfattah · Yuan Xie · Marc Bellemare -
2022 Spotlight: Lightning Talks 1B-3 »
Chaofei Wang · Qixun Wang · Jing Xu · Long-Kai Huang · Xi Weng · Fei Ye · Harsh Rangwani · shrinivas ramasubramanian · Yifei Wang · Qisen Yang · Xu Luo · Lei Huang · Adrian G. Bors · Ying Wei · Xinglin Pan · Sho Takemori · Hong Zhu · Rui Huang · Lei Zhao · Yisen Wang · Kato Takashi · Shiji Song · Yanan Li · Rao Anwer · Yuhei Umeda · Salman Khan · Gao Huang · Wenjie Pei · Fahad Shahbaz Khan · Venkatesh Babu R · Zenglin Xu -
2022 Spotlight: Efficient Knowledge Distillation from Model Checkpoints »
Chaofei Wang · Qisen Yang · Rui Huang · Shiji Song · Gao Huang -
2022 Poster: When are Offline Two-Player Zero-Sum Markov Games Solvable? »
Qiwen Cui · Simon Du -
2022 Poster: Latency-aware Spatial-wise Dynamic Networks »
Yizeng Han · Zhihang Yuan · Yifan Pu · Chenhao Xue · Shiji Song · Guangyu Sun · Gao Huang -
2022 Poster: Learning in Congestion Games with Bandit Feedback »
Qiwen Cui · Zhihan Xiong · Maryam Fazel · Simon Du -
2022 Poster: Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus »
Qiwen Cui · Simon Du -
2022 Poster: On Gap-dependent Bounds for Offline Reinforcement Learning »
Xinqi Wang · Qiwen Cui · Simon Du -
2022 Poster: Near-Optimal Randomized Exploration for Tabular Markov Decision Processes »
Zhihan Xiong · Ruoqi Shen · Qiwen Cui · Maryam Fazel · Simon Du -
2021 Workshop: Ecological Theory of Reinforcement Learning: How Does Task Design Influence Agent Learning? »
Manfred Díaz · Hiroki Furuta · Elise van der Pol · Lisa Lee · Shixiang (Shane) Gu · Pablo Samuel Castro · Simon Du · Marc Bellemare · Sergey Levine -
2016 Poster: Supervised Word Mover's Distance »
Gao Huang · Chuan Guo · Matt J Kusner · Yu Sun · Fei Sha · Kilian Weinberger -
2016 Oral: Supervised Word Mover's Distance »
Gao Huang · Chuan Guo · Matt J Kusner · Yu Sun · Fei Sha · Kilian Weinberger