Timezone: »
Poster
Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model
Gen Li · Yuting Wei · Yuejie Chi · Yuantao Gu · Yuxin Chen
We investigate the sample efficiency of reinforcement learning in a $\gamma$-discounted infinite-horizon Markov decision process (MDP) with state space S and action space A, assuming access to a generative model. Despite a number of prior work tackling this problem, a complete picture of the trade-offs between sample complexity and statistical accuracy is yet to be determined. In particular, prior results suffer from a sample size barrier, in the sense that their claimed statistical guarantees hold only when the sample size exceeds at least $ |S| |A| / (1-\gamma)^2 $ (up to some log factor). The current paper overcomes this barrier by certifying the minimax optimality of model-based reinforcement learning as soon as the sample size exceeds the order of $ |S| |A| / (1-\gamma) $ (modulo some log factor). More specifically, a perturbed model-based planning algorithm provably finds an $\epsilon$-optimal policy with an order of $ |S| |A| / ((1-\gamma)^3\epsilon^2 ) $ samples (up to log factor) for any $0< \epsilon < 1/(1-\gamma)$. Along the way, we derive improved (instance-dependent) guarantees for model-based policy evaluation. To the best of our knowledge, this work provides the first minimax-optimal guarantee in a generative model that accommodates the entire range of sample sizes (beyond which finding a meaningful policy is information theoretically impossible).
Author Information
Gen Li (Tsinghua University)
Yuting Wei (Carnegie Mellon University)
Yuejie Chi (CMU)
Yuantao Gu (Tsinghua University)
Yuxin Chen (Princeton University)
More from the Same Authors
-
2021 Spotlight: Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning »
Gen Li · Laixi Shi · Yuxin Chen · Yuantao Gu · Yuejie Chi -
2021 : DESTRESS: Computation-Optimal and Communication-Efficient Decentralized Nonconvex Finite-Sum Optimization »
Boyue Li · Zhize Li · Yuejie Chi -
2021 : DESTRESS: Computation-Optimal and Communication-Efficient Decentralized Nonconvex Finite-Sum Optimization »
Boyue Li · Zhize Li · Yuejie Chi -
2021 : Policy Mirror Descent for Regularized RL: A Generalized Framework with Linear Convergence »
Wenhao Zhan · Shicong Cen · Baihe Huang · Yuxin Chen · Jason Lee · Yuejie Chi -
2021 : Policy Mirror Descent for Regularized RL: A Generalized Framework with Linear Convergence »
Wenhao Zhan · Shicong Cen · Baihe Huang · Yuxin Chen · Jason Lee · Yuejie Chi -
2022 : A Multi-Token Coordinate Descent Method for Vertical Federated Learning »
Pedro Valdeira · Yuejie Chi · Claudia Soares · Joao Xavier -
2023 Poster: Counterfactual Generation with Identifiability Guarantee »
hanqi yan · Lingjing Kong · Lin Gui · Yuejie Chi · Eric Xing · Yulan He · Kun Zhang -
2023 Poster: The Curious Price of Distributional Robustness in Reinforcement Learning with a Generative Model »
Laixi Shi · Gen Li · Yuting Wei · Yuxin Chen · Matthieu Geist · Yuejie Chi -
2023 Poster: Seeing is not Believing: Robust Reinforcement Learning against Spurious Correlation »
Wenhao Ding · Laixi Shi · Yuejie Chi · DING ZHAO -
2023 Poster: Identification of Nonlinear Latent Hierarchical Models »
Lingjing Kong · Biwei Huang · Feng Xie · Eric Xing · Yuejie Chi · Kun Zhang -
2023 Poster: Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning »
Gen Li · Wenhao Zhan · Jason Lee · Yuejie Chi · Yuxin Chen -
2022 Poster: BEER: Fast $O(1/T)$ Rate for Decentralized Nonconvex Optimization with Communication Compression »
Haoyu Zhao · Boyue Li · Zhize Li · Peter Richtarik · Yuejie Chi -
2022 Poster: Minimax-Optimal Multi-Agent RL in Markov Games With a Generative Model »
Gen Li · Yuejie Chi · Yuting Wei · Yuxin Chen -
2022 Poster: SoteriaFL: A Unified Framework for Private Federated Learning with Communication Compression »
Zhize Li · Haoyu Zhao · Boyue Li · Yuejie Chi -
2021 Poster: Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization »
Shicong Cen · Yuting Wei · Yuejie Chi -
2021 Poster: Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning »
Gen Li · Laixi Shi · Yuxin Chen · Yuantao Gu · Yuejie Chi -
2021 Poster: Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting »
Gen Li · Yuxin Chen · Yuejie Chi · Yuantao Gu · Yuting Wei -
2020 Poster: Randomized tests for high-dimensional regression: A more efficient and powerful solution »
Yue Li · Ilmun Kim · Yuting Wei -
2020 Poster: Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction »
Gen Li · Yuting Wei · Yuejie Chi · Yuantao Gu · Yuxin Chen -
2019 Poster: Nonconvex Low-Rank Symmetric Tensor Completion from Noisy Data »
Changxiao Cai · Gen Li · H. Vincent Poor · Yuxin Chen