Poster
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction
Gen Li · Yuting Wei · Yuejie Chi · Yuantao Gu · Yuxin Chen
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP) based on a single trajectory of Markovian samples induced by a behavior policy. Focusing on a $\gamma$-discounted MDP with state space $\mathcal{S}$ and action space $\mathcal{A}$, we demonstrate that the $\ell_{\infty}$-based sample complexity of classical asynchronous Q-learning, namely the number of samples needed to yield an entrywise $\epsilon$-accurate estimate of the Q-function, is at most on the order of $ \frac{1}{ \mu_{\min}(1-\gamma)^5 \epsilon^2 } + \frac{ t_{\mathsf{mix}} }{ \mu_{\min}(1-\gamma) } $ up to some logarithmic factor, provided that a proper constant learning rate is adopted. Here, $ t_{\mathsf{mix}} $ and $ \mu_{\min} $ denote respectively the mixing time and the minimum state-action occupancy probability of the sample trajectory. The first term of this bound matches the complexity in the case with independent samples drawn from the stationary distribution of the trajectory. The second term reflects the cost incurred for the empirical distribution of the Markovian trajectory to reach a steady state, which is paid at the very beginning and becomes amortized as the algorithm runs. Encouragingly, the above bound improves upon the state-of-the-art result by a factor of at least $|\mathcal{S}||\mathcal{A}|$. Further, the scaling on the discount complexity can be improved by means of variance reduction.
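For readers unfamiliar with the setting, the following is a minimal sketch (not the paper's analysis) of the classical asynchronous Q-learning iteration the abstract refers to: the algorithm walks along a single trajectory generated by a behavior policy and, at each step, updates only the Q-value of the state-action pair it just visited, using a constant learning rate `eta`. The function name and signature here are illustrative, not from the paper.

```python
import numpy as np

def async_q_learning(transitions, num_states, num_actions, gamma=0.9, eta=0.1):
    """Asynchronous Q-learning along a single trajectory of Markovian samples.

    `transitions` is an iterable of (s, a, r, s_next) tuples produced by a
    behavior policy; only the visited entry Q[s, a] is updated at each step.
    """
    Q = np.zeros((num_states, num_actions))
    for s, a, r, s_next in transitions:
        # Temporal-difference target for the visited state-action pair.
        target = r + gamma * Q[s_next].max()
        # Constant learning rate, as assumed by the sample-complexity bound.
        Q[s, a] += eta * (target - Q[s, a])
    return Q
```

On a toy one-state MDP with reward 1 and $\gamma = 0.5$, the iterates contract toward the fixed point $Q^\star = 1/(1-\gamma) = 2$, illustrating the $\ell_\infty$ convergence the bound quantifies.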
Author Information
Gen Li (Tsinghua University)
Yuting Wei (Carnegie Mellon University)
Yuejie Chi (CMU)
Yuantao Gu (Tsinghua University)
Yuxin Chen (Princeton University)
More from the Same Authors

2021 Spotlight: Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning »
Gen Li · Laixi Shi · Yuxin Chen · Yuantao Gu · Yuejie Chi 
2021 : DESTRESS: Computation-Optimal and Communication-Efficient Decentralized Nonconvex Finite-Sum Optimization »
Boyue Li · Zhize Li · Yuejie Chi 
2021 : Policy Mirror Descent for Regularized RL: A Generalized Framework with Linear Convergence »
Wenhao Zhan · Shicong Cen · Baihe Huang · Yuxin Chen · Jason Lee · Yuejie Chi 
2022 : A Multi-Token Coordinate Descent Method for Vertical Federated Learning »
Pedro Valdeira · Yuejie Chi · Claudia Soares · Joao Xavier 
2022 Poster: BEER: Fast $O(1/T)$ Rate for Decentralized Nonconvex Optimization with Communication Compression »
Haoyu Zhao · Boyue Li · Zhize Li · Peter Richtarik · Yuejie Chi 
2022 Poster: Minimax-Optimal Multi-Agent RL in Markov Games With a Generative Model »
Gen Li · Yuejie Chi · Yuting Wei · Yuxin Chen 
2022 Poster: SoteriaFL: A Unified Framework for Private Federated Learning with Communication Compression »
Zhize Li · Haoyu Zhao · Boyue Li · Yuejie Chi 
2021 Poster: Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization »
Shicong Cen · Yuting Wei · Yuejie Chi 
2021 Poster: Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning »
Gen Li · Laixi Shi · Yuxin Chen · Yuantao Gu · Yuejie Chi 
2021 Poster: Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting »
Gen Li · Yuxin Chen · Yuejie Chi · Yuantao Gu · Yuting Wei 
2020 Poster: Randomized tests for high-dimensional regression: A more efficient and powerful solution »
Yue Li · Ilmun Kim · Yuting Wei 
2020 Poster: Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model »
Gen Li · Yuting Wei · Yuejie Chi · Yuantao Gu · Yuxin Chen 
2019 Poster: Nonconvex Low-Rank Symmetric Tensor Completion from Noisy Data »
Changxiao Cai · Gen Li · H. Vincent Poor · Yuxin Chen