We propose a generalization of constrained Markov decision processes (CMDPs) that we call the \emph{semi-infinitely constrained Markov decision process} (SICMDP). In particular, a SICMDP model imposes a continuum of constraints rather than the finite number of constraints used in ordinary CMDPs. We also devise a reinforcement learning algorithm for SICMDPs that we call SI-CRL. We first transform the reinforcement learning problem into a linear semi-infinite programming (LSIP) problem and then solve it with the dual exchange method from the LSIP literature. To the best of our knowledge, we are the first to apply tools from semi-infinite programming (SIP) to reinforcement learning problems. We present a theoretical analysis of SI-CRL, identifying its sample complexity and iteration complexity. We also present extensive numerical examples to illustrate the SICMDP model and validate the SI-CRL algorithm.
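As a rough illustration of the distinction drawn in the abstract, the two constraint structures can be written side by side. The notation below (value functions $V_r$ and $V_c$, thresholds $u$, index set $Y$, occupancy measure $q$ and its feasible set $\mathcal{Q}$) is assumed here for illustration and is not taken from the paper; the last line is the standard linear-programming view of constrained MDPs over occupancy measures, which is one plausible way a problem of this kind becomes an LSIP.

```latex
% Ordinary CMDP: finitely many constraints (notation assumed for illustration)
\max_{\pi} \; V_r(\pi)
\quad \text{s.t.} \quad V_{c_i}(\pi) \le u_i, \qquad i = 1, \dots, m.

% SICMDP: one constraint for every y in a continuum Y,
% e.g. a compact subset of R^d
\max_{\pi} \; V_r(\pi)
\quad \text{s.t.} \quad V_{c_y}(\pi) \le u_y, \qquad \forall\, y \in Y.

% Occupancy-measure form: linear objective, linear constraints,
% infinitely many of them, i.e. a linear semi-infinite program
\max_{q \in \mathcal{Q}} \; \langle r, q \rangle
\quad \text{s.t.} \quad \langle c_y, q \rangle \le u_y, \qquad \forall\, y \in Y.
```

Exchange-type methods for such programs generally work with a finite subset of $Y$ at each iteration and add constraints that the current solution violates; the specific dual exchange scheme used by SI-CRL is described in the paper itself.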
Author Information
Liangyu Zhang (Peking University)
Yang Peng (Peking University)
Wenhao Yang (Peking University)
Zhihua Zhang (Peking University)
More from the Same Authors
- 2022 Poster: Personalized Federated Learning towards Communication Efficiency, Robustness and Fairness
  Shiyun Lin · Yuze Han · Xiang Li · Zhihua Zhang
- 2022 Poster: Asymptotic Behaviors of Projected Stochastic Approximation: A Jump Diffusion Perspective
  Jiadong Liang · Yuze Han · Xiang Li · Zhihua Zhang
- 2022 Spotlight: Personalized Federated Learning towards Communication Efficiency, Robustness and Fairness
  Shiyun Lin · Yuze Han · Xiang Li · Zhihua Zhang
- 2022 Spotlight: Lightning Talks 3A-1
  Shu Ding · Wanxing Chang · Jiyang Guan · Mouxiang Chen · Guan Gui · Yue Tan · Shiyun Lin · Guodong Long · Yuze Han · Wei Wang · Zhen Zhao · Ye Shi · Jian Liang · Chenghao Liu · Lei Qi · Ran He · Jie Ma · Zemin Liu · Xiang Li · Hoang Tuan · Luping Zhou · Zhihua Zhang · Jianling Sun · Jingya Wang · Lu Liu · Tianyi Zhou · Lei Wang · Jing Jiang · Yinghuan Shi
- 2022 Poster: A Statistical Online Inference Approach in Averaged Stochastic Approximation
  Chuhan Xie · Zhihua Zhang
- 2019 Poster: A Regularized Approach to Sparse Optimal Policy in Reinforcement Learning
  Wenhao Yang · Xiang Li · Zhihua Zhang