Imitation learning (IL) is a general learning paradigm for sequential decision-making problems. Interactive imitation learning, where learners can interactively query for expert annotations, has been shown to achieve provably superior sample-efficiency guarantees compared with its offline counterpart or reinforcement learning. In this work, we study classification-based online imitation learning (abbrev. COIL) and the fundamental feasibility of designing oracle-efficient regret-minimization algorithms in this setting, with a focus on the general nonrealizable case. We make the following contributions: (1) we show that in the COIL problem, any proper online learning algorithm cannot guarantee sublinear regret in general; (2) we propose Logger, an improper online learning algorithmic framework that reduces COIL to online linear optimization by utilizing a new definition of mixed policy class; (3) we design two oracle-efficient algorithms within the Logger framework that enjoy different sample and interaction-round complexity tradeoffs, and show their improvements over behavior cloning; (4) we show that, under standard complexity-theoretic assumptions, efficient dynamic-regret minimization is infeasible in the Logger framework.
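To make the interactive setting concrete, the sketch below shows a generic DAgger-style interaction loop: the learner rolls out its current policy, queries the expert for action labels on the states it visits, and retrains via a classification oracle on the aggregated data. This is only an illustrative toy (a 1-D state, a hypothetical threshold expert, and a brute-force threshold "oracle"), not the paper's Logger algorithm or its mixed-policy reduction.

```python
import numpy as np

rng = np.random.default_rng(0)

def expert_policy(state):
    # Hypothetical expert on a 1-D state: action 1 ("move up") iff state < 0.
    return 1 if state < 0 else 0

def rollout(policy, horizon=20):
    # Collect the states visited when the current learner policy acts.
    state, states = rng.normal(), []
    for _ in range(horizon):
        states.append(state)
        state += 0.5 if policy(state) == 1 else -0.5
        state += 0.1 * rng.normal()  # small dynamics noise
    return states

def train_classifier(data):
    # Stand-in "classification oracle": brute-force the 1-D threshold
    # that minimizes training error on the aggregated dataset.
    xs = sorted(s for s, _ in data)
    candidates = [-np.inf] + xs + [np.inf]
    def errs(t):
        return sum((1 if s < t else 0) != a for s, a in data)
    best = min(candidates, key=errs)
    return lambda s, t=best: 1 if s < t else 0

dataset = []
policy = lambda s: 0  # uninformed initial learner policy
for _ in range(5):    # interaction rounds
    for s in rollout(policy):                  # learner acts, visits states
        dataset.append((s, expert_policy(s)))  # interactively query expert labels
    policy = train_classifier(dataset)         # offline classification oracle

agreement = np.mean([policy(s) == expert_policy(s) for s in rng.normal(size=200)])
print(f"agreement with expert: {agreement:.2f}")
```

The key point the loop illustrates is that training data is gathered under the learner's own state distribution (unlike behavior cloning, which only sees expert trajectories), at the cost of repeated interaction rounds and expert queries.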
Author Information
Yichen Li (The University of Arizona)
Chicheng Zhang (University of Arizona)
More from the Same Authors
- 2022 Poster: PopArt: Efficient Sparse Regression and Experimental Design for Optimal Sparse Linear Bandits (Kyoungseok Jang · Chicheng Zhang · Kwang-Sung Jun)
- 2021 Poster: Provably efficient multi-task reinforcement learning with model transfer (Chicheng Zhang · Zhi Wang)
- 2020 Poster: Crush Optimism with Pessimism: Structured Bandits Beyond Asymptotic Optimality (Kwang-Sung Jun · Chicheng Zhang)
- 2020 Poster: Efficient active learning of sparse halfspaces with arbitrary bounded noise (Chicheng Zhang · Jie Shen · Pranjal Awasthi)
- 2020 Poster: Efficient Contextual Bandits with Continuous Actions (Maryam Majzoubi · Chicheng Zhang · Rajan Chari · Akshay Krishnamurthy · John Langford · Aleksandrs Slivkins)
- 2020 Oral: Efficient active learning of sparse halfspaces with arbitrary bounded noise (Chicheng Zhang · Jie Shen · Pranjal Awasthi)