Timezone: »

Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning
Xiaoxiao Guo · Satinder Singh · Honglak Lee · Richard L Lewis · Xiaoshi Wang

Tue Dec 09 04:00 PM -- 08:59 PM (PST) @ Level 2, room 210D #None

The combination of modern Reinforcement Learning and Deep Learning approaches holds the promise of making significant progress on challenging applications requiring both rich perception and policy-selection. The Arcade Learning Environment (ALE) provides a set of Atari games that represent a useful benchmark set of such applications. A recent breakthrough in combining model-free reinforcement learning with deep learning, called DQN, achieves the best real-time agents thus far. Planning-based approaches achieve far higher scores than the best model-free approaches, but they exploit information that is not available to human players, and they are orders of magnitude slower than needed for real-time play. Our main goal in this work is to build a better real-time Atari game playing agent than DQN. The central idea is to use the slow planning-based agents to provide training data for a deep-learning architecture capable of real-time play. We proposed new agents based on this idea and show that they outperform DQN.

Author Information

Xiaoxiao Guo (University of Michigan, Ann Arbor)
Satinder Singh (University of Michigan)
Honglak Lee (Google / U. Michigan)
Richard L Lewis (University of Michigan)
Xiaoshi Wang (University of Michigan)

More from the Same Authors