

Invited Talk in Competition: The Robot Air Hockey Challenge: Robust, Reliable, and Safe Learning Techniques for Real-world Robotics

Robot Air Hockey and Other Physical Challenges: An Historical Perspective

Christopher G. Atkeson

Fri 15 Dec 7:15 a.m. PST — 7:45 a.m. PST

Abstract:

There have been many air hockey robots (search for "air hockey robot" on youtube.com). I will survey ideas on how to design robotic air hockey players and relate them to current work on controlling robots in a variety of dynamic tasks. Two decades ago we explored manually defining primitives or skills (forehand, backhand, ...) and learning a primitive selector, first from observation and then by refining it with practice. Our view was that it is useful, when learning, to segment behavior into a sequence of "subroutine calls", each call having "arguments" or parameters. We chose tasks that humans perform, such as air hockey, so we could explore learning from observation (also known as learning from demonstration or imitation learning) as well as optimization-based approaches to learning from practice, such as reinforcement learning.

A key observation was that to learn from observation, the learner had to perceive in terms of primitives: segmenting behavior into individual primitives and estimating the parameters used each time a primitive was invoked. Our motivation for decomposing learning into two parts (learning skills and learning which skill to use when) was that we believed learning a behavior selector could be very data efficient. We trained the selector from observation using supervised learning, and refined it through practice using model-free reinforcement learning that was data efficient enough for all learning to be done on a physical robot rather than in simulation.

One innovation we see today that was not practical 20 years ago is large-scale training in simulation, transferring the learned controller to a real robot, and then learning further in reality. Some current approaches to dynamic robot control take a conceptually similar route of explicitly separating learning "skills" from learning to select among them, implicitly defining primitives through a manually designed curriculum and then learning a selector (in one case by distilling separate skill networks into a single network). Other approaches train on a large number of manually or automatically generated situations and do not explicitly define a set of primitives or individual skills.
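To make the two-level decomposition concrete, here is a minimal sketch of a parameterized-primitive selector: supervised learning from segmented demonstrations, followed by model-free refinement through practice. The primitive names, state features, parameter meanings, and update rules are illustrative assumptions for this sketch, not the system described in the talk.

```python
# Hypothetical sketch: hand-designed primitives with parameters,
# plus a learned selector, as in the two-part decomposition above.
import numpy as np

rng = np.random.default_rng(0)

PRIMITIVES = ["forehand", "backhand", "block"]  # assumed skill set
STATE_DIM = 4   # e.g., puck (x, y, vx, vy); an assumption
PARAM_DIM = 2   # e.g., desired hit point and speed; an assumption

# Linear selector: scores each primitive and regresses its parameters.
W_select = rng.normal(scale=0.1, size=(len(PRIMITIVES), STATE_DIM))
W_params = rng.normal(scale=0.1, size=(len(PRIMITIVES), PARAM_DIM, STATE_DIM))

def select(state):
    """Pick a primitive and its parameters for the current state."""
    k = int(np.argmax(W_select @ state))
    return k, W_params[k] @ state

def train_from_observation(demos, lr=0.01, epochs=50):
    """Supervised learning from segmented demonstrations:
    each demo is (state, primitive index, observed parameters)."""
    global W_select, W_params
    for _ in range(epochs):
        for state, k, params in demos:
            # Softmax-style update pushing scores toward the observed primitive.
            scores = W_select @ state
            probs = np.exp(scores - scores.max())
            probs /= probs.sum()
            grad = -probs
            grad[k] += 1.0
            W_select += lr * np.outer(grad, state)
            # Regress the observed parameters for that primitive.
            err = params - W_params[k] @ state
            W_params[k] += lr * np.outer(err, state)

def refine_from_practice(env_step, episodes=100, lr=0.01, noise=0.1):
    """Model-free refinement: perturb parameters and keep what
    raises reward. env_step(state, k, params) -> scalar reward."""
    global W_params
    for _ in range(episodes):
        state = rng.normal(size=STATE_DIM)
        k, params = select(state)
        trial = params + noise * rng.normal(size=PARAM_DIM)
        if env_step(state, k, trial) > env_step(state, k, params):
            err = trial - W_params[k] @ state
            W_params[k] += lr * np.outer(err, state)
```

The point of the split is data efficiency: the selector and parameter maps are small enough that supervised fitting from a handful of demonstrations, plus a few hundred perturbation trials, is plausible on a physical robot without simulation.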
