Tutorial
Reinforcement Learning for Embodied Cognition
Dana Ballard

Mon Dec 06 09:30 AM -- 11:30 AM (PST) @ Regency D

The enormous progress in instrumentation for measuring brain states has made it possible to tackle the large question of an overall model of brain computation. The intrinsic complexity of the brain can lead one to set aside its relationship to the body, but the field of Embodied Cognition stresses that understanding brain function at the system level requires addressing the role of the brain-body interface. While it is obvious that the brain receives all its input through the senses and directs its outputs through the motor system, it has only recently been appreciated that the body interface performs huge amounts of computation that does not have to be repeated by the brain, and thus affords the brain great simplifications in its representations. In effect, the brain's abstract states can explicitly or implicitly refer to coded representations of the world created by the body.

Even if the brain can communicate with the world through abstractions, the severe speed limitations of its neural circuitry mean that vast amounts of indexing must be performed during development so that appropriate behavioral responses can be accessed rapidly. One way this could happen would be if the brain used some kind of decomposition whereby behavioral primitives could be quickly accessed and combined. Such a factorization has huge synergies with embodied cognition models, which can use the natural filtering imposed by the body in directing behavior to select relevant primitives. These advantages can be explored in virtual environments replete with humanoid avatars, which allow experimental parameters to be manipulated systematically. Our test environments are everyday natural settings: walking and driving in a small town, and making sandwiches and looking for lost items in an apartment.
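To make the idea of combining primitives concrete, here is a minimal sketch in the spirit of modular RL; the scheme, class names, and all numbers are illustrative assumptions, not the tutorial's exact formulation. Each primitive keeps its own Q-values over its own compact state, and the agent acts greedily on the sum of the active primitives' action values, so modules learned separately can be recombined on the fly.

import numpy as np

# Illustrative sketch only: per-module Q-tables combined by summation.
N_ACTIONS = 4

class Primitive:
    def __init__(self, n_states):
        self.Q = np.zeros((n_states, N_ACTIONS))  # per-module action values
        self.state = 0                            # module-specific state

    def q(self):
        return self.Q[self.state]

def select_action(active):
    """Greedy action over the summed Q-values of the active primitives."""
    combined = sum(m.q() for m in active)
    return int(np.argmax(combined))

# Two toy primitives with hand-set values, just to show the arbitration.
avoid, follow = Primitive(10), Primitive(6)
avoid.Q[0, 2] = 1.0     # "avoid" strongly prefers action 2
follow.Q[0, 1] = 0.4    # "follow" mildly prefers action 1
print(select_action([avoid, follow]))   # -> 2: avoidance dominates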

Our focus is on programming individual behavioral primitives using reinforcement learning (RL). Central issues are eye-fixation programming, credit assignment to individual behavioral modules, and learning the value of behaviors via inverse reinforcement learning.

Eye fixations are the central information-gathering method used by humans, yet the protocols for programming them are still unsettled. We show that information gain in an RL setting can potentially explain experimental data.
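As a rough illustration of how information gain can drive fixation choice, the sketch below assumes each behavioral module tracks a Gaussian belief whose variance grows while the module goes unobserved; gaze is assigned to the module whose fixation would most reduce belief entropy. The module names and noise parameters are invented for the example and are not taken from the tutorial.

import numpy as np

# Sketch: each module's belief variance diffuses between fixations; a
# fixation is a noisy measurement that shrinks it.  Gaze goes to the
# module with the largest expected information gain (entropy reduction).

class Module:
    def __init__(self, name, drift_var, obs_var):
        self.name = name
        self.var = 1.0              # current belief variance
        self.drift_var = drift_var  # variance added per unobserved step
        self.obs_var = obs_var      # measurement noise of one fixation

    def predict(self):
        """Belief diffuses while the module is not fixated."""
        self.var += self.drift_var

    def info_gain(self):
        """Expected entropy reduction (in nats) from one fixation."""
        posterior_var = 1.0 / (1.0 / self.var + 1.0 / self.obs_var)
        return 0.5 * np.log(self.var / posterior_var)

    def fixate(self):
        """Kalman-style variance update for one measurement."""
        self.var = 1.0 / (1.0 / self.var + 1.0 / self.obs_var)

modules = [Module("obstacle", drift_var=0.20, obs_var=0.05),
           Module("path",     drift_var=0.05, obs_var=0.05),
           Module("litter",   drift_var=0.10, obs_var=0.05)]

for t in range(10):
    for m in modules:
        m.predict()
    target = max(modules, key=lambda m: m.info_gain())  # greedy gaze choice
    target.fixate()
    print(t, target.name, [round(m.var, 3) for m in modules])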

Credit assignment. If behaviors are to be decomposed into individual modules, then dividing the received reward among them becomes a major issue. We show that Bayesian estimation techniques, used in the RL setting, resolve this issue efficiently.
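The following sketch shows one Bayesian treatment of the problem under a simplifying assumption (illustrative, not necessarily the tutorial's model): the scalar reward at each step is a noisy sum of unknown per-module rewards for whichever modules were active. Because different steps activate different subsets of modules, a conjugate Gaussian posterior recovers each module's individual contribution.

import numpy as np

# Bayesian linear regression over module activation patterns:
# reward r = A w + noise, where A marks which modules ran each step.

rng = np.random.default_rng(0)
true_rewards = np.array([1.0, 0.2, -0.5])   # hidden per-module rewards
n_modules, n_steps, noise_sd = 3, 200, 0.1

# Random binary activation patterns (which modules ran on each step).
A = rng.integers(0, 2, size=(n_steps, n_modules)).astype(float)
r = A @ true_rewards + rng.normal(0.0, noise_sd, n_steps)  # observed totals

# Conjugate Gaussian posterior: prior w ~ N(0, tau^2 I), likelihood
# r | A, w ~ N(Aw, sigma^2 I)  =>  closed-form posterior mean/covariance.
tau2, sigma2 = 10.0, noise_sd ** 2
precision = A.T @ A / sigma2 + np.eye(n_modules) / tau2
cov = np.linalg.inv(precision)
mean = cov @ (A.T @ r) / sigma2

print("posterior mean:", np.round(mean, 3))       # ~ true_rewards
print("posterior sd:  ", np.round(np.sqrt(np.diag(cov)), 3))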

Inverse Reinforcement Learning. One way to learn new behaviors would be to imitate a human demonstrator and infer the value of the demonstrated behaviors. We show that an efficient algorithm developed by Rothkopf can estimate the value of behaviors from observed data using Bayesian RL techniques.
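A minimal sketch of Bayesian IRL in this spirit (illustrative only, not Rothkopf's published algorithm): reward weights for a toy chain MDP are sampled with Metropolis-Hastings, scoring demonstrated actions under a Boltzmann-rational likelihood. The MDP, rationality parameter, and sampler settings are all assumptions made for the example.

import numpy as np

# Tiny chain MDP: expert demonstrations reveal a hidden reward, and MH
# sampling over reward vectors recovers a posterior consistent with them.

rng = np.random.default_rng(1)
n_states, gamma, beta = 5, 0.9, 5.0   # beta = rationality of the expert
actions = [-1, +1]                    # step left / step right on a chain

def step(s, a):
    return min(max(s + a, 0), n_states - 1)

def q_values(reward):
    """Value iteration for the chain MDP under a candidate reward."""
    V = np.zeros(n_states)
    for _ in range(100):
        Q = np.array([[reward[step(s, a)] + gamma * V[step(s, a)]
                       for a in actions] for s in range(n_states)])
        V = Q.max(axis=1)
    return Q

def log_likelihood(reward, demos):
    """Boltzmann-rational likelihood of the demonstrated (state, action) pairs."""
    Q = q_values(reward)
    logp = beta * Q - np.logaddexp.reduce(beta * Q, axis=1, keepdims=True)
    return sum(logp[s, a] for s, a in demos)

# Expert demonstrations: state 4 is rewarding, so the expert moves right.
true_reward = np.array([0., 0., 0., 0., 1.])
Q_true = q_values(true_reward)
demos, s = [], 0
for _ in range(20):
    a = int(np.argmax(Q_true[s]))
    demos.append((s, a))
    s = step(s, actions[a])

# Metropolis-Hastings over reward vectors with a unit Gaussian prior.
w = np.zeros(n_states)
ll = log_likelihood(w, demos)
samples = []
for it in range(2000):
    w_new = w + rng.normal(0, 0.1, n_states)
    ll_new = log_likelihood(w_new, demos)
    log_accept = (ll_new - 0.5 * w_new @ w_new) - (ll - 0.5 * w @ w)
    if np.log(rng.random()) < log_accept:
        w, ll = w_new, ll_new
    if it > 1000:
        samples.append(w)

print("posterior mean reward:", np.round(np.mean(samples, axis=0), 2))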

Author Information

Dana Ballard (University of Texas, Austin)

Dana H. Ballard obtained his undergraduate degree in Aeronautics and Astronautics from M.I.T. in 1967. Subsequently he obtained MS and PhD degrees in information engineering from the University of Michigan and the University of California at Irvine in 1969 and 1974, respectively. He is the author of two books, Computer Vision (with Christopher Brown) and An Introduction to Natural Computation. His main research interest is in computational theories of the brain, with emphasis on human vision and Embodied Cognition. Starting in 1985, he and Chris Brown designed and built the first high-speed binocular camera control system capable of simulating human eye movements in real time. Currently he pursues this research at the University of Texas at Austin using model humans in virtual reality environments. His focus is on the use of machine learning as a model for human behavior, with an emphasis on reinforcement learning.
