

San Diego Spotlight Poster

Imitation Beyond Expectation Using Pluralistic Stochastic Dominance

Ali Farajzadeh · Danyal Saeed · Syed M Abbas · Rushit Shah · Aadirupa Saha · Brian Ziebart

Exhibit Hall C,D,E #515
Thu 4 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract:

Imitation learning seeks policies reflecting the values of demonstrated behaviors. Prevalent approaches learn to match or exceed the demonstrator's performance in expectation without knowing the demonstrator's reward function. Unfortunately, this does not induce pluralistic imitators that learn to support qualitatively distinct demonstrations. We reformulate imitation learning so that its foundational aim is stochastic dominance over the demonstrations' reward distribution across a range of reward functions. Our approach uses optimal transport theory to match imitator policy samples (or support) with demonstrations, which defines an imitation learning objective over trajectory pairs. We demonstrate the benefits of pluralistic stochastic dominance (PSD) for imitation in both theory and practice.
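
The contrast between matching in expectation and stochastic dominance can be illustrated with a small sketch. This is not the authors' PSD objective; it is a minimal illustration that assumes equal-size sets of imitator and demonstration trajectories, a finite set of candidate reward functions, and a hinge-style shortfall cost. All names (`dominates`, `shortfall`, `pluralistic_shortfall`) and the two-feature trajectories are hypothetical. For scalar returns, pairing sorted samples is the optimal transport coupling for convex costs, so the shortfall is zero exactly when the imitator's empirical return distribution first-order dominates the demonstrations'.

```python
from typing import Callable, Sequence, Tuple

Trajectory = Tuple[float, float]          # hypothetical (speed, safety) features
RewardFn = Callable[[Trajectory], float]  # hypothetical candidate reward


def dominates(imitator: Sequence[float], demos: Sequence[float]) -> bool:
    """First-order stochastic dominance for equal-size empirical samples.

    With uniform weights, dominance holds iff every order statistic of the
    imitator's returns is at least the matching demonstration order statistic.
    """
    if len(imitator) != len(demos):
        raise ValueError("this sketch assumes equal sample sizes")
    return all(i >= d for i, d in zip(sorted(imitator), sorted(demos)))


def shortfall(imitator: Sequence[float], demos: Sequence[float]) -> float:
    """Mean hinge shortfall under the sorted (1-D optimal transport) pairing."""
    if len(imitator) != len(demos):
        raise ValueError("this sketch assumes equal sample sizes")
    pairs = zip(sorted(imitator), sorted(demos))
    return sum(max(0.0, d - i) for i, d in pairs) / len(demos)


def pluralistic_shortfall(
    imitator_trajs: Sequence[Trajectory],
    demo_trajs: Sequence[Trajectory],
    reward_fns: Sequence[RewardFn],
) -> float:
    """Worst-case shortfall over the assumed set of candidate reward functions."""
    return max(
        shortfall([r(t) for t in imitator_trajs], [r(t) for t in demo_trajs])
        for r in reward_fns
    )


# Two qualitatively distinct demonstrations: one fast, one safe.
demo_trajs = [(1.0, 0.0), (0.0, 1.0)]
reward_fns = [lambda t: t[0], lambda t: t[1]]  # "speed" and "safety" rewards

# An imitator that averages the two modes matches every expected reward...
averaging_imitator = [(0.5, 0.5), (0.5, 0.5)]
# ...while a pluralistic imitator reproduces both modes.
pluralistic_imitator = [(1.0, 0.0), (0.0, 1.0)]

avg_gap = pluralistic_shortfall(averaging_imitator, demo_trajs, reward_fns)
plural_gap = pluralistic_shortfall(pluralistic_imitator, demo_trajs, reward_fns)
```

In this toy setting the averaging imitator has the same expected return as the demonstrations under both rewards, yet it fails dominance because it never reaches the return of the best demonstration under either reward. The imitator that supports both modes incurs zero shortfall.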
