Anca Dragan: Learning human preferences from language
Anca Dragan
2022 Invited Talk in Workshop: InterNLP: Workshop on Interactive Learning for Natural Language Processing
Abstract
In classic instruction following, language like "I'd like the JetBlue flight" maps to actions (e.g., selecting that flight). However, language also conveys information about a user's underlying reward function (e.g., a general preference for JetBlue), which can allow a model to carry out desirable actions in new contexts. In this talk, I'll share a model that infers rewards from language pragmatically: reasoning about how speakers choose utterances not only to elicit desired actions, but also to reveal information about their preferences.