Show or Tell? Interactive Task Learning with Large Language Models
Workshop: Workshop on Multi-Turn Interactions in Large Language Models
Abstract
Large Language Models (LLMs) can perform tasks specified in natural language, making them accessible to users regardless of technical background. However, specifying tasks within a single, static prompt is often both difficult and suboptimal. Interactive Task Learning (ITL), a goal for autonomous agents, proposes to address this challenge through multi-turn interactions: teachers provide a task description and (optionally) a demonstration, agents attempt the task while asking clarifying questions, and teachers offer feedback. Despite ITL’s promise, systematic evaluation of LLMs’ interactive learning capabilities remains limited. We introduce the ListOps Domain, a novel testbed for evaluating models’ ability to learn compositional symbolic tasks through ITL. We evaluate small-to-medium size LLMs (4 to 32 billion parameters) and find that a limited form of teacher feedback, expressing only reminders about broken rules rather than explicitly identifying or correcting errors, enhances generalization. Using this feedback, we compare models’ ITL and Few-Shot Learning (FSL) capabilities and find that ITL frequently outperforms FSL, especially within more powerful models. We conclude with a discussion of limitations and recommendations for advancing ITL research.