ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
Ishika Singh · Valts Blukis · Arsalan Mousavian · Ankit Goyal · Danfei Xu · Jonathan Tremblay · Dieter Fox · Jesse Thomason · Animesh Garg
Event URL: https://openreview.net/forum?id=aflRdmGOhw1

Task planning can require defining myriad domain knowledge about the world in which a robot needs to act. To ameliorate that effort, large language models (LLMs) can be used to score potential next actions during task planning, or even to generate action sequences directly, given a natural-language instruction with no additional domain information. However, such methods either require enumerating all possible next steps for scoring, or generate free-form text that may contain actions not possible on a given robot in its current context. We present a programmatic LLM prompt structure that enables plan generation that is functional across situated environments, robot capabilities, and tasks. Our key insight is to prompt the LLM with program-like specifications of the available actions and objects in an environment, as well as with executable example programs. We make concrete recommendations about prompt structure and generation constraints through ablation experiments, demonstrate state-of-the-art success rates on VirtualHome household tasks, and deploy our method on a physical robot arm for tabletop tasks.
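The prompt structure described above can be illustrated with a minimal sketch. The action names, object list, and example program below are hypothetical stand-ins chosen for illustration, not the authors' exact prompt; the idea is simply that available actions appear as import-like declarations, available objects as a Python list, and a worked task as an executable example program, so the LLM completes a new function body using only grounded names.

```python
# Hypothetical sketch of a ProgPrompt-style prompt assembly.
# Action and object names here are illustrative assumptions.

AVAILABLE_ACTIONS = ["grab(obj)", "putin(obj, container)", "open(obj)", "close(obj)"]
AVAILABLE_OBJECTS = ["salmon", "microwave", "fridge", "plate"]

# One executable example program, shown to the LLM as an in-context demonstration.
EXAMPLE_PROGRAM = '''def put_salmon_in_fridge():
    # 1: grab the salmon
    grab("salmon")
    # 2: open the fridge
    open("fridge")
    # 3: put the salmon in the fridge
    putin("salmon", "fridge")
    # 4: close the fridge
    close("fridge")
'''

def build_prompt(task: str) -> str:
    """Assemble a program-like prompt: action 'imports', an object list,
    an example program, and a header for the new task to be completed."""
    header = "from actions import " + ", ".join(
        a.split("(")[0] for a in AVAILABLE_ACTIONS
    )
    objects = "objects = " + repr(AVAILABLE_OBJECTS)
    task_header = f"def {task.replace(' ', '_')}():"
    return "\n".join([header, objects, EXAMPLE_PROGRAM, task_header])

prompt = build_prompt("microwave salmon")
```

The LLM is then asked to complete the body of the final function, and because the prompt only exposes grounded action and object names, the generated plan stays within what the robot can actually execute in its current context.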

Author Information

Ishika Singh (University of Southern California)

I am a PhD student at USC, advised by Jesse Thomason. My research interests span NLP and robotics, primarily around why language is necessary for, and how it can be utilized in, robot learning. I believe that problems in Embodied AI are foundational and hold deep-rooted benefits for society.

Valts Blukis (Cornell University)
Arsalan Mousavian (NVIDIA)
Ankit Goyal (Princeton University)
Danfei Xu (Georgia Institute of Technology)
Jonathan Tremblay (NVIDIA)
Dieter Fox (University of Washington)
Jesse Thomason (University of Southern California)
Animesh Garg (University of Toronto, Nvidia, Vector Institute)

I am a CIFAR AI Chair Assistant Professor of Computer Science at the University of Toronto, a Faculty Member at the Vector Institute, and a Senior Researcher at NVIDIA. My current research focuses on machine learning for perception and control in robotics.
