We identify key areas of improvement for WebShop, an e-commerce shopping environment for training decision-making language agents. Specifically, shortcomings in 1) the faithfulness of the reward function to human evaluation, 2) the comprehensiveness of its content, and 3) the human participation required to generate instructions have hindered WebShop's promise as a scalable real-world environment. To address these issues, we first improve faithfulness to human evaluation by designing a new reward function that captures lexical similarities and synonyms. Second, from a survey of 75 respondents, we identify customer reviews, similar products, and customer FAQs as the missing semantic components most helpful to humans carrying out the task. Finally, we reformulate the attribute tagging problem as an extractive short-phrase prediction task to enhance scalability. Our V2 reward function narrows the gap between the score of WebShop's automated reward function (raised from 81.5% to 87.7%) and human evaluation (89.9%). Our attribute tagging approach achieves an accuracy of 72.2% with a t5-3b model fine-tuned on 2,000 training data points, showing potential to automate the instruction creation pipeline.
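The abstract does not spell out the V2 reward function, so the following is only an illustrative sketch of what a reward that credits lexical similarity and synonyms could look like. The synonym table, the 0.8 threshold, and the names `attribute_matches` and `soft_reward` are all assumptions made for illustration, not the authors' implementation.

```python
from difflib import SequenceMatcher

# Hypothetical synonym table (assumption); a real system might use a thesaurus.
SYNONYMS = {
    "sofa": {"couch"},
    "navy": {"dark blue"},
}


def lexical_similarity(a: str, b: str) -> float:
    """Character-level similarity ratio in [0, 1] from the standard library."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def attribute_matches(goal: str, candidate: str, threshold: float = 0.8) -> bool:
    """True on an exact match, a listed synonym, or a close lexical match."""
    goal, candidate = goal.lower().strip(), candidate.lower().strip()
    if goal == candidate:
        return True
    if candidate in SYNONYMS.get(goal, set()) or goal in SYNONYMS.get(candidate, set()):
        return True
    return lexical_similarity(goal, candidate) >= threshold


def soft_reward(goal_attrs, product_attrs, threshold: float = 0.8) -> float:
    """Fraction of goal attributes matched by at least one product attribute."""
    if not goal_attrs:
        return 0.0
    hits = sum(
        any(attribute_matches(g, p, threshold) for p in product_attrs)
        for g in goal_attrs
    )
    return hits / len(goal_attrs)
```

Under these assumptions, a product tagged "couch" earns credit for a goal attribute "sofa", and a spelling variant such as "colour" matches "color", whereas a strict exact-match comparison would give credit for neither.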
Author Information
John Yang (Princeton University)

I'm a 2nd year Master's student studying Natural Language Processing at Princeton University, advised by Professor Karthik Narasimhan.
Howard Chen (Princeton University)
Karthik Narasimhan (Princeton University)
More from the Same Authors
- 2022: REACT: Synergizing Reasoning and Acting in Language Models
  Shunyu Yao · Jeffrey Zhao · Dian Yu · Izhak Shafran · Karthik Narasimhan · Yuan Cao
- 2022: Karthik Narasimhan: Semantic Supervision for few-shot generalization and personalization
  Karthik Narasimhan
- 2022 Poster: Using natural language and program abstractions to instill human inductive biases in machines
  Sreejan Kumar · Carlos G. Correa · Ishita Dasgupta · Raja Marjieh · Michael Y Hu · Robert Hawkins · Jonathan D Cohen · Nathaniel Daw · Karthik Narasimhan · Tom Griffiths
- 2022 Poster: WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
  Shunyu Yao · Howard Chen · John Yang · Karthik Narasimhan
- 2022 Poster: Learning Physics Constrained Dynamics Using Autoencoders
  Tsung-Yen Yang · Justinian Rosca · Karthik Narasimhan · Peter J. Ramadge
- 2022 Poster: DataMUX: Data Multiplexing for Neural Networks
  Vishvak Murahari · Carlos Jimenez · Runzhe Yang · Karthik Narasimhan
- 2016 Poster: Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation
  Tejas Kulkarni · Karthik Narasimhan · Ardavan Saeedi · Josh Tenenbaum