While large language models (LLMs) have demonstrated impressive capabilities across tasks in language understanding and interactive decision making, their abilities for reasoning (e.g. chain-of-thought prompting) and acting (e.g. action plan generation) have primarily been studied as separate topics. In this paper, we explore the use of LLMs to generate both reasoning traces and task-specific actions in an interleaved manner, allowing for greater synergy between the two: reasoning traces help the model induce, track, and update action plans as well as handle exceptions, while actions allow it to interface with external sources, such as knowledge bases or environments, to gather additional information. We apply our approach, named ReAct, to a diverse set of language and decision making tasks and demonstrate its effectiveness over state-of-the-art baselines, as well as improved human interpretability and trustworthiness over methods without reasoning or acting components. Concretely, on question answering (HotpotQA) and fact verification (Fever), ReAct overcomes issues of hallucination and error propagation prevalent in chain-of-thought reasoning by interacting with a simple Wikipedia API, and generates human-like task-solving trajectories that are more interpretable than baselines without reasoning traces. On two interactive decision making benchmarks (ALFWorld and WebShop), ReAct outperforms imitation and reinforcement learning methods by an absolute success rate of 34% and 10% respectively, while being prompted with only one or two in-context examples.
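The interleaved reasoning-and-acting loop the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: a scripted stub (`llm_stub`) stands in for a prompted LLM, and a small dict (`WIKI`) stands in for the Wikipedia API; the `Thought:`/`Action:`/`Observation:` step format mirrors the trajectories described in the abstract, but the helper names are hypothetical.

```python
from typing import Callable

# Tiny stand-in for the external knowledge source (the paper uses a Wikipedia API).
WIKI = {
    "Mount Everest": "Mount Everest is Earth's highest mountain, 8,849 m tall.",
}

def search(entity: str) -> str:
    """Action: look up an entity, returning an observation string."""
    return WIKI.get(entity, "No results found.")

def react_loop(question: str, llm: Callable[[str], str], max_steps: int = 5) -> str:
    """Alternate Thought -> Action -> Observation until the model emits Finish[...]."""
    trajectory = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(trajectory)          # model emits a Thought plus an Action
        trajectory += step + "\n"
        if "Finish[" in step:           # terminal action: return the answer
            return step.split("Finish[", 1)[1].rstrip("]")
        if "Search[" in step:           # external action: gather information
            entity = step.split("Search[", 1)[1].split("]", 1)[0]
            trajectory += f"Observation: {search(entity)}\n"
    return "no answer"

# Scripted stub standing in for the LLM: first search, then answer from the observation.
def llm_stub(trajectory: str) -> str:
    if "Observation:" not in trajectory:
        return "Thought: I should look up Mount Everest.\nAction: Search[Mount Everest]"
    return "Thought: The observation gives the height.\nAction: Finish[8,849 m]"

print(react_loop("How tall is Mount Everest?", llm_stub))  # -> 8,849 m
```

The key design point the abstract emphasizes is that the thought steps and the action steps share one growing trajectory: reasoning conditions on observations from the environment, and actions condition on the reasoning, rather than either being generated in isolation.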
Author Information
Shunyu Yao (Princeton University)
Jeffrey Zhao (Google Brain)
Dian Yu (Google)
Izhak Shafran (Google Inc)
Karthik Narasimhan (Princeton University)
Yuan Cao (Google Brain)
More from the Same Authors
- 2021 : Efficient and Private Federated Learning with Partially Trainable Networks »
  Hakim Sidahmed · Zheng Xu · Yuan Cao
- 2022 : Towards an Enhanced, Faithful, and Adaptable Web Interaction Environment »
  John Yang · Howard Chen · Karthik Narasimhan
- 2022 : WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents »
  Shunyu Yao
- 2022 : Karthik Narasimhan: Semantic Supervision for few-shot generalization and personalization »
  Karthik Narasimhan
- 2022 Poster: Using natural language and program abstractions to instill human inductive biases in machines »
  Sreejan Kumar · Carlos G. Correa · Ishita Dasgupta · Raja Marjieh · Michael Y Hu · Robert Hawkins · Jonathan D Cohen · Nathaniel Daw · Karthik Narasimhan · Tom Griffiths
- 2022 Poster: WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents »
  Shunyu Yao · Howard Chen · John Yang · Karthik Narasimhan
- 2022 Poster: Learning Physics Constrained Dynamics Using Autoencoders »
  Tsung-Yen Yang · Justinian Rosca · Karthik Narasimhan · Peter J. Ramadge
- 2022 Poster: DataMUX: Data Multiplexing for Neural Networks »
  Vishvak Murahari · Carlos Jimenez · Runzhe Yang · Karthik Narasimhan
- 2021 : Contributed Talk 5: Efficient and Private Federated Learning with Partially Trainable Networks »
  Hakim Sidahmed · Zheng Xu · Yuan Cao
- 2021 Poster: Understanding How Encoder-Decoder Architectures Attend »
  Kyle Aitken · Vinay Ramasesh · Yuan Cao · Niru Maheswaranathan
- 2020 Poster: Your GAN is Secretly an Energy-based Model and You Should Use Discriminator Driven Latent Sampling »
  Tong Che · Ruixiang Zhang · Jascha Sohl-Dickstein · Hugo Larochelle · Liam Paull · Yuan Cao · Yoshua Bengio
- 2019 Poster: Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations »
  Kevin Smith · Lingjie Mei · Shunyu Yao · Jiajun Wu · Elizabeth Spelke · Josh Tenenbaum · Tomer Ullman
- 2018 Poster: 3D-Aware Scene Manipulation via Inverse Graphics »
  Shunyu Yao · Tzu Ming Hsu · Jun-Yan Zhu · Jiajun Wu · Antonio Torralba · Bill Freeman · Josh Tenenbaum
- 2016 Poster: Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation »
  Tejas Kulkarni · Karthik Narasimhan · Ardavan Saeedi · Josh Tenenbaum