Workshop: Foundation Models for Decision Making

Chain of Code: Reasoning with a Language Model-Augmented Code Interpreter

Chengshu Li · Jacky Liang · Fei Xia · Andy Zeng · Sergey Levine · Dorsa Sadigh · Karol Hausman · Xinyun Chen · Fei-Fei Li · brian ichter

presentation: Foundation Models for Decision Making
Fri 15 Dec 6:15 a.m. PST — 3:30 p.m. PST


Given that language models (LMs) perform well on both reasoning tasks and coding tasks, we hypothesize that code-writing also allows LMs to solve non-coding tasks by applying similar logical and algorithmic reasoning. However, not all such reasoning can be grounded in executable code. For example, while LMs can directly write Python-executable code to solve math problems, for problems like inferring how many countries have land-locked capital cities, the LM may write “code” with functions such as “is_city_landlocked()”, which cannot be directly executed. Our intuition is that an LM can not only generate code to solve many problems, but when necessary it can “think in code” (or even “pseudocode”) to arrive at more effective problem-solving strategies – generating executable code at times and simulating the execution of any ill-defined statements when useful. In this work, we propose Chain of Code (CoC), a method that improves LMs’ reasoning capabilities by first prompting LMs to write code to solve a task, and then, during code execution, using LMs to simulate the execution of code lines that could not be executed. We demonstrate that Chain of Code outperforms Chain of Thought and other baselines across a variety of benchmarks; on BIG-Bench Hard, Chain of Code achieves 84%, a significant gain of 12% over Chain of Thought.
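The execute-or-simulate loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: each line of LM-written code is first handed to the Python interpreter, and any line that raises an exception (e.g. calling an undefined function like `is_city_landlocked`) is passed to an LM emulator instead. The `simulate_with_lm` function here is a hypothetical stand-in for a real LM call, stubbed with a canned answer for the land-locked-capitals example.

```python
def simulate_with_lm(line: str, state: dict) -> dict:
    """Hypothetical LM call: given a line that failed to execute and the
    current program state, return the variable updates the LM predicts.
    Stubbed here; a real system would prompt an LM with `line` and `state`."""
    if "is_city_landlocked" in line:
        # Pretend the LM reasons that La Paz is a landlocked capital.
        var = line.split("=")[0].strip()
        return {var: True}
    raise NotImplementedError(f"LM simulation needed for: {line}")


def chain_of_code(program: str) -> dict:
    """Run `program` line by line over a shared state dict: executable
    lines run in the Python interpreter; lines that raise are simulated."""
    state: dict = {}
    for line in program.splitlines():
        if not line.strip():
            continue
        try:
            exec(line, {}, state)  # try the real interpreter first
        except Exception:
            state.update(simulate_with_lm(line, state))  # LM fallback
    return state


# LM-written "code" mixing executable Python with an undefined helper.
program = """
count = 0
landlocked = is_city_landlocked("La Paz")
count = count + int(landlocked)
"""
final_state = chain_of_code(program)
# final_state: {'count': 1, 'landlocked': True}
```

The key design point this sketch captures is that the interpreter and the LM share one program state, so values produced by simulated lines (like `landlocked`) flow back into subsequently executed real code.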