Timezone: »

GALOIS: Boosting Deep Reinforcement Learning via Generalizable Logic Synthesis
Yushi Cao · Zhiming Li · Tianpei Yang · Hao Zhang · YAN ZHENG · Yi Li · Jianye Hao · Yang Liu

Wed Nov 30 02:00 PM -- 04:00 PM (PST) @ Hall J #902

Despite achieving superior performance in human-level control problems, unlike humans, deep reinforcement learning (DRL) lacks high-order intelligence (e.g., logic deduction and reuse), thus it behaves ineffectively than humans regarding learning and generalization in complex problems. Previous works attempt to directly synthesize a white-box logic program as the DRL policy, manifesting logic-driven behaviors. However, most synthesis methods are built on imperative or declarative programming, and each has a distinct limitation, respectively. The former ignores the cause-effect logic during synthesis, resulting in low generalizability across tasks. The latter is strictly proof-based, thus failing to synthesize programs with complex hierarchical logic. In this paper, we combine the above two paradigms together and propose a novel Generalizable Logic Synthesis (GALOIS) framework to synthesize hierarchical and strict cause-effect logic programs. GALOIS leverages the program sketch and defines a new sketch-based hybrid program language for guiding the synthesis. Based on that, GALOIS proposes a sketch-based program synthesis method to automatically generate white-box programs with generalizable and interpretable cause-effect logic. Extensive evaluations on various decision-making tasks with complex logic demonstrate the superiority of GALOIS over mainstream baselines regarding the asymptotic performance, generalizability, and great knowledge reusability across different environments.

Author Information

Yushi Cao (Nanyang Technological University)
Zhiming Li (Nanyang Technological University)
Tianpei Yang (University of Alberta)
Hao Zhang (Tianjin University)
YAN ZHENG (Tianjin University)
Yi Li (School of Computer Science and Engineering, Nanyang Technological University)
Jianye Hao (Tianjin University)
Yang Liu (Nanyang Technology University, Singapore)

More from the Same Authors