Strategy and structure in Codenames: Comparing human and GPT-4 gameplay
Abstract
Codenames is a rich interactive game that requires players to combine multifaceted conceptual understanding with the capacity to reason strategically about what inferences other players are likely to draw. While prior work has evaluated the extent to which LLMs can succeed at the game and proposed methods for improving their performance, our approach is instead to investigate and compare the strategies and conceptual structure that humans and a current LLM (GPT-4) employ when playing. We find evidence for a shared overarching cognitive strategy of relying on pairwise relatedness judgments to solve Codenames, and we show that differences between humans and GPT-4 stem from differences in underlying conceptual representations. However, we also find that human spymasters' strategies reveal a sensitivity to the specific structure of the game that GPT-4 spymasters lack.