Poster
Safe Opponent-Exploitation Subgame Refinement
Mingyang Liu · Chengjie Wu · Qihan Liu · Yansen Jing · Jun Yang · Pingzhong Tang · Chongjie Zhang

Thu Dec 01 09:00 AM -- 11:00 AM (PST) @ Hall J #939

In zero-sum games, a Nash equilibrium (NE) strategy tends to be overly conservative when confronted with opponents of limited rationality, because it does not actively exploit their weaknesses. On the other hand, best responding to an estimated opponent model is vulnerable to estimation errors and lacks safety guarantees. Inspired by the recent success of real-time search algorithms in developing superhuman AI, we investigate the trade-off between safety and opponent exploitation and present a novel real-time search framework, called Safe Exploitation Search (SES), which continuously interpolates between the two extremes of online strategy refinement. We prove an upper bound on the exploitability of SES and a lower bound on its evaluation performance. Additionally, SES enables computationally efficient online adaptation to an opponent model that may be updated during play, whereas previous safe exploitation methods have to recompute the strategy for the whole game. Empirical results show that SES significantly outperforms NE baselines and previous algorithms while keeping exploitability low.

Author Information

Mingyang Liu (Tsinghua University)
Chengjie Wu (Tsinghua University)
Qihan Liu (Tsinghua University)
Yansen Jing (Tsinghua University)
Jun Yang (Tsinghua University)
Pingzhong Tang (Tsinghua University)
Chongjie Zhang (Tsinghua University)
