ChinaTravel: An Open-Ended Benchmark for Language Agents in Chinese Travel Planning
Jie-Jing Shao ⋅ Bowen Zhang ⋅ Xiao-Wen Yang ⋅ Baizhi Chen ⋅ Siyu Han ⋅ Wen-Da Wei ⋅ Guohao Cai ⋅ Zhenhua Dong ⋅ Lan-Zhe Guo ⋅ Yu-Feng Li
Abstract
Recent advances in LLMs have spurred the development of Language Agents for real-world applications such as travel planning, which involves complex multi-constraint challenges. Existing benchmarks, however, often oversimplify reality with synthetic queries and limited constraints. To bridge this gap, we introduce ChinaTravel, the first open-ended benchmark based on authentic travel needs. We develop a domain-specific language (DSL) for compositional evaluation covering feasibility, constraints, and preferences. Experiments show that neuro-symbolic agents achieve a 37.0% constraint satisfaction rate on human queries, a 10× improvement over neural models, demonstrating their potential in complex planning scenarios.