Learning Beam Search Policies via Imitation Learning
Renato Negrinho · Matthew Gormley · Geoffrey Gordon

Thu Dec 06 02:00 PM -- 04:00 PM (PST) @ Room 517 AB #104

Beam search is widely used for approximate decoding in structured prediction problems. Models often use a beam at test time but ignore its existence at train time, and therefore do not explicitly learn how to use the beam. We develop a unifying meta-algorithm for learning beam search policies using imitation learning. In our setting, the beam is part of the model and not just an artifact of approximate decoding. Our meta-algorithm captures existing learning algorithms and suggests new ones. It also lets us show novel no-regret guarantees for learning beam search policies.
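For readers unfamiliar with the decoding procedure the abstract refers to, the following is a minimal sketch of generic test-time beam search. It is not the paper's meta-algorithm or its training procedure. The names `beam_search` and `toy_score`, and the toy probabilities, are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A hypothesis is (cumulative log-score, tokens chosen so far).
Hypothesis = Tuple[float, Tuple[str, ...]]


def beam_search(
    score_fn: Callable[[Tuple[str, ...], str], float],
    vocab: Sequence[str],
    length: int,
    beam_size: int,
) -> List[Hypothesis]:
    """Generic beam search: keep the `beam_size` best prefixes at each step."""
    beam: List[Hypothesis] = [(0.0, ())]
    for _ in range(length):
        # Expand every prefix on the beam with every token in the vocabulary.
        candidates = [
            (score + score_fn(prefix, tok), prefix + (tok,))
            for score, prefix in beam
            for tok in vocab
        ]
        # Prune: keep only the highest-scoring candidates.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_size]
    return beam


def toy_score(prefix: Tuple[str, ...], tok: str) -> float:
    """Hypothetical log-probabilities, made up for this illustration."""
    if not prefix:
        probs = {"a": 0.6, "b": 0.4}
    elif prefix[-1] == "a":
        probs = {"a": 0.55, "b": 0.45}
    else:
        probs = {"a": 0.9, "b": 0.1}
    return math.log(probs[tok])


greedy = beam_search(toy_score, ["a", "b"], length=2, beam_size=1)
wider = beam_search(toy_score, ["a", "b"], length=2, beam_size=2)
print(greedy[0][1], wider[0][1])
```

In this toy setup, a beam of size 1 commits to "a" at the first step and cannot recover the higher-scoring sequence that starts with "b", while a beam of size 2 finds it. The scoring function here is fixed and unaware of the beam, which is the train/test mismatch the abstract describes.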

Author Information

Renato Negrinho (Carnegie Mellon University)
Matt Gormley (Carnegie Mellon University)
Geoffrey Gordon (MSR Montréal & CMU)

Dr. Gordon is an Associate Research Professor in the Department of Machine Learning at Carnegie Mellon University, and co-director of the Department's Ph.D. program. He works on multi-robot systems, statistical machine learning, game theory, and planning in probabilistic, adversarial, and general-sum domains. His previous appointments include Visiting Professor at the Stanford Computer Science Department and Principal Scientist at Burning Glass Technologies in San Diego. Dr. Gordon received his B.A. in Computer Science from Cornell University in 1991, and his Ph.D. in Computer Science from Carnegie Mellon University in 1999.

More from the Same Authors