Efficient Transformers: State of the art in pruning, sparse attention, and transformer funneling
Abstract
Transformer architectures consume the lion's share of the computational budgets behind today's most powerful language and vision models, making research into their efficiency both active and essential. Our proposed tutorial surveys the state of the art in three complementary research threads that together form a significant part of the current industrial toolkit for efficient Transformers: (1) pruning, the structured or unstructured removal of weights, layers, and attention heads; (2) sparse attention and routing, including block-sparse, sliding-window, and locality-sensitive-hashing attention; and (3) funneling, which pools intermediate representations to shorten sequences as depth increases. The tutorial will then feature an expert panel of industrial and academic speakers from Google DeepMind, MIT, UC Berkeley, and Columbia, discussing the latest trends in top industrial labs. Attendees will leave with actionable recipes for building sub-10B-parameter models that match or exceed dense baselines on language, vision, and multimodal benchmarks.
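To make the three threads concrete, the minimal PyTorch sketch below is our own illustration (the helper names `magnitude_prune`, `sliding_window_mask`, and `funnel_pool` are hypothetical, not taken from any specific library): unstructured magnitude pruning, a sliding-window attention mask, and a funnel-style pooling step. The tutorial itself covers production-grade variants of each.

```python
import torch
import torch.nn.functional as F

def magnitude_prune(weight: torch.Tensor, sparsity: float = 0.5) -> torch.Tensor:
    """Unstructured pruning: zero out the smallest-magnitude fraction of weights."""
    k = int(weight.numel() * sparsity)
    if k == 0:
        return weight.clone()
    threshold = weight.abs().flatten().kthvalue(k).values
    return torch.where(weight.abs() > threshold, weight, torch.zeros_like(weight))

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """Sparse attention: boolean mask letting position i attend only within +/- `window`."""
    idx = torch.arange(seq_len)
    return (idx[None, :] - idx[:, None]).abs() <= window

def funnel_pool(hidden: torch.Tensor, stride: int = 2) -> torch.Tensor:
    """Funneling: shorten the sequence by mean-pooling `stride` adjacent token states."""
    batch, seq_len, d_model = hidden.shape
    pad = (-seq_len) % stride  # pad so seq_len divides evenly by stride
    if pad:
        hidden = F.pad(hidden, (0, 0, 0, pad))
    return hidden.reshape(batch, -1, stride, d_model).mean(dim=2)

# Toy usage: prune a weight matrix, build an 8-token window mask, halve a sequence.
w_sparse = magnitude_prune(torch.randn(64, 64), sparsity=0.9)
mask = sliding_window_mask(seq_len=8, window=2)        # shape (8, 8), bool
pooled = funnel_pool(torch.randn(1, 8, 16), stride=2)  # shape (1, 4, 16)
```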
The tutorial targets researchers and practitioners who build or deploy Transformer models and assumes familiarity with basic deep-learning concepts but not with any specific efficiency method. All slides and publication materials will be released under a permissive license.
Schedule
| Time | Session |
| --- | --- |
| 11:00 AM | |
| 11:30 AM | |