Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference
Tao Lei · Junwen Bai · Siddhartha Brahma · Joshua Ainslie · Kenton Lee · Yanqi Zhou · Nan Du · Vincent Zhao · Yuexin Wu · Bo Li · Yu Zhang · Ming-Wei Chang
Great Hall & Hall B1+B2 (level 1) #319
We propose Conditional Adapter (CoDA), a parameter-efficient transfer learning method that also improves inference efficiency. CoDA generalizes beyond standard adapter approaches to enable a new way of balancing speed and accuracy using conditional computation. Starting with an existing dense pretrained model, CoDA adds sparse activation together with a small number of new parameters and a lightweight training phase. Our experiments demonstrate that the CoDA approach provides an unexpectedly efficient way to transfer knowledge. Across a variety of language, vision, and speech tasks, CoDA achieves a 2x to 8x inference speed-up compared to state-of-the-art adapter approaches, with moderate to no accuracy loss and the same parameter efficiency.
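To make the idea of conditional computation on top of a dense model concrete, here is a minimal sketch in plain Python. It is not the authors' implementation, and the abstract does not specify the routing mechanism. The sketch assumes a hard top-k token router with a linear scoring vector. The names `conditional_layer`, `router_weights`, `heavy_fn`, and `adapter_fn` are hypothetical. Under these assumptions, every token passes through a cheap adapter branch, while only the `k` highest-scoring tokens also pass through the expensive pretrained branch.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def conditional_layer(
    tokens: Sequence[Vector],
    router_weights: Vector,
    heavy_fn: Callable[[Vector], Vector],
    adapter_fn: Callable[[Vector], Vector],
    k: int,
) -> List[Vector]:
    """Apply a cheap adapter to all tokens and a heavy branch to only k tokens."""
    # Hypothetical linear router: one scalar score per token.
    scores = [sum(w * x for w, x in zip(router_weights, tok)) for tok in tokens]
    # Hard top-k selection (an assumption; the abstract does not describe routing).
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    selected = set(ranked[:k])

    outputs = []
    for i, tok in enumerate(tokens):
        # Residual connection plus the small adapter update, for every token.
        out = [x + a for x, a in zip(tok, adapter_fn(tok))]
        if i in selected:
            # Only the selected tokens pay for the expensive frozen branch.
            out = [o + h for o, h in zip(out, heavy_fn(tok))]
        outputs.append(out)
    return outputs


# Toy usage: stand-ins for the frozen pretrained branch and the adapter.
heavy_calls = []


def heavy_fn(tok: Vector) -> Vector:
    heavy_calls.append(tok)  # record how often the expensive branch runs
    return [2.0 * x for x in tok]


def adapter_fn(tok: Vector) -> Vector:
    return [0.1 * x for x in tok]


tokens = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0], [0.0, 0.5]]
out = conditional_layer(tokens, [1.0, 0.0], heavy_fn, adapter_fn, k=2)
```

In this toy run the heavy branch is evaluated on 2 of the 4 tokens, so `k` acts as the knob that trades compute for accuracy. How such a saving translates into the reported speed-ups depends on details the abstract does not give.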