Depth as a Scaling Vector: Simple Pruning and Evaluation of Emergent Abilities in Pruned LLMs
Chang Liu · Arjun Choudhry · Yifu Cai · Nina Żukowska · Mononito Goswami · Artur Dubrawski
Abstract
The evolving lifecycle of large language models (LLMs) calls for effective strategies for scaling them down for deployment without sacrificing core capabilities. In this work, we investigate depth as a primary architectural scaling vector: we introduce simple methods for pruning the layers of LLMs and systematically evaluate how such scaling affects their emergent abilities. Our evaluations demonstrate that these methods offer a practical path to facilitate LLM deployment, significantly reducing computational demands while retaining the emergent abilities that make these models powerful and attractive in a wide range of applications.
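The abstract describes pruning layers of LLMs as a form of depth scaling. As a rough, hedged illustration of what layer pruning can look like in practice, the sketch below removes a block of decoder layers from a LLaMA-style Hugging Face causal LM. The choice of which layers to drop (a contiguous block of middle layers), the model checkpoint, and the helper name `prune_decoder_layers` are assumptions for illustration only, not the method proposed in the paper.

```python
# Minimal sketch, assuming a LLaMA-style model whose decoder blocks live in
# model.model.layers (an nn.ModuleList). Not the authors' pruning method.
import torch
from transformers import AutoModelForCausalLM


def prune_decoder_layers(model, drop_start, drop_end):
    """Remove decoder layers with indices in [drop_start, drop_end)."""
    layers = model.model.layers
    kept = [layer for i, layer in enumerate(layers)
            if not (drop_start <= i < drop_end)]
    model.model.layers = torch.nn.ModuleList(kept)
    # Keep the config consistent with the new depth.
    model.config.num_hidden_layers = len(kept)
    return model


# Hypothetical usage: drop eight middle layers of a 32-layer model.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16
)
model = prune_decoder_layers(model, drop_start=20, drop_end=28)
```

The pruned model can then be evaluated on downstream benchmarks to measure how much of its original capability, including emergent abilities, survives the reduction in depth.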