
The Consequences of Massive Scaling in Machine Learning

Noah Goodman · Melanie Mitchell · Joelle Pineau · Oriol Vinyals · Jared Kaplan

Moderator: Jacob Steinhardt


Machine learning research has always prized algorithmic contributions. Many recent breakthroughs, however, have been driven by scaling up the same basic algorithms and architectures. The most prominent example is OpenAI's massive language model GPT-3, which won a best paper award at NeurIPS 2020. GPT-3 was based on the same Transformer architecture as its predecessors, but scaling it up produced remarkable and unexpected behaviors that reshaped how we think about language models. As more progress is driven by scaling, how should we adapt as a community? Should it affect which problems are considered interesting? Should publication norms take scale into account, or de-emphasize algorithmic contributions? How do we ensure that smaller institutions and academic labs can meaningfully research and audit large-scale systems? From a safety perspective, if behaviors emerge only at scale, how can we ensure that systems behave as intended? In this panel, we will explore these critical questions so that the NeurIPS community at large can continue to make fundamental advances in the era of massive scaling.