The Consequences of Massive Scaling in Machine Learning
Noah Goodman · Melanie Mitchell · Joelle Pineau · Oriol Vinyals · Jared Kaplan

Mon Dec 06 11:00 PM -- 12:00 AM (PST)

Machine learning research has always prized algorithmic contributions. However, many recent breakthroughs have been driven by scaling up the same basic algorithms and architectures. The most recent example is OpenAI’s massive language model GPT-3, which won a best paper award at NeurIPS in 2020. GPT-3 was based on the same Transformer architecture as its predecessors, but when scaled up, it exhibited remarkable, unexpected behaviors that had a massive impact on the way we think about language models. As more progress becomes driven by scaling, how should we adapt as a community? Should it affect what problems are considered interesting? Should publication norms take scale into account, or de-emphasize algorithmic contributions? How do we ensure that smaller institutions or academic labs can meaningfully research and audit large-scale systems? From a safety perspective, if behaviors emerge at scale, how can we ensure that systems behave as intended? In this panel, we will explore these critical questions so that the NeurIPS community at large can continue to make fundamental advances in the era of massive scaling.

Author Information

Noah Goodman (Stanford University)
Melanie Mitchell (Santa Fe Institute)

Melanie Mitchell is the Davis Professor at the Santa Fe Institute. Her current research focuses on conceptual abstraction, analogy-making, and visual recognition in artificial intelligence systems. Melanie is the author or editor of six books and numerous scholarly papers in the fields of artificial intelligence, cognitive science, and complex systems. Her latest book is Artificial Intelligence: A Guide for Thinking Humans (Farrar, Straus, and Giroux).

Joelle Pineau
Oriol Vinyals (DeepMind)

Oriol Vinyals is a Research Scientist at Google. He works in deep learning with the Google Brain team. Oriol holds a Ph.D. in EECS from the University of California, Berkeley, and a Master's degree from the University of California, San Diego. He is a recipient of the 2011 Microsoft Research PhD Fellowship. He was an early adopter of the new deep learning wave at Berkeley, and in his thesis he focused on non-convex optimization and recurrent neural networks. At Google Brain he continues working on his areas of interest, which include artificial intelligence, with particular emphasis on machine learning, language, and vision.

Jared Kaplan (Anthropic)
