Language models have achieved remarkable performance on a wide range of tasks that require natural language understanding. Nevertheless, state-of-the-art models have generally struggled with tasks that require quantitative reasoning, such as solving mathematics, science, and engineering problems at the college level. To help close this gap, we introduce Minerva, a large language model pretrained on general natural language data and further trained on technical content. The model achieves strong performance in a variety of evaluations, including state-of-the-art performance on the MATH dataset. We also evaluate our model on over two hundred undergraduate-level problems in physics, biology, chemistry, economics, and other sciences that require quantitative reasoning, and find that the model can correctly answer nearly a third of them.
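As a rough illustration of what "correctly answering" a problem involves in this kind of evaluation, below is a minimal sketch (not the authors' evaluation code) of grading a model-generated solution against a reference answer, assuming the final answer is wrapped in \boxed{...} as in the MATH dataset. The helper names (extract_boxed, is_correct) and the simple string normalization are illustrative assumptions, not the paper's exact pipeline.

```python
# Minimal sketch (not the authors' code): grade a model-written solution to a
# MATH-style problem by comparing its final \boxed{...} answer with the
# reference answer. Helper names and normalization are illustrative only.
import re
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in `text`, if any."""
    answers = []
    for match in re.finditer(r"\\boxed\{", text):
        depth, start = 1, match.end()
        for i in range(start, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    answers.append(text[start:i])
                    break
    return answers[-1] if answers else None


def is_correct(model_solution: str, reference_answer: str) -> bool:
    """Exact match after stripping whitespace (a simplification; a real
    grader would also treat equivalent forms like 1/2 and 0.5 as equal)."""
    predicted = extract_boxed(model_solution)
    if predicted is None:
        return False
    normalize = lambda s: s.replace(" ", "").strip()
    return normalize(predicted) == normalize(reference_answer)


# Toy usage: a chain-of-thought solution ending in a boxed final answer.
solution = "We have 3 + 4 = 7, so the answer is $\\boxed{7}$."
print(is_correct(solution, "7"))  # True
```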
Author Information
Aitor Lewkowycz (Inflection AI)
Anders Andreassen (Google)
David Dohan (Google Brain)
Ethan Dyer (Blueshift, Google Research)
Henryk Michalewski (Google)
Vinay Ramasesh (Google)
Ambrose Slone (Google)
Currently at Google X as part of a team doing deep learning research. Formerly at Apple, working on computer vision and deep learning.
Cem Anil (University of Toronto)
I'm a first-year PhD student at the University of Toronto and the Vector Institute, supervised by Roger Grosse and Geoffrey Hinton.
Imanol Schlag (IDSIA)
Theo Gutman-Solo
Yuhuai Wu (Google)
Behnam Neyshabur (Google)
Guy Gur-Ari (Google)
Vedant Misra (Google)
More from the Same Authors
- 2021 : Augmenting Classic Algorithms with Neural Components for Strong Generalisation on Ambiguous and High-Dimensional Data »
  Imanol Schlag · Jürgen Schmidhuber
- 2021 : Improving Baselines in the Wild »
  Kazuki Irie · Imanol Schlag · Róbert Csordás · Jürgen Schmidhuber
- 2021 : A Modern Self-Referential Weight Matrix That Learns to Modify Itself »
  Kazuki Irie · Imanol Schlag · Róbert Csordás · Jürgen Schmidhuber
- 2022 : Teaching Algorithmic Reasoning via In-context Learning »
  Hattie Zhou · Azade Nova · Aaron Courville · Hugo Larochelle · Behnam Neyshabur · Hanie Sedghi
- 2022 : Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search »
  Michał Zawalski · Michał Tyrolski · Konrad Czechowski · Damian Stachura · Piotr Piękos · Tomasz Odrzygóźdź · Yuhuai Wu · Łukasz Kuciński · Piotr Miłoś
- 2022 Panel: Panel 2B-4: Extreme Compression for… & Exploring Length Generalization… »
  Cem Anil · Minjia Zhang
- 2022 : MATH-AI: Toward Human-Level Mathematical Reasoning »
  Francois Charton · Noah Goodman · Behnam Neyshabur · Talia Ringer · Daniel Selsam
- 2022 : Panel Discussion »
  Behnam Neyshabur · David Sontag · Pradeep Ravikumar · Erin Hartman
- 2022 : Length Generalization in Quantitative Reasoning »
  Behnam Neyshabur
- 2022 Workshop: MATH-AI: Toward Human-Level Mathematical Reasoning »
  Pan Lu · Swaroop Mishra · Sean Welleck · Yuhuai Wu · Hannaneh Hajishirzi · Percy Liang
- 2022 Poster: Autoformalization with Large Language Models »
  Yuhuai Wu · Albert Q. Jiang · Wenda Li · Markus Rabe · Charles Staats · Mateja Jamnik · Christian Szegedy
- 2022 Poster: Insights into Pre-training via Simpler Synthetic Tasks »
  Yuhuai Wu · Felix Li · Percy Liang
- 2022 Poster: Multi-Game Decision Transformers »
  Kuang-Huei Lee · Ofir Nachum · Mengjiao (Sherry) Yang · Lisa Lee · Daniel Freeman · Sergio Guadarrama · Ian Fischer · Winnie Xu · Eric Jang · Henryk Michalewski · Igor Mordatch
- 2022 Poster: Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers »
  Albert Q. Jiang · Wenda Li · Szymon Tworkowski · Konrad Czechowski · Tomasz Odrzygóźdź · Piotr Miłoś · Yuhuai Wu · Mateja Jamnik
- 2022 Poster: STaR: Bootstrapping Reasoning With Reasoning »
  Eric Zelikman · Yuhuai Wu · Jesse Mu · Noah Goodman
- 2022 Poster: Towards Learning Universal Hyperparameter Optimizers with Transformers »
  Yutian Chen · Xingyou Song · Chansoo Lee · Zi Wang · Richard Zhang · David Dohan · Kazuya Kawakami · Greg Kochanski · Arnaud Doucet · Marc'Aurelio Ranzato · Sagi Perel · Nando de Freitas
- 2022 Poster: Exploring Length Generalization in Large Language Models »
  Cem Anil · Yuhuai Wu · Anders Andreassen · Aitor Lewkowycz · Vedant Misra · Vinay Ramasesh · Ambrose Slone · Guy Gur-Ari · Ethan Dyer · Behnam Neyshabur
- 2022 Poster: Revisiting Neural Scaling Laws in Language and Vision »
  Ibrahim Alabdulmohsin · Behnam Neyshabur · Xiaohua Zhai
- 2022 Poster: Path Independent Equilibrium Models Can Better Exploit Test-Time Computation »
  Cem Anil · Ashwini Pokle · Kaiqu Liang · Johannes Treutlein · Yuhuai Wu · Shaojie Bai · J. Zico Kolter · Roger Grosse
- 2022 Poster: Block-Recurrent Transformers »
  DeLesley Hutchins · Imanol Schlag · Yuhuai Wu · Ethan Dyer · Behnam Neyshabur
- 2021 Poster: Going Beyond Linear Transformers with Recurrent Fast Weight Programmers »
  Kazuki Irie · Imanol Schlag · Róbert Csordás · Jürgen Schmidhuber
- 2021 Poster: Understanding How Encoder-Decoder Architectures Attend »
  Kyle Aitken · Vinay Ramasesh · Yuan Cao · Niru Maheswaranathan
- 2021 Poster: Learning to Elect »
  Cem Anil · Xuchan Bao
- 2020 : Is Transfer Learning Necessary for Protein Landscape Prediction? »
  David Belanger · David Dohan
- 2019 : Poster Session + Lunch »
  Maxwell Nye · Robert Kim · Toby St Clere Smithe · Takeshi D. Itoh · Omar U. Florez · Vesna G. Djokic · Sneha Aenugu · Mariya Toneva · Imanol Schlag · Dan Schwartz · Max Raphael Sobroza Marques · Pravish Sainath · Peng-Hsuan Li · Rishi Bommasani · Najoung Kim · Paul Soulos · Steven Frankland · Nadezhda Chirkova · Dongqi Han · Adam Kortylewski · Rich Pang · Milena Rabovsky · Jonathan Mamou · Vaibhav Kumar · Tales Marra
- 2019 Poster: Preventing Gradient Attenuation in Lipschitz Constrained Convolutional Networks »
  Qiyang Li · Saminul Haque · Cem Anil · James Lucas · Roger Grosse · Joern-Henrik Jacobsen
- 2018 : TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer »
  Sicong (Sheldon) Huang · Cem Anil · Xuchan Bao
- 2018 Poster: Learning to Reason with Third Order Tensor Products »
  Imanol Schlag · Jürgen Schmidhuber