Continual Pre-Training of Large Language Models: How to (re)warm your model?

Kshitij Gupta · Benjamin Thérien · Adam Ibrahim · Mats L Richter · Quentin Anthony · Eugene Belilovsky · Irina Rish · Timothee Lesort

Abstract
