3 Results
Poster
|
Wed 8:45 |
Should Under-parameterized Student Networks Copy or Average Teacher Weights? Berfin Simsek · Amire Bendjeddou · Wulfram Gerstner · Johanni Brea |
|
Workshop
|
Parameter Averaging Laws for Multitask Language Models Woojin Chung · Hyowon Cho · James Thorne · Se-Young Yun |
|
Workshop
|
Beyond Parameter Averaging in Model Aggregation Pol Garcia Recasens · Jordi Torres · Josep Lluís Berral · Søren Hauberg · Pablo Moreno-Muñoz |