86 Results
Poster
|
Tue 8:45 |
Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards Alexandre Rame · Guillaume Couairon · Corentin Dancette · Jean-Baptiste Gaya · Mustafa Shukor · Laure Soulier · Matthieu Cord |
|
Poster
|
Tue 8:45 |
On the Convergence and Sample Complexity Analysis of Deep Q-Networks with ε-Greedy Exploration Shuai Zhang · Hongkang Li · Meng Wang · Miao Liu · Pin-Yu Chen · Songtao Lu · Sijia Liu · Keerthiram Murugesan · Subhajit Chaudhury |