How Does Layer Normalization Improve Deep $Q$-learning?
Braham Snyder · Hadi Daneshmand · Chen-Yu Wei
Abstract
Normalization layers (including layer, batch, and weight normalization) improve the stability, generalization, and optimization speed of deep neural networks. Consistent with prior work, our offline reinforcement learning experiments on four classic control tasks suggest that layer normalization may be among the most effective normalizations for deep $Q$-learning. With both theory and experiments, we then aim to further understand how. We study (i) gradient interference and the relation to tabular $Q$-learning, (ii) isometry, (iii) how normalization accelerates regression, and (iv) implicit learning rate decay.
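To make the setting concrete, here is a minimal sketch (in NumPy, not the authors' implementation) of layer normalization inserted into a small $Q$-network: each hidden pre-activation vector is rescaled to zero mean and unit variance across its features before the nonlinearity. All names, layer sizes, and the two-action output are illustrative assumptions.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each sample's feature vector to zero mean and unit
    # variance (per-sample, across the feature axis).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def q_network(state, W1, W2):
    # A minimal two-layer Q-network with layer normalization applied
    # to the hidden pre-activations, followed by a ReLU.
    hidden = np.maximum(layer_norm(state @ W1), 0.0)
    return hidden @ W2  # one Q-value per action

# Illustrative shapes: a 4-dimensional observation, 32 hidden units,
# and 2 discrete actions (e.g. a CartPole-like control task).
rng = np.random.default_rng(0)
state = rng.normal(size=(1, 4))
W1 = rng.normal(size=(4, 32))
W2 = rng.normal(size=(32, 2))
q_values = q_network(state, W1, W2)  # shape (1, 2)
```

A learnable gain and bias (as in standard layer normalization) are omitted here for brevity; in practice they are applied after the normalization.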