Workshop: Deep Reinforcement Learning

A Modern Self-Referential Weight Matrix That Learns to Modify Itself

Kazuki Irie · Imanol Schlag · Róbert Csordás · Jürgen Schmidhuber


The weight matrix (WM) of a neural network (NN) is its program. The programs of many traditional NNs are learned through gradient descent in some error function, then remain fixed. The WM or program of a self-referential NN, however, can keep rapidly modifying all of itself during runtime. In principle, such NNs can meta-learn to learn, and meta-meta-learn to meta-learn to learn, and so on, in the sense of recursive self-improvement. Here we revisit such NNs, building upon recent successes of fast weight programmers (FWPs) and closely related linear Transformers. We propose a scalable self-referential WM (SRWM) that uses outer products and the delta update rule to modify itself. We evaluate our SRWM in a multi-task reinforcement learning setting with procedurally generated ProcGen game environments. Our experiments demonstrate both practical applicability and competitive performance of the SRWM.
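
To make the phrase "uses outer products and the delta update rule to modify itself" more concrete, here is a minimal NumPy sketch of one self-modification step. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the update equations, so the row layout of W (output, key, query, learning rate), the use of softmax as the feature map, the sigmoid on the learning rate, and the names `srwm_step`, `d_in`, and `d_out` are all hypothetical choices made for this example.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax, used here as an assumed feature map.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def srwm_step(W, x, d_out):
    """One hypothetical self-referential step.

    W has shape (d_out + 2 * d_in + 1, d_in). From a single input x it
    produces an output y, a key k, a query q, and a scalar learning rate,
    then uses them to rewrite W itself.
    """
    d_in = W.shape[1]
    out = W @ softmax(x)

    # Assumed row layout: [output | key | query | learning rate].
    y = out[:d_out]
    k = out[d_out:d_out + d_in]
    q = out[d_out + d_in:d_out + 2 * d_in]
    beta = sigmoid(out[-1])

    phi_k = softmax(k)
    v_old = W @ phi_k        # what W currently returns for the key
    v_new = W @ softmax(q)   # target value, generated by W itself

    # Delta rule: move the value stored under phi_k from v_old toward
    # v_new, applied as a rank-one (outer product) update to all of W.
    W_next = W + beta * np.outer(v_new - v_old, phi_k)
    return y, W_next


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_in, d_out = 4, 3
    W = rng.standard_normal((d_out + 2 * d_in + 1, d_in))
    for _ in range(3):
        y, W = srwm_step(W, rng.standard_normal(d_in), d_out)
    print(y.shape, W.shape)
```

In this sketch every row of W, including the rows that generate the key, query, and learning rate, is changed by the update, which is one way to read "modifying all of itself". In a trained model the initial W would be learned by gradient descent; the random initialization above only serves to exercise the update.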
