
Workshop: 4th Workshop on Self-Supervised Learning: Theory and Practice

Self-Supervised Pretraining for Improved Downstream Decoding of Audio-Evoked fMRI Sequences

Sean Paulsen · Mike Casey


We present a sequential transfer learning framework for transformers on functional Magnetic Resonance Imaging (fMRI) data and demonstrate its significant benefits for decoding instrumental timbre. In the first of two phases, we pretrain our stacked-encoder transformer architecture on Next Thought Prediction, a self-supervised task of predicting whether one sequence of fMRI data follows another. This phase imparts a general understanding of the temporal and spatial dynamics of neural activity, and can be applied to any fMRI dataset. In the second phase, we finetune the pretrained models and, for comparison, train additional randomly initialized models on the supervised task of predicting whether two sequences of fMRI data were obtained while listening to the same musical timbre. The finetuned models achieve significantly higher accuracy on held-out participants than the randomly initialized models, demonstrating the efficacy of our framework for facilitating transfer learning on fMRI data. This work contributes to the growing literature on transformer architectures for sequential transfer learning on fMRI data.
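
As a rough illustration of the Next Thought Prediction task described above, the following Python sketch builds labeled sequence pairs from a single scan. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function name make_ntp_pairs, the scan layout (a list of per-timepoint feature vectors), the alternating positive/negative balance, and the choice to draw negatives from elsewhere in the same scan are all hypothetical, since the abstract does not specify how pairs are sampled.

```python
import random


def make_ntp_pairs(scan, seq_len, n_pairs, seed=0):
    """Build (first, second, label) examples from one scan.

    scan:  list of per-timepoint feature vectors (hypothetical layout).
    label: 1 if `second` immediately follows `first` in the scan,
           0 if `second` starts anywhere else in the same scan
           (an assumed negative-sampling scheme).
    """
    rng = random.Random(seed)
    # Latest start for `first` that still leaves room for a true follower.
    last_start = len(scan) - 2 * seq_len
    if seq_len < 1 or last_start < 0:
        raise ValueError("scan too short for the requested seq_len")

    pairs = []
    for i in range(n_pairs):
        start = rng.randint(0, last_start)
        first = scan[start:start + seq_len]
        true_next = start + seq_len
        if i % 2 == 0:
            # Positive example: the window that directly follows `first`.
            second_start, label = true_next, 1
        else:
            # Negative example: any other window start in the scan.
            candidates = [s for s in range(len(scan) - seq_len + 1)
                          if s != true_next]
            second_start, label = rng.choice(candidates), 0
        second = scan[second_start:second_start + seq_len]
        pairs.append((first, second, label))
    return pairs


if __name__ == "__main__":
    # Toy "scan": 20 timepoints, each a 1-element vector holding its index.
    toy_scan = [[t] for t in range(20)]
    for first, second, label in make_ntp_pairs(toy_scan, seq_len=3, n_pairs=4):
        print(first, second, label)
```

A binary classifier trained on such pairs needs no manual labels, which is why the abstract can describe the pretraining phase as applicable to any fMRI dataset.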
