Hierarchical Temporal Correspondence Between Brain Activity and Different Deep Neural Network Architectures
Abstract
The visual cortex and artificial neural networks both process images hierarchically through multiple computational stages. We investigated the temporal correspondence between human EEG responses recorded during rapid visual perception and activations from two fundamentally different neural network architectures: a convolutional network (VGG-19) and a vision transformer (ViT). Using the public THINGS-EEG dataset, in which images were presented for 50 ms each, we developed a standardized pipeline to assess whether this correspondence reflects general computational principles rather than architecture-specific ones. Our analysis revealed a robust layer-to-latency mapping for both networks: early EEG components correlated with activations from initial network layers, whereas later components aligned with deeper layers. This consistent temporal alignment between biological and artificial vision, holding across architectures as different as CNNs and transformers, suggests that both systems may converge on similar computational strategies for visual recognition, and could reflect an underlying optimal algorithm for visual processing that emerges in hierarchical systems regardless of the specific architecture or computational substrate. We provide an open-source implementation of our standardized pipeline to facilitate future comparative studies.
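
As a rough illustration of the kind of analysis such a pipeline performs, the sketch below correlates VGG-19 layer activations with EEG time windows via representational similarity analysis (RSA). Everything here is a stand-in: random tensors replace the THINGS images and EEG recordings, the layer indices and window size are arbitrary, and RSA is one plausible reading of "correlated," not necessarily the paper's exact method.

```python
# Minimal RSA sketch: does each network layer's representational geometry
# peak at a characteristic EEG latency? All data here are placeholders.
import numpy as np
import torch
from torchvision.models import vgg19
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

N_IMAGES, N_CHANNELS, N_TIMES = 16, 64, 100  # hypothetical dataset sizes

# --- Network activations ---------------------------------------------------
model = vgg19(weights=None).eval()  # weights=None keeps the sketch offline;
                                    # a real analysis would use ImageNet weights
layer_ids = [2, 16, 34]             # an early, middle, and late conv layer (assumed)
acts = {}

def make_hook(name):
    def hook(module, inputs, output):
        # Flatten each image's feature map into a single pattern vector.
        acts[name] = output.detach().flatten(start_dim=1).numpy()
    return hook

for i in layer_ids:
    model.features[i].register_forward_hook(make_hook(f"layer_{i:02d}"))

images = torch.randn(N_IMAGES, 3, 64, 64)  # placeholder for THINGS stimuli
with torch.no_grad():
    model.features(images)  # forward pass fires the hooks

# Representational dissimilarity matrix (RDM) per layer: 1 - Pearson r
# between activation patterns, over all image pairs (condensed form).
layer_rdms = {name: pdist(a, metric="correlation") for name, a in acts.items()}

# --- EEG RDMs per time window ----------------------------------------------
eeg = np.random.randn(N_IMAGES, N_CHANNELS, N_TIMES)  # placeholder recordings
win = 10  # samples per window (assumed)
time_rdms = [
    pdist(eeg[:, :, t:t + win].reshape(N_IMAGES, -1), metric="correlation")
    for t in range(0, N_TIMES - win + 1, win)
]

# --- Layer-to-latency correspondence ----------------------------------------
# Spearman-correlate each layer RDM with each time-window RDM; if the
# hierarchies align, the peak latency should increase with layer depth.
for name in sorted(layer_rdms):
    rhos = [spearmanr(layer_rdms[name], trdm)[0] for trdm in time_rdms]
    print(f"{name}: peak EEG correlation in window {int(np.argmax(rhos))}")
```

With real data, each layer's correlation time course would be compared against the visual-evoked response; the abstract's finding corresponds to the peak windows shifting later as the probed layer gets deeper.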