A Cognitive Architecture for Probing Hierarchical Processing and Predictive Coding in Deep Vision Models
Abstract
Despite the success of modern deep neural networks, understanding their internal cognitive processes remains a critical challenge, one situated between high-level behavioral evaluation and low-level mechanistic interpretability. Cognitive science, which seeks to explain cognition in biological systems, offers a rich theoretical foundation for bridging this gap. This paper introduces the Visual Cortex Network (VCNet), a novel neural architecture designed as a computational testbed for prominent cognitive theories of vision. VCNet explicitly operationalizes key neuroscientific principles, including the hierarchical organization of distinct cortical areas, dual-stream segregation of information, and top-down predictive feedback. We evaluate VCNet's emergent behaviors and processing capabilities on two specialized benchmarks chosen to probe its architectural priors: the Spots-10 animal pattern dataset, which tests for evolutionarily relevant feature learning, and the Stanford Light Field dataset, which examines the model's ability to process richer, more naturalistic visual input. Our results show that VCNet achieves state-of-the-art performance among models of comparable size, with classification accuracies of 92.08% on Spots-10 and 74.42% on the light field dataset. This work demonstrates how integrating principles from cognitive neuroscience into network design can foster more robust and efficient visual processing, offering a promising direction for building and interpreting more capable artificial vision systems.
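To make the abstract's three architectural commitments concrete, the following is a minimal PyTorch sketch of a hierarchical dual-stream network with top-down predictive feedback. All module names, layer sizes, and the error-correction feedback rule here are illustrative assumptions for exposition, not the authors' published VCNet implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VCNetSketch(nn.Module):
    """Illustrative sketch (hypothetical, not the paper's VCNet):
    hierarchical stages, dual-stream branching, and predictive feedback."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Hierarchical "cortical areas": successive stages with growing
        # receptive fields, loosely analogous to V1 -> V2.
        self.v1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
        self.v2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        # Dual-stream segregation: a "ventral" (what) and a "dorsal" (where)
        # stream branch from the shared hierarchy.
        self.ventral = nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        self.dorsal = nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        # Top-down predictive feedback: the higher stage predicts the lower
        # stage's activity; the prediction error re-enters the hierarchy.
        self.feedback = nn.ConvTranspose2d(64, 32, 3, stride=2,
                                           padding=1, output_padding=1)
        self.head = nn.Linear(128 * 2, num_classes)

    def forward(self, x: torch.Tensor, steps: int = 2) -> torch.Tensor:
        h1 = self.v1(x)
        for _ in range(steps):
            h2 = self.v2(h1)
            pred = self.feedback(h2)      # top-down prediction of h1
            h1 = h1 + F.relu(h1 - pred)   # prediction error updates the lower stage
        # Global average pooling per stream, then fuse for classification.
        vent = self.ventral(h2).mean(dim=(2, 3))
        dors = self.dorsal(h2).mean(dim=(2, 3))
        return self.head(torch.cat([vent, dors], dim=1))

# Usage: one forward pass on a batch of four 32x32 RGB images.
model = VCNetSketch(num_classes=10)
logits = model(torch.randn(4, 3, 32, 32))
print(logits.shape)  # torch.Size([4, 10])
```

The recurrent loop is the key design choice in this sketch: running the feedforward and feedback passes for a few iterations lets prediction errors refine lower-level representations before the two streams read them out, which is the standard way predictive-coding ideas are operationalized in convolutional models.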