Transformer architectures are increasingly relied on across deep learning domains and deliver impressive results, but the continued growth in model parameters may be unsustainable, and failures in robustness limit their application. Many of the tasks that transformers target across domains are carried out in biology by the mammalian neocortex, yet there is no clear understanding of the relationship between processing in the neocortex and the transformer architecture. While the relationship between convolutional neural networks (CNNs) and the cortex has been studied, transformers involve more complex computations and multi-scale organization, offering a richer foundation for analysis and co-inspiration. We introduce a framework that relates details of cortical connectivity at multiple organizational scales (micro, meso, and macro) to transformer processing, and we investigate how cortical connectivity principles affect performance using the CIFAR-10-C computer vision robustness benchmark. Overall, we demonstrate the efficacy of our framework and find that incorporating components of cortical connectivity at multiple scales can reduce learnable attention parameters by over an order of magnitude while improving robustness on the most challenging examples in computer vision tasks. The cortical transformer framework and the design changes we investigate generalize across domains, may inform the development of more efficient and robust attention-based systems, and further our understanding of the relationship between cortical and transformer processing.
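The abstract does not spell out the architecture, so the following is only a minimal illustrative sketch of the general idea: constraining self-attention with a fixed, biologically inspired connectivity mask while shrinking the learnable attention parameters by sharing projections. The class name, the banded mask, and the specific parameter-saving choices are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ConnectivityMaskedAttention(nn.Module):
    """Hypothetical sketch: self-attention whose pattern is restricted by a
    fixed binary connectivity mask (loosely analogous to cortical wiring
    restricting which units may connect). Learnable attention parameters are
    reduced by using a single shared query/key projection and dropping the
    value/output projections -- an illustrative stand-in for the paper's
    reported order-of-magnitude reduction, not its actual mechanism."""

    def __init__(self, dim, connectivity_mask):
        super().__init__()
        self.scale = dim ** -0.5
        # One shared Q/K projection instead of separate per-head Q, K, V, O
        # projections; this is where the parameter savings come from here.
        self.qk = nn.Linear(dim, 2 * dim, bias=False)
        # Fixed (non-learnable) boolean mask: True where attention is allowed.
        self.register_buffer("mask", connectivity_mask)

    def forward(self, x):
        # x: (batch, tokens, dim)
        q, k = self.qk(x).chunk(2, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.scale
        attn = attn.masked_fill(~self.mask, float("-inf"))
        attn = attn.softmax(dim=-1)
        return attn @ x  # values are the inputs themselves (no V projection)

# Usage: a banded mask as a toy stand-in for local connectivity.
tokens = 16
idx = torch.arange(tokens)
band = (idx[:, None] - idx[None, :]).abs() <= 2  # each token sees +/-2 neighbors
layer = ConnectivityMaskedAttention(dim=32, connectivity_mask=band)
out = layer(torch.randn(1, tokens, 32))
print(out.shape)  # torch.Size([1, 16, 32])
```

Because the mask is registered as a buffer rather than a parameter, the attention pattern itself contributes no trainable weights; only the shared query/key projection is learned.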
Author Information
Brian Robinson (Johns Hopkins University Applied Physics Laboratory)
Nathan Drenkow (Johns Hopkins University Applied Physics Laboratory)
More from the Same Authors

- 2022: Informing generative replay for continual learning with long-term memory formation in the fruit fly
  Brian Robinson · Justin Joyce · Raphael Norman-Tenazas · Gautam Vallabha · Erik Johnson
- 2022: Fifteen-minute Competition Overview Video
  Nathan Drenkow · Raman Arora · Gino Perrotta · Todd Neller · Ryan Gardner · Mykel J Kochenderfer · Jared Markowitz · Corey Lowman · Casey Richardson · Bo Li · Bart Paulhamus · Ashley J Llorens · Andrew Newman
- 2022: Context-Adaptive Deep Neural Networks via Bridge-Mode Connectivity
  Nathan Drenkow · Alvin Tan · Clayton Ashcraft · Kiran Karra
- 2022 Competition: Reconnaissance Blind Chess: An Unsolved Challenge for Multi-Agent Decision Making Under Uncertainty
  Ryan Gardner · Gino Perrotta · Corey Lowman · Casey Richardson · Andrew Newman · Jared Markowitz · Nathan Drenkow · Bart Paulhamus · Ashley J Llorens · Todd Neller · Raman Arora · Bo Li · Mykel J Kochenderfer
- 2021: Reconnaissance Blind Chess + Q&A
  Ryan Gardner · Gino Perrotta · Corey Lowman · Casey Richardson · Andrew Newman · Jared Markowitz · Nathan Drenkow · Bart Paulhamus · Ashley J Llorens · Todd Neller · Raman Arora · Bo Li · Mykel J Kochenderfer
- 2017: bossDB: A Petascale Database for Large-Scale Neuroscience Informing Machine Learning
  Nathan Drenkow