Learning representations on large graphs is a long-standing challenge because of the inter-dependence among massive numbers of data points. Transformers, an emerging class of foundation encoders for graph-structured data, have shown promising performance on small graphs thanks to their global attention, which captures all-pair influence beyond neighboring nodes. Even so, existing approaches tend to inherit the spirit of Transformers in language and vision tasks and embrace complicated models that stack deep multi-head attention layers. In this paper, we critically demonstrate that even a single attention layer can deliver surprisingly competitive performance across node property prediction benchmarks whose node counts range from the thousand level to the billion level. This encourages us to rethink the design philosophy for Transformers on large graphs, where global attention is a computational overhead that hinders scalability. We frame the proposed scheme as Simplified Graph Transformers (SGFormer), which is empowered by a simple attention model that can efficiently propagate information among arbitrary nodes in one layer. SGFormer requires no positional encodings, feature/graph pre-processing, or augmented loss. Empirically, SGFormer successfully scales to the web-scale graph ogbn-papers100M and yields up to 141x inference acceleration over SOTA Transformers on medium-sized graphs. Beyond the current results, we believe the proposed methodology alone opens a new technical path of independent interest for building Transformers on large graphs.
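The abstract does not spell out the attention function, so the following is only a minimal sketch of how a single layer can propagate information between all node pairs without materializing an N x N attention matrix. It assumes a softmax-free attention of the form D^{-1}(V + (1/N) Q K^T V) with Frobenius-normalized Q and K, and reorders the product as Q (K^T V) so the cost grows linearly with the number of nodes. The function names (`linear_global_attention`, `quadratic_reference`) and the toy inputs are hypothetical, and the code should not be read as the authors' implementation.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def frobenius_normalize(m):
    """Scale a matrix so that its Frobenius norm is 1."""
    norm = math.sqrt(sum(x * x for row in m for x in row)) or 1.0
    return [[x / norm for x in row] for row in m]


def linear_global_attention(q, k, v):
    """All-pair attention in one layer, O(N d^2) instead of O(N^2 d).

    Assumed form (not taken from the abstract):
        out = D^{-1} (V + (1/N) Q (K^T V)),  D = diag(1 + (1/N) Q (K^T 1))
    """
    n = len(q)
    q, k = frobenius_normalize(q), frobenius_normalize(k)
    kt = [list(col) for col in zip(*k)]      # d x N
    kv = matmul(kt, v)                       # d x d_v summary of every node
    k_sum = [sum(col) for col in kt]         # K^T 1, one entry per feature
    out = []
    for q_i, v_i in zip(q, v):
        den = 1.0 + sum(a * b for a, b in zip(q_i, k_sum)) / n
        num = [
            v_ic + sum(q_ij * kv[j][c] for j, q_ij in enumerate(q_i)) / n
            for c, v_ic in enumerate(v_i)
        ]
        out.append([x / den for x in num])
    return out


def quadratic_reference(q, k, v):
    """Same quantity computed naively with explicit N x N scores."""
    n = len(q)
    q, k = frobenius_normalize(q), frobenius_normalize(k)
    out = []
    for i in range(n):
        scores = [sum(a * b for a, b in zip(q[i], k[j])) / n for j in range(n)]
        den = 1.0 + sum(scores)
        num = [
            v[i][c] + sum(scores[j] * v[j][c] for j in range(n))
            for c in range(len(v[0]))
        ]
        out.append([x / den for x in num])
    return out


# Toy usage: 6 nodes, 4-dimensional queries/keys, 3-dimensional values.
random.seed(0)
q = [[random.gauss(0, 1) for _ in range(4)] for _ in range(6)]
k = [[random.gauss(0, 1) for _ in range(4)] for _ in range(6)]
v = [[random.gauss(0, 1) for _ in range(3)] for _ in range(6)]
z = linear_global_attention(q, k, v)
```

In this sketch every output row mixes information from all nodes, yet the only matrices ever built are d x N and d x d_v. A practical model would obtain Q, K, and V from learned projections of the node features; how SGFormer combines the attention output with the graph structure is not described in the abstract and is left out here.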
Author Information
Qitian Wu (Shanghai Jiao Tong University)
Wentao Zhao (Shanghai Jiao Tong University)
Chenxiao Yang (Shanghai Jiao Tong University)
Hengrui Zhang (University of Illinois, Chicago)
Fan Nie (Shanghai Jiao Tong University)
Haitian Jiang (New York University)

I am a first-year PhD student at Courant Institute, NYU, advised by Jinyang Li. My current interests are Machine Learning Systems, Machine Learning, and Graph Neural Networks.
Yatao Bian (Tencent AI Lab)
Junchi Yan (Shanghai Jiao Tong University)
More from the Same Authors
- 2022 Poster: Improving Generative Adversarial Networks via Adversarial Learning in Latent Space
  Yang Li · Yichuan Mo · Liangliang Shi · Junchi Yan
- 2022 Poster: Learning Causally Invariant Representations for Out-of-Distribution Generalization on Graphs
  Yongqiang Chen · Yonggang Zhang · Yatao Bian · Han Yang · MA Kaili · Binghui Xie · Tongliang Liu · Bo Han · James Cheng
- 2022 Poster: ZARTS: On Zero-order Optimization for Neural Architecture Search
  Xiaoxing Wang · Wenxuan Guo · Jianlin Su · Xiaokang Yang · Junchi Yan
- 2022 Poster: Learning Substructure Invariance for Out-of-Distribution Molecular Representations
  Nianzu Yang · Kaipeng Zeng · Qitian Wu · Xiaosong Jia · Junchi Yan
- 2022: Diversity Boosted Learning for Domain Generalization with A Large Number of Domains
  XI LENG · Yatao Bian · Xiaoying Tang
- 2023 Poster: Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity
  Zhanpeng Zhou · Yongyi Yang · Xiaojiang Yang · Junchi Yan · Wei Hu
- 2023 Poster: H2RBox-v2: Incorporating Symmetry for Boosting Horizontal Box Supervised Oriented Object Detection
  Yi Yu · Xue Yang · Qingyun Li · Yue Zhou · Feipeng Da · Junchi Yan
- 2023 Poster: OpenLane-V2: A Topology Reasoning Benchmark for Unified 3D HD Mapping
  Huijie Wang · Tianyu Li · Yang Li · Li Chen · Chonghao Sima · Zhenbo Liu · Bangjun Wang · Peijin Jia · Yuting Wang · Shengyin Jiang · Feng Wen · Hang Xu · Ping Luo · Junchi Yan · Wei Zhang · Hongyang Li
- 2023 Poster: Learning Invariant Molecular Representation in Latent Discrete Space
  Xiang Zhuang · Qiang Zhang · Keyan Ding · Yatao Bian · Xiao Wang · Jingsong Lv · Hongyang Chen · Huajun Chen
- 2023 Poster: Relative Entropic Optimal Transport: a (Prior-aware) Matching Perspective to (Unbalanced) Classification
  Liangliang Shi · Haoyu Zhen · Gu Zhang · Junchi Yan
- 2023 Poster: Unleashing the Power of Graph Data Augmentation on Covariate Distribution Shift
  Yongduo Sui · Qitian Wu · Jiancan Wu · Qing Cui · Longfei Li · Jun Zhou · Xiang Wang · Xiangnan He
- 2023 Poster: Understanding and Improving Feature Learning for Out-of-Distribution Generalization
  Yongqiang Chen · Wei Huang · Kaiwen Zhou · Yatao Bian · Bo Han · James Cheng
- 2023 Poster: From Distribution Learning in Training to Gradient Search in Testing for Combinatorial Optimization
  Yang Li · Jinpei Guo · Runzhong Wang · Junchi Yan
- 2023 Poster: Does Invariant Graph Learning via Environment Augmentation Learn Invariance?
  Yongqiang Chen · Yatao Bian · Kaiwen Zhou · Binghui Xie · Bo Han · James Cheng
- 2023 Poster: Fairness-guided Few-shot Prompting for Large Language Models
  Huan Ma · Changqing Zhang · Yatao Bian · Lemao Liu · Zhirui Zhang · Peilin Zhao · Shu Zhang · Huazhu Fu · Qinghua Hu · Bingzhe Wu
- 2023 Poster: HubRouter: Learning Global Routing via Hub Generation and Pin-hub Connection
  Xingbo Du · Chonghua Wang · Ruizhe Zhong · Junchi Yan
- 2022 Spotlight: Lightning Talks 5B-3
  Yanze Wu · Jie Xiao · Nianzu Yang · Jieyi Bi · Jian Yao · Yiting Chen · Qizhou Wang · Yangru Huang · Yongqiang Chen · Peixi Peng · Yuxin Hong · Xintao Wang · Feng Liu · Yining Ma · Qibing Ren · Xueyang Fu · Yonggang Zhang · Kaipeng Zeng · Jiahai Wang · GEN LI · Yonggang Zhang · Qitian Wu · Yifan Zhao · Chiyu Wang · Junchi Yan · Feng Wu · Yatao Bian · Xiaosong Jia · Ying Shan · Zhiguang Cao · Zheng-Jun Zha · Guangyao Chen · Tianjun Xiao · Han Yang · Jing Zhang · Jinbiao Chen · MA Kaili · Yonghong Tian · Junchi Yan · Chen Gong · Tong He · Binghui Xie · Yuan Sun · Francesco Locatello · Tongliang Liu · Yeow Meng Chee · David P Wipf · Tongliang Liu · Bo Han · Bo Han · Yanwei Fu · James Cheng · Zheng Zhang
- 2022 Spotlight: Lightning Talks 5A-2
  Qiang LI · Zhiwei Xu · Jia-Qi Yang · Thai Hung Le · Haoxuan Qu · Yang Li · Artyom Sorokin · Peirong Zhang · Mira Finkelstein · Nitsan levy · Chung-Yiu Yau · dapeng li · Thommen Karimpanal George · De-Chuan Zhan · Nazar Buzun · Jiajia Jiang · Li Xu · Yichuan Mo · Yujun Cai · Yuliang Liu · Leonid Pugachev · Bin Zhang · Lucy Liu · Hoi-To Wai · Liangliang Shi · Majid Abdolshah · Yoav Kolumbus · Lin Geng Foo · Junchi Yan · Mikhail Burtsev · Lianwen Jin · Yuan Zhan · Dung Nguyen · David Parkes · Yunpeng Baiia · Jun Liu · Kien Do · Guoliang Fan · Jeffrey S Rosenschein · Sunil Gupta · Sarah Keren · Svetha Venkatesh
- 2022 Spotlight: Improving Generative Adversarial Networks via Adversarial Learning in Latent Space
  Yang Li · Yichuan Mo · Liangliang Shi · Junchi Yan
- 2022 Spotlight: Learning Causally Invariant Representations for Out-of-Distribution Generalization on Graphs
  Yongqiang Chen · Yonggang Zhang · Yatao Bian · Han Yang · MA Kaili · Binghui Xie · Tongliang Liu · Bo Han · James Cheng
- 2022 Spotlight: Learning Substructure Invariance for Out-of-Distribution Molecular Representations
  Nianzu Yang · Kaipeng Zeng · Qitian Wu · Xiaosong Jia · Junchi Yan
- 2022 Spotlight: Rethinking and Improving Robustness of Convolutional Neural Networks: a Shapley Value-based Approach in Frequency Domain
  Yiting Chen · Qibing Ren · Junchi Yan
- 2022 Spotlight: Lightning Talks 2B-3
  Jie-Jing Shao · Jiangmeng Li · Jiashuo Liu · Zongbo Han · Tianyang Hu · Jiayun Wu · Wenwen Qiang · Jun WANG · Zhipeng Liang · Lan-Zhe Guo · Wenjia Wang · Yanan Zhang · Xiao-wen Yang · Fan Yang · Bo Li · Wenyi Mo · Zhenguo Li · Liu Liu · Peng Cui · Yu-Feng Li · Changwen Zheng · Lanqing Li · Yatao Bian · Bing Su · Hui Xiong · Peilin Zhao · Bingzhe Wu · Changqing Zhang · Jianhua Yao
- 2022 Spotlight: UMIX: Improving Importance Weighting for Subpopulation Shift via Uncertainty-Aware Mixup
  Zongbo Han · Zhipeng Liang · Fan Yang · Liu Liu · Lanqing Li · Yatao Bian · Peilin Zhao · Bingzhe Wu · Changqing Zhang · Jianhua Yao
- 2022 Spotlight: NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification
  Qitian Wu · Wentao Zhao · Zenan Li · David P Wipf · Junchi Yan
- 2022 Panel: Panel 1C-1: Learning Neural Set… & Holomorphic Equilibrium Propagation…
  Axel Laborieux · Yatao Bian
- 2022 Spotlight: Lightning Talks 1B-1
  Qitian Wu · Runlin Lei · Rongqin Chen · Luca Pinchetti · Yangze Zhou · Abhinav Kumar · Hans Hao-Hsun Hsu · Wentao Zhao · Chenhao Tan · Zhen Wang · Shenghui Zhang · Yuesong Shen · Tommaso Salvatori · Gitta Kutyniok · Zenan Li · Amit Sharma · Leong Hou U · Yordan Yordanov · Christian Tomani · Bruno Ribeiro · Yaliang Li · David P Wipf · Daniel Cremers · Bolin Ding · Beren Millidge · Ye Li · Yuhang Song · Junchi Yan · Zhewei Wei · Thomas Lukasiewicz
- 2022 Poster: NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification
  Qitian Wu · Wentao Zhao · Zenan Li · David P Wipf · Junchi Yan
- 2022 Poster: Geometric Knowledge Distillation: Topology Compression for Graph Neural Networks
  Chenxiao Yang · Qitian Wu · Junchi Yan
- 2022 Poster: Rethinking and Improving Robustness of Convolutional Neural Networks: a Shapley Value-based Approach in Frequency Domain
  Yiting Chen · Qibing Ren · Junchi Yan
- 2022 Poster: GraphDE: A Generative Framework for Debiased Learning and Out-of-Distribution Detection on Graphs
  Zenan Li · Qitian Wu · Fan Nie · Junchi Yan
- 2022 Poster: The Policy-gradient Placement and Generative Routing Neural Networks for Chip Design
  Ruoyu Cheng · Xianglong Lyu · Yang Li · Junjie Ye · Jianye Hao · Junchi Yan
- 2022 Poster: Towards Out-of-Distribution Sequential Event Prediction: A Causal Treatment
  Chenxiao Yang · Qitian Wu · Qingsong Wen · Zhiqiang Zhou · Liang Sun · Junchi Yan
- 2022 Poster: Trajectory-guided Control Prediction for End-to-end Autonomous Driving: A Simple yet Strong Baseline
  Penghao Wu · Xiaosong Jia · Li Chen · Junchi Yan · Hongyang Li · Yu Qiao
- 2022 Poster: GraphQNTK: Quantum Neural Tangent Kernel for Graph Data
  Yehui Tang · Junchi Yan
- 2022 Poster: Learning Neural Set Functions Under the Optimal Subset Oracle
  Zijing Ou · Tingyang Xu · Qinliang Su · Yingzhen Li · Peilin Zhao · Yatao Bian
- 2022 Poster: UMIX: Improving Importance Weighting for Subpopulation Shift via Uncertainty-Aware Mixup
  Zongbo Han · Zhipeng Liang · Fan Yang · Liu Liu · Lanqing Li · Yatao Bian · Peilin Zhao · Bingzhe Wu · Changqing Zhang · Jianhua Yao
- 2021 Poster: From Canonical Correlation Analysis to Self-supervised Graph Neural Networks
  Hengrui Zhang · Qitian Wu · Junchi Yan · David Wipf · Philip S Yu
- 2021 Poster: Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach
  Qitian Wu · Chenxiao Yang · Junchi Yan
- 2021 Poster: Not All Low-Pass Filters are Robust in Graph Convolutional Networks
  Heng Chang · Yu Rong · Tingyang Xu · Yatao Bian · Shiji Zhou · Xin Wang · Junzhou Huang · Wenwu Zhu
- 2021 Poster: Bridging Explicit and Implicit Deep Generative Models via Neural Stein Estimators
  Qitian Wu · Rui Gao · Hongyuan Zha
- 2020 Poster: Graduated Assignment for Joint Multi-Graph Matching and Clustering with Application to Unsupervised Graph Matching Network Learning
  Runzhong Wang · Junchi Yan · Xiaokang Yang
- 2020 Poster: The Diversified Ensemble Neural Network
  Shaofeng Zhang · Meng Liu · Junchi Yan
- 2020 Poster: Adversarial Learning for Robust Deep Clustering
  Xu Yang · Cheng Deng · Kun Wei · Junchi Yan · Wei Liu
- 2019 Poster: Learning Latent Process from High-Dimensional Event Sequences via Efficient Sampling
  Qitian Wu · Zixuan Zhang · Xiaofeng Gao · Junchi Yan · Guihai Chen
- 2018 Poster: Generalizing Graph Matching beyond Quadratic Assignment Model
  Tianshu Yu · Junchi Yan · Yilin Wang · Wei Liu · baoxin Li