The neural attention mechanism has been incorporated into deep neural networks to achieve state-of-the-art performance in various domains. Most such models use multi-head self-attention, which is appealing for its ability to attend to information from different perspectives. This paper introduces alignment attention, which explicitly encourages self-attention to match the distributions of the key and query within each head. The resulting alignment attention networks can be optimized as an unsupervised regularization in the existing attention framework. It is simple to convert any model with self-attention, including pre-trained ones, to the proposed alignment attention. On a variety of language understanding tasks, we show the effectiveness of our method in accuracy, uncertainty estimation, generalization across domains, and robustness to adversarial attacks. We further demonstrate the general applicability of our approach on graph attention and visual question answering, showing the potential of incorporating our alignment method into various attention-related tasks.
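As a rough illustration of what "matching the distributions of the key and query within each head" as a regularizer could look like, here is a minimal sketch. The abstract does not say which discrepancy measure the paper uses, so this sketch swaps in a simple moment-matching penalty (per-head mean and variance gaps) purely as a stand-in. The function names `alignment_penalty` and `regularized_loss`, the `weight` hyperparameter, and the tensor layout `(num_heads, seq_len, head_dim)` are all assumptions made for illustration.

```python
import numpy as np

def alignment_penalty(queries, keys):
    """Moment-matching stand-in for a key/query distribution discrepancy.

    queries, keys: arrays of shape (num_heads, seq_len, head_dim).
    Returns one non-negative penalty per head, shape (num_heads,).
    Assumption: this is NOT the paper's measure, only an illustration.
    """
    # First moments: mean over tokens within each head.
    mean_gap = queries.mean(axis=1) - keys.mean(axis=1)
    # Second moments: per-dimension variance over tokens within each head.
    var_gap = queries.var(axis=1) - keys.var(axis=1)
    return (mean_gap ** 2).sum(axis=-1) + (var_gap ** 2).sum(axis=-1)

def regularized_loss(task_loss, queries, keys, weight=0.1):
    """Add the label-free penalty to an existing task loss.

    `weight` is a hypothetical trade-off hyperparameter.
    """
    return task_loss + weight * alignment_penalty(queries, keys).mean()

# Toy data: 4 heads, 16 tokens, 8 dimensions per head.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 16, 8))
k_matched = q[:, ::-1, :]   # same token set in reverse order: same distribution
k_shifted = q + 2.0         # every key dimension shifted by a constant

print(alignment_penalty(q, k_matched))  # close to zero for every head
print(alignment_penalty(q, k_shifted))  # large for every head
print(regularized_loss(1.0, q, k_shifted))
```

Because the penalty depends only on the keys and queries and not on labels, it can be added to the training loss of an existing attention model, which is the sense in which such a term acts as an unsupervised regularization.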
Author Information
Shujian Zhang (UT Austin)
Xinjie Fan (UT Austin)
Huangjie Zheng (University of Texas, Austin)
Korawat Tanwisuth (The University of Texas at Austin)
Mingyuan Zhou (University of Texas at Austin)
More from the Same Authors
- 2021 Poster: Exploiting Chain Rule and Bayes' Theorem to Compare Probability Distributions
  Huangjie Zheng · Mingyuan Zhou
- 2021 Poster: Probabilistic Margins for Instance Reweighting in Adversarial Training
  Qizhou Wang · Feng Liu · Bo Han · Tongliang Liu · Chen Gong · Gang Niu · Mingyuan Zhou · Masashi Sugiyama
- 2021 Poster: Convex Polytope Trees
  Mohammadreza Armandpour · Ali Sadeghian · Mingyuan Zhou
- 2021 Poster: TopicNet: Semantic Graph-Guided Topic Discovery
  Zhibin Duan · Yi.shi Xu · Bo Chen · Dongsheng Wang · Chaojie Wang · Mingyuan Zhou
- 2021 Poster: A Prototype-Oriented Framework for Unsupervised Domain Adaptation
  Korawat Tanwisuth · Xinjie Fan · Huangjie Zheng · Shujian Zhang · Hao Zhang · Bo Chen · Mingyuan Zhou
- 2021 Poster: CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator
  Alek Dimitriev · Mingyuan Zhou
- 2020 Poster: Bidirectional Convolutional Poisson Gamma Dynamical Systems
  Wenchao Chen · Chaojie Wang · Bo Chen · Yicheng Liu · Hao Zhang · Mingyuan Zhou
- 2020 Poster: Implicit Distributional Reinforcement Learning
  Yuguang Yue · Zhendong Wang · Mingyuan Zhou
- 2020 Poster: Deep Relational Topic Modeling via Graph Poisson Gamma Belief Network
  Chaojie Wang · Hao Zhang · Bo Chen · Dongsheng Wang · Zhengjue Wang · Mingyuan Zhou
- 2020 Poster: Bayesian Attention Modules
  Xinjie Fan · Shujian Zhang · Bo Chen · Mingyuan Zhou
- 2019 Poster: Variational Graph Recurrent Neural Networks
  Ehsan Hajiramezanali · Arman Hasanzadeh · Krishna Narayanan · Nick Duffield · Mingyuan Zhou · Xiaoning Qian
- 2019 Poster: Semi-Implicit Graph Variational Auto-Encoders
  Arman Hasanzadeh · Ehsan Hajiramezanali · Krishna Narayanan · Nick Duffield · Mingyuan Zhou · Xiaoning Qian
- 2019 Poster: Poisson-Randomized Gamma Dynamical Systems
  Aaron Schein · Scott Linderman · Mingyuan Zhou · David Blei · Hanna Wallach
- 2018 Poster: Nonparametric Bayesian Lomax delegate racing for survival analysis with competing risks
  Quan Zhang · Mingyuan Zhou
- 2018 Poster: Deep Poisson gamma dynamical systems
  Dandan Guo · Bo Chen · Hao Zhang · Mingyuan Zhou
- 2018 Poster: Dirichlet belief networks for topic structure learning
  He Zhao · Lan Du · Wray Buntine · Mingyuan Zhou
- 2018 Poster: Parsimonious Bayesian deep networks
  Mingyuan Zhou
- 2018 Poster: Masking: A New Perspective of Noisy Supervision
  Bo Han · Jiangchao Yao · Gang Niu · Mingyuan Zhou · Ivor Tsang · Ya Zhang · Masashi Sugiyama
- 2018 Poster: Bayesian multi-domain learning for cancer subtype discovery from next-generation sequencing count data
  Ehsan Hajiramezanali · Siamak Zamani Dadaneh · Alireza Karbalayghareh · Mingyuan Zhou · Xiaoning Qian
- 2016 Poster: Poisson-Gamma dynamical systems
  Aaron Schein · Hanna Wallach · Mingyuan Zhou
- 2016 Oral: Poisson-Gamma dynamical systems
  Aaron Schein · Hanna Wallach · Mingyuan Zhou
- 2015 Poster: The Poisson Gamma Belief Network
  Mingyuan Zhou · Yulai Cong · Bo Chen
- 2014 Poster: Beta-Negative Binomial Process and Exchangeable Random Partitions for Mixed-Membership Modeling
  Mingyuan Zhou
- 2012 Poster: Augment-and-Conquer Negative Binomial Processes
  Mingyuan Zhou · Lawrence Carin
- 2012 Spotlight: Augment-and-Conquer Negative Binomial Processes
  Mingyuan Zhou · Lawrence Carin
- 2009 Poster: Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations
  Mingyuan Zhou · Haojun Chen · John Paisley · Lu Ren · Guillermo Sapiro · Lawrence Carin
- 2009 Oral: Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations
  Mingyuan Zhou · Haojun Chen · John Paisley · Lu Ren · Guillermo Sapiro · Larry Carin