

Oral in Workshop: eXplainable AI approaches for debugging and diagnosis

[O6] Explaining Information Flow Inside Vision Transformers Using Markov Chain

Tingyi Yuan · Xuhong Li · Haoyi Xiong · Dejing Dou


Abstract:

Transformer-based models are gaining increasing popularity in computer vision, yet their interpretability remains limited. The simplest explainability method, visualizing attention weights, performs poorly because the raw weights lack a direct association between the input and the model's decisions. In this study, we propose a method to generate a saliency map for a specific target category. The proposed approach draws on the idea of a Markov chain to investigate the information flow across the layers of the Transformer, and combines it with integrated gradients to compute the relevance of input tokens to the model's decisions. We compare our method against other explainability methods using Vision Transformer as a benchmark and demonstrate that it achieves better performance in various respects. Our code is available in the anonymized repository: https://anonymous.4open.science/r/TransitionAttentionMaps-8C62.
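The abstract combines two ingredients: treating per-layer attention as transition matrices of a Markov chain to trace information flow, and integrated gradients to tie the result to a target class. Below is a minimal NumPy sketch of that general recipe, not the authors' exact formulation; the head-averaging, the 0.5/0.5 residual mixing, and all function names are illustrative assumptions.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=32):
    """Standard integrated-gradients sketch: average the gradient of the
    target-class score along the straight path from baseline to x, then
    scale by (x - baseline). `grad_fn` is assumed to return the gradient
    of the target logit with respect to its input."""
    alphas = np.linspace(0.0, 1.0, steps)
    avg_grad = np.mean(
        [grad_fn(baseline + a * (x - baseline)) for a in alphas], axis=0)
    return (x - baseline) * avg_grad

def transition_attention_map(attn_per_layer, token_grads):
    """Hypothetical sketch of the Markov-chain view of attention.

    attn_per_layer: list of (heads, tokens, tokens) attention weights,
                    one entry per Transformer layer.
    token_grads:    (tokens,) per-token relevance for the target class,
                    e.g. obtained from integrated_gradients above.
    """
    n = attn_per_layer[0].shape[-1]
    rollout = np.eye(n)
    for attn in attn_per_layer:
        a = attn.mean(axis=0)                  # average over heads (assumption)
        a = 0.5 * a + 0.5 * np.eye(n)          # account for the residual connection
        a = a / a.sum(axis=-1, keepdims=True)  # rows sum to 1: a transition matrix
        rollout = a @ rollout                  # chain the per-layer transitions
    cls_flow = rollout[0]                      # flow into the [CLS] token
    return cls_flow * token_grads              # class-specific saliency per token

# Toy usage with random attention and gradients (2 layers, 4 heads, 5 tokens).
rng = np.random.default_rng(0)
attns = [rng.random((4, 5, 5)) for _ in range(2)]
grads = rng.random(5)
print(transition_attention_map(attns, grads))
```

Chaining the row-stochastic matrices is what makes the layer-to-layer flow a Markov chain; the gradient weighting is what makes the resulting map specific to the chosen target category rather than class-agnostic.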