
Workshop: Table Representation Learning

InterpreTabNet: Enhancing Interpretability of Tabular Data Using Deep Generative Models and Large Language Models

Jacob Yoke Hong Si · Rahul Krishnan · Michael Cooper · Wendy Yusi Cheng

Keywords: [ KL divergence ] [ GPT-4 ] [ Gumbel-Softmax distribution ] [ Large language models ] [ Conditional Variational Autoencoder ] [ interpretability ] [ Deep generative models ] [ tabular data ]


Tabular data are omnipresent across many industry sectors. Neural networks for tabular data, such as TabNet, have been proposed to make predictions while leveraging the attention mechanism for interpretability. We find that the inferred attention masks on high-dimensional data are often dense, hindering interpretability. To remedy this, we propose InterpreTabNet, a variant of the TabNet model that casts the attention mechanism as a latent variable sampled from a Gumbel-Softmax distribution. This enables us to regularize the model to learn distinct concepts in the attention masks via a KL divergence regularizer, which prevents overlapping feature selection, maximizing the model's efficacy and improving interpretability. To automate the interpretation of the features from our model, we employ GPT-4 and use prompt engineering to map the learned feature mask onto natural language text describing the learned signal. Through comprehensive experiments on real-world datasets, we demonstrate that InterpreTabNet outperforms previous methods for interpreting tabular data while attaining competitive accuracy.
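The two mechanisms the abstract names, sampling a relaxed attention mask from a Gumbel-Softmax distribution and penalizing overlap between masks with a KL divergence term, can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the authors' implementation: the function names, the temperature value, and the pairwise-KL form of the penalty are chosen here for clarity.

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Sample a relaxed (differentiable) attention mask.

    Draws Gumbel noise and returns softmax((logits + noise) / tau);
    lower tau pushes the mask toward a sparse, near-one-hot selection.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    # Gumbel(0, 1) noise via inverse transform sampling.
    g = -np.log(-np.log(rng.uniform(size=logits.shape) + 1e-10) + 1e-10)
    y = (logits + g) / tau
    y = y - y.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-10):
    """KL(p || q) between two categorical feature-mask distributions."""
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

def overlap_penalty(masks):
    """Regularizer that rewards distinct masks (hypothetical form).

    Returns the negative sum of pairwise KL divergences, so minimizing
    this term pushes masks apart, discouraging overlapping feature
    selection across decision steps.
    """
    total = 0.0
    for i in range(len(masks)):
        for j in range(len(masks)):
            if i != j:
                total -= kl_divergence(masks[i], masks[j])
    return total
```

Note the key property: for identical masks the penalty is zero, while for masks that select different features it becomes negative, so adding it to the training loss favors non-overlapping feature selection across attention steps.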
