

Bivariate Causal Discovery for Categorical Data via Classification with Optimal Label Permutation

Yang Ni

Hall J (level 1) #114

Keywords: [ Qualitative Data ] [ causal discovery ] [ discrete data ] [ Categorical Data ] [ bayesian network ]


Causal discovery for quantitative data has been extensively studied, but less is known for categorical data. We propose a novel causal model for categorical data based on a new classification model, termed classification with optimal label permutation (COLP). By design, COLP is a parsimonious classifier, which gives rise to a provably identifiable causal model. A simple learning algorithm that compares the likelihood functions of the causal and anti-causal models suffices to learn the causal direction. Through experiments with synthetic and real data, we demonstrate the favorable performance of the proposed COLP-based causal model compared to state-of-the-art methods. We also make available an accompanying R package, COLP, which contains the proposed causal discovery algorithm and a benchmark dataset of categorical cause-effect pairs.
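
The likelihood-comparison step can be illustrated with a minimal Python sketch. It is not the COLP model and does not use the accompanying R package. As an assumed stand-in for a parsimonious classifier, it uses a hypothetical "mode plus uniform noise" conditional model: the effect equals the conditional mode given the cause with probability 1 - eps, and is otherwise uniform over the remaining categories. No identifiability claim is made for this stand-in. The function names (`marginal_loglik`, `mode_noise_loglik`, `direction_by_likelihood`) are illustrative only.

```python
import numpy as np


def marginal_loglik(counts):
    """Fitted log-likelihood of a saturated multinomial for the cause."""
    counts = counts[counts > 0]
    n = counts.sum()
    return float(np.sum(counts * np.log(counts / n)))


def mode_noise_loglik(table):
    """Fitted log-likelihood of the stand-in conditional model (NOT COLP).

    table[i, j] counts observations with cause category i and effect
    category j. The effect is modelled as the conditional mode with
    probability 1 - eps, else uniform over the other categories.
    """
    n = table.sum()
    k_effect = table.shape[1]
    m = table.max(axis=1).sum()  # observations that sit on the mode
    ll = 0.0
    if m > 0:
        ll += m * np.log(m / n)  # plug-in 1 - eps = m / n
    if n - m > 0:
        ll += (n - m) * np.log((n - m) / (n * (k_effect - 1)))
    return float(ll)


def direction_by_likelihood(x, y):
    """Compare fitted log-likelihoods of X -> Y and Y -> X."""
    _, xi = np.unique(np.asarray(x), return_inverse=True)
    _, yi = np.unique(np.asarray(y), return_inverse=True)
    table = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(table, (xi, yi), 1)  # contingency table of the pair

    ll_xy = marginal_loglik(table.sum(axis=1)) + mode_noise_loglik(table)
    ll_yx = marginal_loglik(table.sum(axis=0)) + mode_noise_loglik(table.T)

    if ll_xy > ll_yx:
        return "X -> Y", ll_xy, ll_yx
    if ll_yx > ll_xy:
        return "Y -> X", ll_xy, ll_yx
    return "undecided", ll_xy, ll_yx


# Toy data: Y mostly follows a many-to-one function of X.
cells = [45, 5, 45, 5, 5, 45, 5, 45]
x = np.repeat([0, 0, 1, 1, 2, 2, 3, 3], cells)
y = np.repeat([0, 1, 0, 1, 0, 1, 0, 1], cells)
direction, ll_xy, ll_yx = direction_by_likelihood(x, y)
print(direction, round(ll_xy, 2), round(ll_yx, 2))
```

In this toy example the many-to-one map from X to Y is cheap to describe in the forward direction but poorly captured in reverse, so the forward model attains the higher fitted likelihood. With an unrestricted (saturated) conditional model the two directions would fit equally well, which is why a parsimonious classifier is needed for the comparison to be informative.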
