
Workshop: Causal Representation Learning

Mixup-Based Knowledge Distillation with Causal Intervention for Multi-Task Speech Classification

Kwangje Baeg · Hyeopwoo Lee · Yeomin Yoon · Jongmo Kim

Keywords: [ age group hierarchy ] [ mixup ] [ causal intervention ] [ hierarchical multi-task learning ] [ knowledge distillation ]


Speech classification that determines the gender and age group of a speaker is an essential yet challenging subtask of multi-task classification. Existing methods struggle to extract features that reliably indicate certain age groups, because the perception of age from speech is ambiguous. Furthermore, these methods do not fully capture the causal relationships between speech representations and the multi-label space. This study attributes ambiguous age group boundaries to the considerable variability of speech, even within the same age group. In addition, features that indicate speakers in their 20s can be shared by some speakers in their 30s. This study therefore proposes a two-step approach: (1) mixup-based knowledge distillation, which removes biased knowledge through causal intervention, and (2) hierarchical multi-task learning with causal inference over the age group hierarchy, which exploits the shared information in label dependencies. Experiments on Korean open-set speech corpora demonstrate that the proposed methods yield a significant performance boost in multi-task speech classification.
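
As a rough illustration of the two generic ingredients named in step (1), the sketch below shows standard mixup and a standard temperature-softened distillation loss. It is not the authors' implementation: the causal intervention and the hierarchical multi-task step are not modeled, and the function names, the three-class age labels, the Beta parameter 0.4, and the temperature 2.0 are all assumptions chosen for illustration.

```python
import math
import random


def mixup(x_a, x_b, y_a, y_b, lam):
    """Convexly combine two feature vectors and their label distributions."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Hypothetical example: two utterances labeled "20s" and "30s" out of
# three assumed age classes (20s, 30s, 40s), with toy 2-dimensional features.
rng = random.Random(0)
lam = rng.betavariate(0.4, 0.4)  # mixing weight drawn from Beta(alpha, alpha)
x_mix, y_mix = mixup([1.0, 0.0], [0.0, 1.0],
                     [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], lam)

# Toy logits standing in for teacher and student outputs on the mixed input.
loss = distillation_loss([0.5, 0.4, -1.0], [1.2, 0.9, -2.0])
```

The mixed label `y_mix` places probability mass on both the 20s and 30s classes, which is one simple way to represent an utterance whose age cues are shared across adjacent groups.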
