NeurIPS Hierarchically Clustered PCA and CCA via a Convex Clustering Penalty

Poster in Affinity Workshop: Women in Machine Learning

Hierarchically Clustered PCA and CCA via a Convex Clustering Penalty

Amanda Buch · Conor Liston · Logan Grosenick

Abstract: We introduce an unsupervised learning approach that combines the truncated singular value decomposition with convex clustering to estimate within-cluster directions of maximum variance/covariance (in the variables) while simultaneously hierarchically clustering (on observations). In contrast to previous work on joint clustering and embedding, our approach has a straightforward formulation, is readily scalable via distributed optimization, and admits a direct interpretation as hierarchically clustered principal component analysis (PCA) or hierarchically clustered canonical correlation analysis (CCA). Through numerical experiments and real-world examples relevant to precision medicine, we show that our approach outperforms traditional and contemporary clustering methods on underdetermined problems ($p \gg N$ with tens of observations) and scales to large datasets (e.g., $N = 100{,}000$; $p = 1{,}000$) while yielding interpretable dendrograms of hierarchical per-cluster principal components or canonical variates.
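
For orientation, the two standard building blocks named in the abstract can be written in their textbook forms. This is only a sketch: the abstract does not state the authors' combined objective, and the symbols $X$, $Y$, $U$, $u_i$, $v$, $a$, $b$, $\gamma$, and $w_{ij}$ are notation assumed here rather than taken from the source.

Convex clustering of the rows $x_i$ of a data matrix $X \in \mathbb{R}^{N \times p}$ assigns each observation a centroid $u_i$ (the rows of $U$) and penalizes pairwise centroid differences, with assumed nonnegative weights $w_{ij}$ and penalty strength $\gamma \ge 0$:

$$\min_{U \in \mathbb{R}^{N \times p}} \; \frac{1}{2} \lVert X - U \rVert_F^2 + \gamma \sum_{i < j} w_{ij} \lVert u_i - u_j \rVert_2$$

As $\gamma$ grows, centroids fuse, and the sequence of fusions over $\gamma$ can be read as a dendrogram on the observations.

The leading principal direction of a column-centered $X$, and the leading pair of canonical directions for two column-centered views $X$ and $Y$ of the same $N$ observations, solve

$$\max_{\lVert v \rVert_2 = 1} \lVert X v \rVert_2^2 \qquad \text{and} \qquad \max_{a,\, b} \; a^\top X^\top Y b \quad \text{subject to} \quad \lVert X a \rVert_2 = \lVert Y b \rVert_2 = 1.$$

The first problem is solved by the leading right singular vector of $X$, which is where the truncated singular value decomposition enters.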
