Paper Oral in Workshop: Machine Learning for Creativity and Design

Controllable and Interpretable Singing Voice Decomposition via Assem-VC

Kangwook Kim · Junhyeok Lee

Mon 13 Dec 12:50 p.m. PST — 1 p.m. PST

Abstract:

We propose a singing voice decomposition system that uses Assem-VC to encode time-aligned linguistic content, pitch, and source speaker identity. Given the decomposed speaker-independent information and the target speaker's embedding, we can synthesize the singing voice of the target speaker. As a result, we produce a perfectly synchronized duet between the user's singing voice and the target singer's converted singing voice.
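
The data flow described in the abstract can be illustrated with a toy sketch. This is not the Assem-VC implementation or its API: the `Decomposition` class, the `decompose` and `convert` functions, the frame values, and the two-dimensional speaker embeddings are all hypothetical names and numbers chosen for illustration. The sketch only shows why swapping the speaker embedding while keeping the time-aligned content and pitch would leave the converted voice frame-synchronous with the original.

```python
from dataclasses import dataclass, replace
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Decomposition:
    """Hypothetical container for the three decomposed streams."""
    content: Tuple[str, ...]     # one linguistic unit per frame (time-aligned)
    pitch_hz: Tuple[float, ...]  # one pitch value per frame (time-aligned)
    speaker: Tuple[float, ...]   # single utterance-level speaker embedding


def decompose(frames: Sequence[Tuple[str, float]],
              speaker_embedding: Sequence[float]) -> Decomposition:
    """Toy stand-in for an encoder: split per-frame (unit, pitch) pairs."""
    content = tuple(unit for unit, _ in frames)
    pitch = tuple(f0 for _, f0 in frames)
    return Decomposition(content, pitch, tuple(speaker_embedding))


def convert(source: Decomposition,
            target_speaker: Sequence[float]) -> Decomposition:
    """Keep the speaker-independent streams; swap only the speaker embedding."""
    return replace(source, speaker=tuple(target_speaker))


# Made-up input: four frames of (linguistic unit, pitch in Hz).
user_frames = [("la", 220.0), ("la", 220.0), ("di", 246.9), ("da", 261.6)]
user = decompose(user_frames, speaker_embedding=[0.1, 0.9])
target = convert(user, target_speaker=[0.8, 0.2])

# Both voices share the same frame-level content and pitch, so a duet
# rendered from them would line up frame by frame.
duet = list(zip(user.content, target.content))
```

In this sketch the only field that differs between `user` and `target` is `speaker`, which mirrors the abstract's split between speaker-independent information and speaker identity.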