

Poster

NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing

Ting-Hsuan Chen · Shih-Han Yen · Jie Wen Chan · Hau-Shiang Shiu · Changhan Yeh · Yu-Lun Liu

East Exhibit Hall A-C #1802
[ Project Page ]
Wed 11 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

Video editing approaches, especially those built on image-based diffusion models, have seen significant advancements but still struggle with temporal consistency in video-to-video tasks. Existing models often fail to maintain temporal consistency across a sequence, disrupting frame transitions. To tackle this issue, this paper introduces NaRCan, a video editing framework that integrates a hybrid deformation field network with diffusion priors. Unlike conventional methods that generate a canonical image as a singular representation of video content, our approach ensures the production of high-quality, natural canonical images, which are crucial for downstream tasks like handwriting, style transfer, and dynamic segmentation. By leveraging a hybrid deformation field module and a carefully designed scheduling method, NaRCan offers improved adaptability and superior editing capabilities across various video editing applications. Extensive experimental results show that our method outperforms existing approaches in producing coherent and high-quality video sequences. This work advances the state of video editing and provides a robust solution for maintaining temporal consistency with diffusion-based methods.
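To make the canonical-image idea concrete, the sketch below shows one minimal way a learnable canonical image and a per-frame deformation field could jointly reconstruct video frames: each frame is obtained by warping the shared canonical image with coordinate offsets predicted from pixel position and time. This is an illustrative PyTorch example under assumed module names (DeformationMLP, CanonicalVideoModel); it is not the authors' implementation and omits the hybrid (homography plus residual) design, the diffusion prior, and the training schedule described in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DeformationMLP(nn.Module):
    """Illustrative deformation field: maps (x, y, t) to a 2D offset
    into the canonical image's coordinate frame. Hypothetical module,
    not the paper's hybrid deformation network."""

    def __init__(self, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),  # predicted (dx, dy) offset
        )

    def forward(self, coords_xyt: torch.Tensor) -> torch.Tensor:
        return self.net(coords_xyt)


class CanonicalVideoModel(nn.Module):
    """A learnable canonical image plus a deformation field. A frame at
    time t is reconstructed by sampling the canonical image at the
    deformed coordinates."""

    def __init__(self, height: int, width: int):
        super().__init__()
        self.canonical = nn.Parameter(torch.rand(1, 3, height, width))
        self.deform = DeformationMLP()

    def forward(self, t: float, height: int, width: int) -> torch.Tensor:
        # Normalized pixel grid in [-1, 1], as expected by grid_sample.
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, height),
            torch.linspace(-1, 1, width),
            indexing="ij",
        )
        coords = torch.stack([xs, ys], dim=-1)               # (H, W, 2)
        ts = torch.full_like(xs, t).unsqueeze(-1)             # (H, W, 1)
        offsets = self.deform(torch.cat([coords, ts], dim=-1))  # (H, W, 2)
        grid = (coords + offsets).unsqueeze(0)                # (1, H, W, 2)
        # Warp the canonical image to reconstruct the frame at time t.
        return F.grid_sample(self.canonical, grid, align_corners=True)
```

In such a setup, training would minimize a reconstruction loss between the warped canonical image and each observed frame; edits applied once to the canonical image then propagate to every frame through the learned deformations, and a diffusion prior (as in NaRCan) would additionally regularize the canonical image toward a natural-looking result.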
