Poster

FactorizePhys: Effective Spatial-Temporal Attention in Remote Photo-plethysmography through Factorization of Voxel Embeddings

Jitesh Joshi · Sos Agaian · Youngjun Cho

East Exhibit Hall A-C #2010
Fri 13 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract:

Remote photo-plethysmography (rPPG) enables the non-invasive recovery of blood volume pulse signals through imaging, transforming spatial-temporal data into time-series signals. Recent advancements in end-to-end rPPG approaches have focused on this transformation. Attention mechanisms play a critical role in feature extraction, but current mechanisms often compute attention disjointly across spatial, temporal, and channel dimensions. We propose the Factorized Self-Attention Module (FSAM), which jointly computes multi-dimensional attention from voxel embeddings using non-negative matrix factorization. To demonstrate FSAM's efficacy, we developed FactorizePhys, an end-to-end 3D-CNN architecture for estimating blood volume pulse signals from raw video frames. Our approach effectively factorizes voxel embeddings to achieve comprehensive spatial-temporal attention, enhancing generic signal extraction tasks. Additionally, we deployed FSAM within an existing 2D-CNN-based rPPG architecture to showcase its versatility. FSAM and FactorizePhys were rigorously evaluated against state-of-the-art rPPG methods, each representing different architectural types and attention mechanisms. We conducted an ablation study to examine FSAM's capability in factorizing voxel embeddings. Experiments on four publicly available datasets demonstrated FSAM's effectiveness and superior inter-dataset generalization. The results indicate significant potential for adopting the presented method in imaging-based physiological sensing. The code for our methods is available at ANONYMISED.
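
The idea of computing attention jointly from voxel embeddings via non-negative matrix factorization can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`nmf_multiplicative`, `factorized_attention`), the matrix layout (time along rows; channel and spatial positions flattened along columns), the min-shift used to make the embeddings non-negative, the rank, and the final element-wise gating are all assumptions chosen for illustration. The factorization itself uses the standard multiplicative-update rule for NMF.

```python
import numpy as np


def nmf_multiplicative(V, rank=1, n_iter=50, eps=1e-8, seed=0):
    """Approximate a non-negative matrix V (n x m) as W @ H using
    multiplicative updates. W is (n x rank), H is (rank x m)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def factorized_attention(emb, rank=1, n_iter=50):
    """Hypothetical FSAM-like step on voxel embeddings of shape (C, T, H, W).

    Assumed layout: one row per time step, one column per
    (channel, height, width) position, so a single factorization spans
    the temporal, spatial, and channel dimensions together.
    """
    C, T, Hh, Ww = emb.shape
    # Assumption: shift by the minimum so the matrix is non-negative.
    shifted = emb - emb.min()
    V = shifted.transpose(1, 0, 2, 3).reshape(T, C * Hh * Ww)
    W, H = nmf_multiplicative(V, rank=rank, n_iter=n_iter)
    # The low-rank reconstruction serves as the attention map here.
    attn = (W @ H).reshape(T, C, Hh, Ww).transpose(1, 0, 2, 3)
    # Assumption: apply attention as element-wise gating of the input.
    return emb * attn


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.standard_normal((4, 16, 5, 5))  # (C, T, H, W), toy sizes
    out = factorized_attention(emb)
    print(out.shape)
```

A low rank forces the reconstruction to keep only structure shared across time, space, and channels, which is why the factorized approximation can act as a joint attention map rather than three separately computed ones.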
