QuadForecaster: Diffusion-Based Quadruped Pose Prediction for Animal Communication Analysis
Abstract
Animal communication relies on subtle temporal patterns in movement, yet current pose estimation systems cannot anticipate them. Existing frameworks excel at detecting the present configuration of an animal but do not predict future poses, which forces interaction systems to remain reactive rather than proactive. We introduce QuadForecaster, the first diffusion-based model specifically designed for quadrupedal pose prediction, enabling automated systems to decode animal communication through movement forecasting. Our temporally cascaded diffusion architecture treats pose prediction as structured denoising: it iteratively refines uncertain future poses and provides the uncertainty quantification needed for safe deployment. Evaluated on cheetah and cow datasets, QuadForecaster achieves a mean per-joint position error (MPJPE) of 0.116 m on cheetah behaviors and 0.86 m on complex cow social interactions, capturing rapid behavioral transitions and multi-modal dynamics. QuadForecaster paves the way for robust analysis of animal motion and communication, enabling proactive cross-species interaction in conservation, agriculture, and research applications.
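For readers unfamiliar with the metric, MPJPE is conventionally the Euclidean distance between predicted and ground-truth joint positions, averaged over all joints and frames. The sketch below assumes that standard definition and 3D joint coordinates in metres; the abstract does not state the exact evaluation protocol (for example, any root alignment or forecast horizon), and the function name `mpjpe` and the nested-list input layout are illustrative choices, not the authors' code.

```python
import math


def mpjpe(predicted, target):
    """Mean per-joint position error, assuming the standard definition.

    predicted, target: sequences of frames, where each frame is a sequence
    of joints and each joint is an (x, y, z) tuple in metres.
    Returns the average Euclidean distance over every joint in every frame.
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same number of frames")

    total_error = 0.0
    joint_count = 0
    for pred_frame, true_frame in zip(predicted, target):
        if len(pred_frame) != len(true_frame):
            raise ValueError("each frame pair must have the same number of joints")
        for pred_joint, true_joint in zip(pred_frame, true_frame):
            # Euclidean distance between the predicted and true joint position.
            total_error += math.dist(pred_joint, true_joint)
            joint_count += 1

    if joint_count == 0:
        raise ValueError("no joints to evaluate")
    return total_error / joint_count


# Hypothetical toy input: one frame with two joints, off by 0.1 m and 0.2 m.
example_prediction = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
example_target = [[(0.0, 0.0, 0.1), (1.0, 0.2, 0.0)]]
example_error = mpjpe(example_prediction, example_target)
```

In the toy input the two joints are displaced by 0.1 m and 0.2 m, so the resulting error is their mean, approximately 0.15 m.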