Characterizing young children’s everyday activities using video question-answering models
Tarun Sepuri · Khai Loong Aw · Alvin Tan · Robert Sparks · Virginia Marchman · Michael C Frank · Bria Long
Abstract
Children are remarkably efficient learners compared to our most advanced computational models of learning. One key difference is that children seem to leverage regularities in the activities (e.g., $\textit{eating}$) in which they participate to learn about words or objects (e.g., "pomegranate"), even under skewed, long-tailed distributions. While everyday activities have long been theorized to be important supports for children's learning, an understanding of the types, frequencies, and rhythms of these activities has been out of reach due to both a lack of naturalistic video datasets and the need for manual annotation. Here, we use the recent release of a large, egocentric dataset of children's everyday experience (BabyView) ($N$=31 children, $N$=868 hours) and capitalize on innovations in video question-answering (VideoQA) models to quantify the $\textit{what}$ and $\textit{where}$ of children's everyday experiences. Using these models, we classify both the activities (e.g., $\textit{eating, dancing, exploring}$) and physical locations (e.g., $\textit{living room, garage}$) in the infant view and generate natural-language descriptions for contiguous 10-second videos across the entire dataset. We provide convergent validity for our classifications by recovering expected trends (e.g., high frequency of $\textit{play}$ in the $\textit{living room}$ in this dataset). Further, our analyses highlight the variability in children's everyday activities across locations and across time. Compared with prior work analyzing static image content, our work highlights the advances possible by using VideoQA models to analyze the dynamic nature of children's experiences. A better understanding of how children learn in everyday contexts should inform developmentally inspired models of early learning and cognitive development.