Unified Pretraining on Mixed Optophysiology and Electrophysiology Data Across Brain Regions
Abstract
Building models that unify diverse neural recordings is a crucial step toward scalable foundation models for neuroscience. However, most large-scale models remain tied to a single recording modality, which limits our ability to integrate information across spatiotemporal scales. We introduce a POYO-based universal encoder that learns a shared latent representation of electrophysiology (irregularly timed spikes) and optophysiology (regularly sampled calcium fluorescence time series) without requiring simultaneous recordings. On large Allen Institute datasets spanning both calcium imaging and Neuropixels recordings, we show that joint pretraining outperforms unimodal baselines and strengthens cross-region transfer. These results indicate that our mixed-modality pretraining framework can integrate independently collected recordings into a common representational space, advancing the path toward foundation models for diverse multimodal neural data.