Poster
M3LEO: A Multi-Modal, Multi-Label Earth Observation Dataset Integrating Interferometric SAR and RGB Data
Matthew Allen · Francisco Dorr · Joseph Alejandro Gallego Mejia · Laura Martínez-Ferrer · Anna Jungbluth · Freddie Kalaitzis · Raul Ramos-Pollán
East Exhibit Hall A-C #1010
Satellite-based remote sensing has revolutionised the way we address global challenges in a rapidly evolving world. Huge quantities of Earth Observation (EO) data are generated by satellite sensors daily, but processing these large datasets for use in ML pipelines is technically and computationally challenging. Specifically, different types of EO data are often hosted on a variety of platforms, with differing degrees of availability for Python preprocessing tools. In addition, spatial alignment across data sources and data tiling for easier handling can present significant technical hurdles for novice users. While some preprocessed Earth observation datasets exist, their content is often limited to optical or near-optical wavelength data, which is ineffective at night or in adverse weather conditions. Synthetic Aperture Radar (SAR), an active sensing technique based on microwave-wavelength radiation, offers a viable alternative. However, the application of machine learning to SAR has been limited due to a lack of ML-ready data and pipelines, particularly for the full diversity of SAR data, including polarimetry, coherence and interferometry. In this work, we introduce M3LEO, a multi-modal, multi-label Earth observation dataset that includes polarimetric, interferometric, and coherence SAR data derived from Sentinel-1, alongside Sentinel-2 RGB imagery and a suite of labelled tasks for model evaluation. M3LEO spans 17.5 TB and contains approximately 10M data chips, each measuring 4 × 4 km, across six diverse geographic regions. The dataset is complemented by a flexible PyTorch Lightning framework, with configuration management using Hydra, to accommodate its use across diverse ML applications in Earth observation. Additionally, we provide tools to process any dataset available on popular platforms such as Google Earth Engine for seamless integration with our framework. Initial experiments validate the utility of our data and framework, and show that SAR imagery contains information additional to that extractable from RGB data alone. Data is available at huggingface.co/M3LEO, and code at github.com/spaceml-org/M3LEO.