
Workshop: Machine Learning and the Physical Sciences

Physical Data Models in Machine Learning Imaging Pipelines

Marco Aversa · Luis Oala · Christoph Clausen · Roderick Murray-Smith · Bruno Sanguinetti


Light propagates from the object through the optics to the sensor, where it forms an image. Once the raw data is collected, it is processed through a complex image signal processing (ISP) pipeline to produce an image compatible with human perception. However, this processing is rarely considered in machine learning modelling because available benchmark data sets are generally not in raw format. This study shows how to embed the forward acquisition process into the machine learning model. We consider the optical system and the ISP separately. Following the acquisition process, we start from a drone and airship image dataset to emulate realistic satellite raw images with on-demand parameters. The end-to-end process is built to resemble the optics and sensor of the satellite setup. The configurable parameters are the satellite mirror size, focal length, pixel size and pattern, exposure time, and atmospheric haze. After raw data collection, the ISP plays a crucial role in neural network robustness. We jointly optimize a parameterized differentiable image processing pipeline with a neural network model. This can speed up and stabilize classifier training, with a margin of up to 20% in validation accuracy.
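To make the two stages described in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a uniform-airlight haze model, Poisson shot noise, an RGGB colour filter pattern, and a half-resolution demosaic. The function names (`emulate_raw`, `simple_isp`) and parameters (`full_well`, `black_level`, `wb_gains`, `gamma`) are hypothetical and chosen for illustration only. The optical part (point spread function from mirror size and focal length, resampling to the sensor pixel size) is omitted, and no joint training is performed. In an automatic differentiation framework, the ISP parameters would be trainable alongside the classifier weights.

```python
import numpy as np


def emulate_raw(rgb, exposure=1.0, haze=0.1, full_well=1000.0, rng=None):
    """Turn a clean RGB image (H, W, 3) in [0, 1] into a noisy RGGB raw frame (H, W).

    H and W are assumed to be even. All modelling choices here are
    illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    # Atmospheric haze, modelled as a blend towards a uniform airlight of 1.0.
    hazy = (1.0 - haze) * rgb + haze * 1.0
    # Exposure scales the expected photo-electron count; the sensor saturates at full_well.
    electrons = np.clip(hazy * exposure, 0.0, 1.0) * full_well
    # Photon shot noise is Poisson distributed around the expected count.
    noisy = rng.poisson(electrons).astype(np.float64) / full_well
    # RGGB colour filter array: each pixel records a single colour channel.
    h, w, _ = rgb.shape
    raw = np.empty((h, w))
    raw[0::2, 0::2] = noisy[0::2, 0::2, 0]  # red
    raw[0::2, 1::2] = noisy[0::2, 1::2, 1]  # green
    raw[1::2, 0::2] = noisy[1::2, 0::2, 1]  # green
    raw[1::2, 1::2] = noisy[1::2, 1::2, 2]  # blue
    return np.clip(raw, 0.0, 1.0)


def simple_isp(raw, black_level=0.0, wb_gains=(2.0, 1.0, 1.5), gamma=2.2):
    """Parameterised ISP: black level, half-resolution demosaic, white balance, gamma."""
    # Subtract the black level and rescale so the maximum stays at 1.0.
    x = np.clip(raw - black_level, 0.0, None) / (1.0 - black_level)
    # Half-resolution demosaic: each 2x2 RGGB cell becomes one RGB pixel.
    r = x[0::2, 0::2]
    g = 0.5 * (x[0::2, 1::2] + x[1::2, 0::2])
    b = x[1::2, 1::2]
    # Per-channel white balance gains, then gamma compression for display.
    rgb = np.stack([r, g, b], axis=-1) * np.asarray(wb_gains)
    return np.clip(rgb, 0.0, 1.0) ** (1.0 / gamma)


# Usage: a flat grey scene passed through acquisition and ISP.
rgb = np.full((4, 6, 3), 0.5)
raw = emulate_raw(rgb, exposure=0.8, haze=0.2)
out = simple_isp(raw, wb_gains=(1.0, 1.0, 1.0))
print(raw.shape, out.shape)  # (4, 6) (2, 3, 3)
```

Every ISP step in this sketch is a smooth or piecewise-smooth function of its parameters, which is the property that allows such a pipeline to be optimized by gradient descent together with a downstream network.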
