NeurIPS Patch Gradient Descent: Training Neural Networks on Very Large Images

Poster in Workshop: Workshop on Advancing Neural Network Training (WANT): Computational Efficiency, Scalability, and Resource Optimization

Patch Gradient Descent: Training Neural Networks on Very Large Images

Deepak Gupta · Gowreesh Mago · Arnav Chavan · Dilip K. Prasad · Rajat Thomas

Abstract:

Current deep learning models struggle with very large images, largely because of prohibitive compute and memory demands. Patch Gradient Descent (PatchGD) is a learning technique for training deep learning models on such images. It builds on the standard feedforward-backpropagation paradigm. However, instead of processing an entire image at once, PatchGD splits the image into portions and uses them to update a core information-gathering element before the final evaluation. This ensures wide coverage of the image across iterations and brings notable memory and compute savings. On the high-resolution PANDA and UltraMNIST datasets with ResNet50 and MobileNetV2 models, PatchGD clearly outperforms traditional gradient descent, particularly under memory constraints. The future of handling vast image datasets effectively lies with PatchGD.
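
The patch-wise update described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and the abstract does not give the algorithm's details. The sketch assumes the "core information-gathering element" is a small latent grid `z` with one entry per patch, and it replaces the neural network with a single hypothetical encoder weight `w`. The function names `split_into_patches`, `patch_mean`, and `train_patchwise`, and all parameters, are illustrative assumptions.

```python
import random


def split_into_patches(image, patch):
    """Split a 2D list into non-overlapping patch x patch tiles keyed by grid position."""
    rows, cols = len(image), len(image[0])
    tiles = {}
    for r in range(0, rows, patch):
        for c in range(0, cols, patch):
            tiles[(r // patch, c // patch)] = [row[c:c + patch] for row in image[r:r + patch]]
    return tiles


def patch_mean(tile):
    """Mean pixel value of one tile (toy stand-in for a feature extractor)."""
    values = [v for row in tile for v in row]
    return sum(values) / len(values)


def train_patchwise(image, target, patch=2, k=2, steps=500, lr=0.5, seed=0):
    """Toy patch-wise gradient descent (assumed setup, not the paper's algorithm).

    Each step touches only k sampled patches: their latent entries are
    refreshed with the current weight, the prediction is read from the
    whole latent grid, and the gradient flows through the sampled entries only.
    """
    rng = random.Random(seed)
    tiles = split_into_patches(image, patch)
    positions = list(tiles)
    n = len(positions)

    w = 0.0  # single encoder weight (hypothetical stand-in for a network)
    # Fill the latent grid once, one patch at a time.
    z = {pos: w * patch_mean(tiles[pos]) for pos in positions}

    for _ in range(steps):
        sampled = rng.sample(positions, k)
        means = {pos: patch_mean(tiles[pos]) for pos in sampled}
        # Refresh only the sampled latent entries; the rest stay as stored.
        for pos in sampled:
            z[pos] = w * means[pos]
        pred = sum(z.values()) / n
        # d/dw of (pred - target)^2, treating unsampled entries as constants.
        grad = 2.0 * (pred - target) * sum(means.values()) / n
        w -= lr * grad

    # Final evaluation: rebuild the latent grid with the final weight.
    z = {pos: w * patch_mean(tiles[pos]) for pos in positions}
    return w, sum(z.values()) / n


image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
weight, prediction = train_patchwise(image, target=1.0)
```

In this toy setup the four tile means are 1, 0, 0, and 1, so the weight should settle near 2.0 and the final prediction near the target of 1.0. The point of the sketch is that each step reads only `k` patches, while the latent grid retains information from patches seen in earlier steps.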
