Poster
in
Workshop: Workshop on Advancing Neural Network Training (WANT): Computational Efficiency, Scalability, and Resource Optimization

DAREL: Data Reduction with Losses for Training Acceleration of Real and Hypercomplex Neural Networks

Alexander Demidovskij ⋅ Aleksei Trutnev ⋅ Artyom Tugaryov ⋅ Igor Salnikov ⋅ Stanislav Pavlov

Project Page [ Slides] [ Poster] [ OpenReview]

Abstract

Neural network training requires a lot of resources, and there are situations where training time and memory usage are limited. In such instances, the undertaking of devising specialized algorithms for training neural networks within the constraints of resource limitations holds significance. Data Reduction with Losses is a novel training data reduction method that operates with training samples based on losses obtained from a currently trained model or a pre-trained one. Proposed method is applicable to training Deep Neural Networks for both Computer Vision and Natural Language Processing tasks. Applied to Large Language Models fine-tuning, Data Reduction with Losses can be combined with existing methods for Parameter-Efficient fine-tuning, such as LoRA. Computational experiments demonstrate superiority of the proposed approach to pre-training neural networks for Computer Vision tasks over existing methods and draws clear evidence of improving Large Language Models fine-tuning quality and time. Training acceleration for ResNet18 is up to 2.03x, while for fine-tuning DAREL allows to achieve 1.43x acceleration for GPT2-M fine-tuning with corresponding increase of BLEU by 1.81 p.p.

Video

Chat is not available.