Gradient Accumulation is an optimization technique used when training large Neural Networks on a GPU; it helps reduce memory requirements and resolve Out-of-Memory (OOM) errors during training. We have explained the concept along with PyTorch code.
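Before the detailed walkthrough, here is a minimal sketch of the idea in PyTorch, assuming a toy linear model, a dummy dataloader, and an accumulation window of 4 micro-batches (all placeholders, not the article's exact code): gradients from several small batches are accumulated before a single optimizer step, so the effective batch size grows without the memory cost of a larger batch.

import torch
import torch.nn as nn

# Placeholder model, optimizer, and loss (assumed for illustration).
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

accumulation_steps = 4  # effective batch size = micro-batch size * 4

# Dummy dataloader: 16 micro-batches of 8 samples each (assumed shapes).
dataloader = [(torch.randn(8, 10), torch.randint(0, 2, (8,))) for _ in range(16)]

optimizer.zero_grad()
for step, (inputs, targets) in enumerate(dataloader):
    outputs = model(inputs)
    # Scale the loss so the accumulated gradient matches a full-batch average.
    loss = criterion(outputs, targets) / accumulation_steps
    loss.backward()  # gradients accumulate in .grad across backward() calls

    if (step + 1) % accumulation_steps == 0:
        optimizer.step()       # update weights once per accumulation window
        optimizer.zero_grad()  # reset accumulated gradients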
Table of contents:
1. Background on training Neural Networks
2. What is the problem in this training process?
3. Gradient Accumulation
4. Gradient Accumulation in PyTorch
5. Properties of Gradient Accumulation
6. Concluding Note
The following table summarizes the concep...
Published on November 25, 2022 19:55