Question - How important is it to shuffle the training data when using batch gradient descent?
Answer -
Shuffling the training data makes little difference in batch gradient descent, because each update uses the gradient averaged over the entire training set. Since the average is a sum over all examples and summation is order-invariant, the computed gradient (and therefore every weight update) is identical whether or not the data are shuffled. Shuffling matters for stochastic and mini-batch gradient descent, where each update sees only a subset of the data, so the order of examples changes the sequence of updates and can introduce bias if the data are sorted in some systematic way.
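
A minimal sketch of this order-invariance, assuming a toy linear-regression model with a mean-squared-error loss (the data, weights, and `batch_gradient` helper here are illustrative, not from any particular library):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))   # 100 examples, 3 features (toy data)
y = rng.normal(size=100)
w = np.zeros(3)                 # current weights

def batch_gradient(X, y, w):
    """Gradient of the MSE loss averaged over the ENTIRE dataset."""
    residual = X @ w - y
    return 2 * X.T @ residual / len(y)   # average over all rows

perm = rng.permutation(len(y))           # shuffle the rows
g_original = batch_gradient(X, y, w)
g_shuffled = batch_gradient(X[perm], y[perm], w)

# Averaging over all rows is order-invariant, so the two gradients
# match (up to floating-point error) and every update step would too.
print(np.allclose(g_original, g_shuffled))   # True
```

Because the full-batch gradient is the same either way, the entire optimization trajectory is unchanged by shuffling; the same check would fail for a mini-batch gradient, where row order determines which examples land in each batch.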