Question - Explain the difference between Gradient Descent and Stochastic Gradient Descent.
Answer -
Gradient descent (GD) and stochastic gradient descent (SGD) are both popular optimization algorithms in machine learning and deep learning, used to iteratively update a set of parameters in order to minimize an error function. The key difference is how much data each update uses: gradient descent computes the gradient over the entire dataset for every iteration, whereas stochastic gradient descent computes it from a single training sample at a time. For example, if a dataset has 10,000 data points, GD must process all 10,000 points before making one parameter update, so each iteration is slow; SGD updates the parameters after every single sample, so each update is much cheaper and updates happen far more frequently. Because of these frequent updates, stochastic gradient descent usually converges faster than gradient descent on large datasets, although its updates are noisier since each one is based on only a single sample.
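
To make the contrast concrete, here is a minimal sketch in NumPy; the synthetic dataset, simple linear model, learning rates, and epoch counts are illustrative assumptions rather than part of the question. Batch GD computes each update over all 10,000 points, while SGD updates after each individual sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 3x + 2 plus noise (10,000 points, as in the example above)
X = rng.uniform(-1, 1, size=(10_000, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 0.1, size=10_000)
X_b = np.c_[X, np.ones(len(X))]          # add a bias column

def gradient(theta, X_batch, y_batch):
    """Gradient of mean squared error w.r.t. theta for the given batch."""
    error = X_batch @ theta - y_batch
    return 2.0 * X_batch.T @ error / len(y_batch)

# --- Batch gradient descent: every update uses ALL 10,000 points ---
theta_gd = np.zeros(2)
for epoch in range(100):
    theta_gd -= 0.1 * gradient(theta_gd, X_b, y)

# --- Stochastic gradient descent: every update uses ONE sample ---
theta_sgd = np.zeros(2)
for epoch in range(5):
    for i in rng.permutation(len(y)):
        xi, yi = X_b[i:i + 1], y[i:i + 1]
        theta_sgd -= 0.01 * gradient(theta_sgd, xi, yi)

print("GD  estimate (slope, intercept):", theta_gd)
print("SGD estimate (slope, intercept):", theta_sgd)
```

Both versions recover parameters close to (3, 2); the difference is that GD makes 100 expensive full-dataset updates, while SGD makes 50,000 cheap single-sample updates, which is why SGD tends to make progress sooner on large datasets.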