Stochastic Gradient Descent:

Uses only a single training example to compute the gradient and update the parameters at each step.
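As a rough sketch of one SGD epoch, assuming a linear model with squared-error loss (the names sgd_epoch, w, X, y, and lr are illustrative, not from the original):

    import numpy as np

    def sgd_epoch(w, X, y, lr=0.01):
        # One pass over the data: compute the gradient from a single
        # example and update the parameters immediately.
        for i in np.random.permutation(len(X)):
            pred = X[i] @ w                  # scalar prediction
            grad = (pred - y[i]) * X[i]      # gradient of 0.5 * (pred - y)^2
            w = w - lr * grad
        return w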

Batch Gradient Descent:

Computes the gradient over the whole dataset and performs just one parameter update per iteration.
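A minimal sketch of a single batch gradient descent iteration under the same assumed linear-model setup (batch_gd_step is a hypothetical name):

    import numpy as np

    def batch_gd_step(w, X, y, lr=0.01):
        # One iteration: average the gradient over the whole dataset,
        # then perform a single parameter update.
        preds = X @ w
        grad = X.T @ (preds - y) / len(X)
        return w - lr * grad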

Mini-batch Gradient Descent:

Mini-batch gradient descent is a variation of stochastic gradient descent in which, instead of a single training example, a mini-batch of samples is used for each update (see the sketch below). It is one of the most popular optimization algorithms in practice.
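A sketch of one mini-batch epoch under the same assumptions; minibatch_epoch and batch_size are illustrative names, and 32 is just a commonly used default:

    import numpy as np

    def minibatch_epoch(w, X, y, lr=0.01, batch_size=32):
        # One pass over the data in shuffled mini-batches: each update
        # uses the average gradient of up to batch_size examples.
        idx = np.random.permutation(len(X))
        for start in range(0, len(X), batch_size):
            b = idx[start:start + batch_size]
            preds = X[b] @ w
            grad = X[b].T @ (preds - y[b]) / len(b)
            w = w - lr * grad
        return w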
