Most of the literature says this step is necessary for performance reasons. Is there any other reason? I have found that these algorithms underperform on faces with dark skin once the images are converted to grayscale.
If you use color images, you have to use more complex models. For example, to handle color images with a Convolutional Neural Network, 3 input channels are required (the RGB components) instead of 1 for a grayscale image. This is more computationally expensive, and the model is harder to train to convergence.
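To make the channel difference concrete, here is a minimal sketch (PyTorch is assumed here; the filter count and image size are arbitrary choices of mine) comparing the first convolutional layer of a network fed RGB images versus grayscale images:

```python
import torch
import torch.nn as nn

# First layer for RGB input: 3 input channels, 16 filters, 3x3 kernel
conv_rgb = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)
# First layer for grayscale input: 1 input channel, same filters and kernel
conv_gray = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3)

def param_count(module):
    return sum(p.numel() for p in module.parameters())

print(param_count(conv_rgb))   # 16*3*3*3 weights + 16 biases = 448
print(param_count(conv_gray))  # 16*1*3*3 weights + 16 biases = 160

# The grayscale layer also has a third of the input data to process per image
x_rgb = torch.randn(1, 3, 224, 224)   # one 224x224 RGB image
x_gray = torch.randn(1, 1, 224, 224)  # one 224x224 grayscale image
print(conv_rgb(x_rgb).shape, conv_gray(x_gray).shape)
```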
Nicolas Martin and Ali Tourani have already given the main reason for using grayscale images. It is also worth mentioning that many well-known image processing methods were developed only for grayscale images [Computer Vision, Shapiro & Stockman]. One classic example is sketched below.
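As an illustration of such a method, here is a rough sketch of Haar cascade face detection (my own example, not taken from Shapiro & Stockman), assuming the opencv-python package; the detector runs on the single-channel grayscale image, and "input.jpg" is just a placeholder file name:

```python
import cv2

img = cv2.imread("input.jpg")                 # OpenCV loads images as BGR
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # luminance-weighted conversion

# opencv-python ships the pretrained cascades under cv2.data.haarcascades
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
detector = cv2.CascadeClassifier(cascade_path)

# Haar-feature detection operates on the grayscale image
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
```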
Thank you everyone for your valuable responses. So computations on grayscale images are less expensive, but accuracy drops when detecting the faces of people with dark skin in grayscale images. What is the alternative for this kind of dataset?