For an academic project this year that involves some image processing, I have been given a large number of images (about 86,000), each 2480x2480 pixels (roughly 6 megapixels), and I need to separate the black images from the non-black ones.

My team's proposed approach is to scale the 2480x2480 images down to about 240x240 (by picking one pixel every, let's say, 10 pixels) and then plot a histogram of pixel value vs. number of pixels, where a pixel value of 0 is black and the highest pixel value (e.g. 255 for 8 bits per pixel) is white. If the mean of the histogram is very close to 0 and the variance around that mean is very small (to make sure we don't include low-brightness images in the black category), we categorize the image as black.
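To make the idea concrete, here is a minimal sketch of that check, assuming the image is already loaded as a 2-D grayscale NumPy array. The function name `is_black`, the `stride` parameter, and the thresholds `mean_max` and `var_max` are placeholders of ours, not tuned values. Since the mean and variance can be computed directly from the sampled pixels, the sketch skips building the histogram itself.

```python
import numpy as np

def is_black(image, stride=10, mean_max=5.0, var_max=4.0):
    """Return True if a grayscale image looks uniformly black.

    stride, mean_max and var_max are illustrative assumptions;
    they would need tuning against real images.
    """
    # Keep every `stride`-th pixel along both axes (2480 / 10 -> 248 per side).
    sample = image[::stride, ::stride].astype(np.float64)
    # Black means: average close to 0 and almost no spread around it.
    return bool(sample.mean() <= mean_max and sample.var() <= var_max)

if __name__ == "__main__":
    black = np.zeros((2480, 2480), dtype=np.uint8)
    dim = np.full((2480, 2480), 40, dtype=np.uint8)   # uniformly dark, not black
    spotted = black.copy()
    spotted[1000:1500, 1000:1500] = 255               # black with a bright patch
    for name, img in [("black", black), ("dim", dim), ("spotted", spotted)]:
        print(name, is_black(img))
```

The strided slice is a view, so only the sampled pixels are converted and read for the statistics.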

Do you know of a faster image processing approach for identifying whether an image is purely black, or is our approach above fine? Please comment if there are any mistakes in it.
