Let me add a comment. From a theoretical point of view, calculating the variance of 9 elements does not seem entirely reliable: the sample is very small, so the estimated variance poorly reflects the true value. For better results the sample should contain 30-40 elements; 100 elements is much better.
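To make the sample-size point concrete, here is a minimal simulation sketch using only the standard library. It assumes the data are normally distributed with true variance 1, which is my assumption and not something stated in the question; the function name `variance_estimate_spread` and the trial count are illustrative. For normal data the standard deviation of the unbiased variance estimate is sqrt(2/(n-1)) times the true variance, so roughly 0.5 for n = 9 and about 0.14 for n = 100.

```python
import random
import statistics

def variance_estimate_spread(n, trials=5000, seed=0):
    """Standard deviation of the unbiased sample variance over many trials.

    Each trial draws n values from a standard normal distribution
    (assumed true variance 1.0) and computes statistics.variance on them.
    """
    rng = random.Random(seed)
    estimates = [
        statistics.variance([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.stdev(estimates)

# Compare how much the variance estimate fluctuates for different sample sizes.
for n in (9, 30, 100):
    print(n, round(variance_estimate_spread(n), 3))
```

The spread shrinks as n grows, so a single variance computed from 9 values can easily be off by half of the true value, while one computed from 100 values is much tighter.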
Is there a tool in Python that can help visualize the variance of a set of images, say 4000 images? I want to know whether they are very similar, which I think would make for a poor dataset, or whether they differ enough (within the same class), which would imply that the images are diverse enough to improve training on the image classes.
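I am not aware of a single dedicated tool, but one possible starting point is a rough NumPy sketch like the one below. It assumes the images have already been loaded as same-sized grayscale arrays stacked into shape (N, H, W); the helper names `pixel_variance_map` and `distance_to_mean_image` are made up for illustration, and the random arrays stand in for real data.

```python
import numpy as np

def pixel_variance_map(images):
    """Per-pixel variance across a stack of same-sized images, shape (N, H, W)."""
    stack = np.asarray(images, dtype=np.float64)
    return stack.var(axis=0)

def distance_to_mean_image(images):
    """Euclidean distance of each image from the mean image of the stack."""
    stack = np.asarray(images, dtype=np.float64)
    flat = stack.reshape(len(stack), -1)
    return np.linalg.norm(flat - flat.mean(axis=0), axis=1)

# Synthetic stand-ins for real data (hypothetical): one varied set and
# one set of near-identical images.
rng = np.random.default_rng(0)
diverse = rng.random((50, 8, 8))
near_duplicates = 0.5 + 0.01 * rng.random((50, 8, 8))

print("mean pixel variance, diverse:", pixel_variance_map(diverse).mean())
print("mean pixel variance, near-duplicates:", pixel_variance_map(near_duplicates).mean())
```

The variance map is itself an image, so it could be displayed with something like matplotlib's `imshow`, and a histogram of the distances to the mean image would show whether many images sit almost on top of each other. Raw pixel variance is only a crude proxy for diversity, so treat this as a first look, not a definitive measure.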