I am implementing the variable threshold algorithm for image segmentation. The algorithm requires the mean and standard deviation of the 3x3 neighborhood of every pixel in the image. The results are good, but the code is too slow to run in real time. Since I only need the standard deviation for a comparison, I have replaced it with the variance to avoid the square root.
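For concreteness, here is a minimal sketch of the kind of per-neighborhood computation described above. It is not the actual code: it assumes a grayscale image stored as a list of rows, handles borders by replicating edge pixels, and uses the population variance (divide by 9). The function name `local_mean_var_naive` is hypothetical.

```python
def local_mean_var_naive(img):
    """Mean and variance of every pixel's 3x3 neighborhood.

    Each neighborhood is processed independently, so every pixel
    costs 9 reads and nothing is shared between adjacent windows.
    """
    h, w = len(img), len(img[0])
    mean = [[0.0] * w for _ in range(h)]
    var = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            total_sq = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Assumption: clamp coordinates so border pixels
                    # reuse the nearest valid pixel (edge replication).
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    v = float(img[yy][xx])
                    total += v
                    total_sq += v * v
            m = total / 9.0
            mean[y][x] = m
            # Var[x] = E[x^2] - E[x]^2; no square root is needed.
            var[y][x] = total_sq / 9.0 - m * m
    return mean, var
```

Calling `local_mean_var_naive(img)` returns two arrays of the same size as `img`. The cost is roughly 9 reads per pixel, and overlapping windows repeat most of that work.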
My question: right now I compute the mean and variance of each neighborhood independently. Can someone suggest a way to compute these quantities faster? Alternatively, is there a different way to apply the variable threshold algorithm that achieves the same result?