I am using k-means clustering for image segmentation; however, the output image changes from one execution to another. How can I solve this problem?
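Finding the global minimum of the k-means objective is NP-hard. In practice, therefore, you run the algorithm several times from different random initializations and keep the best outcome, i.e. the run with the lowest within-cluster sum of squared distances. That result is still only a local optimum in general, which is why the output can differ between executions.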
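The restart strategy described above can be sketched in plain Python as follows. This is only a minimal illustration, not a reference implementation: the function names (sqdist, kmeans, best_of_restarts) and the parameters n_restarts and seed are my own assumptions, and points are assumed to be tuples of floats (for an image, one tuple per pixel). Fixing the seed makes the whole procedure repeatable.

```python
import random

def sqdist(a, b):
    # Squared Euclidean distance between two points given as tuples.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, rng, iters=100):
    # Plain Lloyd iterations from a random initialization.
    # Returns (centers, labels, inertia).
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        # Assignment step: index of the nearest center for every point.
        new_labels = [min(range(k), key=lambda j: sqdist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # leave an empty cluster's center where it is
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    inertia = sum(sqdist(p, centers[l]) for p, l in zip(points, labels))
    return centers, labels, inertia

def best_of_restarts(points, k, n_restarts=10, seed=0):
    # Hypothetical helper: run k-means n_restarts times and keep the run
    # with the lowest inertia. The fixed seed makes the result repeatable.
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        result = kmeans(points, k, rng)
        if best is None or result[2] < best[2]:
            best = result
    return best
```

Calling best_of_restarts(points, k, seed=0) twice on the same data returns the same centers and labels, because every random choice is drawn from the same seeded generator.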
This happens because k-means is an unsupervised clustering method. On each execution it has no prior knowledge of the input data, so it does not know, for example, which cluster should be cluster number one; it simply assigns an arbitrary index to each cluster. If your data are not stochastic, there is a workaround. Provided the runs converge to the same partition, the center of each cluster is the same on every execution and only the numbering differs. So, alongside the cluster labels, you can also read the cluster centers computed by k-means and order the clusters by a fixed rule, for example by the centers' distance from the origin or by their coordinates, and use that order to define your own stable labels. For instance, you can define the cluster whose center has the smallest coordinates as cluster one, and so on.
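As a rough sketch of that relabeling idea, assuming you already have the centers and the per-point labels from any k-means implementation: the helper name canonical_labels is hypothetical, and the ordering rule (lexicographic order of the center coordinates) is just one possible choice. Note that this only removes the arbitrary numbering; it does not help if two runs converge to different centers.

```python
def canonical_labels(centers, labels):
    # Order the cluster indices by their center coordinates
    # (lexicographic comparison of tuples).
    order = sorted(range(len(centers)), key=lambda j: tuple(centers[j]))
    # Map each old cluster index to its position in the sorted order.
    remap = {old: new for new, old in enumerate(order)}
    new_centers = [centers[old] for old in order]
    new_labels = [remap[l] for l in labels]
    return new_centers, new_labels

# Two runs that found the same clusters but numbered them differently.
run_a = canonical_labels([(5.0, 5.0), (0.0, 0.0)], [1, 1, 0, 0])
run_b = canonical_labels([(0.0, 0.0), (5.0, 5.0)], [0, 0, 1, 1])
print(run_a == run_b)
```

For a grayscale image the centers are single intensities, so this amounts to numbering the segments from darkest to brightest.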
Hello Salwa, as Christoph Jud said, this is a consequence of the random initialization of the cluster centroids before the first iteration. To avoid different results, you should always start from the same initial centroids. As Christoph also said, finding the optimal set of centroids is an NP-hard problem. For a better initialization, I suggest that you consider the following method:
Arthur, D., & Vassilvitskii, S. (2007, January). k-means++: The advantages of careful seeding. In Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms (pp. 1027-1035). Society for Industrial and Applied Mathematics.
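To make the suggestion concrete, here is a minimal sketch of the seeding step from that paper, combined with a fixed random seed so that the same initial centroids are chosen on every run. The names kmeanspp_init and seed are my own, points are assumed to be tuples of floats, and the data are assumed to contain at least k distinct points. The returned centers would then be passed to an ordinary k-means iteration as the starting centroids.

```python
import random

def sqdist(a, b):
    # Squared Euclidean distance between two points given as tuples.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_init(points, k, seed=0):
    # Seeded generator: the same seed always yields the same centers.
    rng = random.Random(seed)
    # The first center is drawn uniformly at random from the data.
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        d2 = [min(sqdist(p, c) for c in centers) for p in points]
        # Draw the next center with probability proportional to that
        # squared distance, so far-away points are favored.
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers
```

Points that are already centers have weight zero and cannot be drawn again, so the k initial centers are distinct as long as the data contain k distinct points.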