23 October 2022

I understand that one method of initializing the k medoids is to pick the data points with the minimum sum of distances to every other data point. For the SWAP stage, some resources stop after a predefined number of iterations. However, the original PAM algorithm stops iterating when the Total Deviation (SSE) is minimized.
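To make the two stopping rules concrete, here is a rough sketch of the SWAP loop as it is usually described for PAM: every (medoid, non-medoid) pair is evaluated, the single best swap is applied if it lowers TD, and the pass is repeated until no swap helps. The function names, the `max_iter` parameter, the toy points, and the use of plain Euclidean distance are all assumptions for illustration, not taken from any particular library.

```python
import math

def total_deviation(points, medoids):
    """Sum over all points of the distance to the nearest medoid (by index)."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)

def pam_swap(points, medoids, max_iter=None):
    """Repeat best-improvement swaps until TD stops dropping.

    max_iter=None -> stop only at convergence (no swap lowers TD).
    max_iter=n    -> additionally stop after n applied swaps.
    """
    medoids = list(medoids)
    best_td = total_deviation(points, medoids)
    iterations = 0
    while max_iter is None or iterations < max_iter:
        best_swap = None
        for pos in range(len(medoids)):
            for c in range(len(points)):
                if c in medoids:
                    continue  # only non-medoids are swap candidates
                trial = medoids[:pos] + [c] + medoids[pos + 1:]
                td = total_deviation(points, trial)
                if td < best_td:
                    best_td, best_swap = td, trial
        if best_swap is None:
            break  # converged: no single swap lowers TD
        medoids = best_swap
        iterations += 1
    return medoids, best_td

# Hypothetical toy data: two tight groups plus one point in between.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
          (10.0, 10.0), (11.0, 10.0), (5.0, 5.0)]
converged_medoids, converged_td = pam_swap(points, [5, 1, 2])
capped_medoids, capped_td = pam_swap(points, [5, 1, 2], max_iter=1)
```

With an iteration cap the loop can return before TD has stopped improving, which is the practical difference between the two stopping rules.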

So, could you please confirm my method of doing the SWAP stage?

I sort the data points by their sum of distances to all other points, from lowest to highest. If K is set to 3, the first three data points in that order are the initial medoids.

{M3, M1, M5, x0, x2, x7, x9, x8, x6, x4}
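The initialization described above could be sketched roughly as follows. The helper names and the toy points are hypothetical, and Euclidean distance is an assumption; any dissimilarity measure could be substituted.

```python
import math

def sum_of_distances(points):
    """For each point, the sum of its Euclidean distances to all points."""
    return [sum(math.dist(p, q) for q in points) for p in points]

def initial_medoids(points, k):
    """Sort point indices by distance sum; the first k become medoids."""
    sums = sum_of_distances(points)
    order = sorted(range(len(points)), key=lambda i: sums[i])
    return order[:k], order[k:]  # (medoid indices, remaining candidates)

# Hypothetical toy data: two tight groups plus one point in between.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
          (10.0, 10.0), (11.0, 10.0), (5.0, 5.0)]
medoids, candidates = initial_medoids(points, k=3)
```

On this toy data the three lowest sums all belong to central or lower-left points, so no initial medoid lands in the upper-right group; the SWAP stage has to correct that.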

In the SWAP stage, first I calculate the Total Deviation (TD) for the initial medoids and set it aside for comparison.
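A minimal sketch of the TD calculation, assuming TD is the sum of each point's distance to its nearest medoid. It uses plain (non-squared) Euclidean distance; squaring the distances would give an SSE-style criterion instead. The function name, the toy points, and the medoid indices are hypothetical.

```python
import math

def total_deviation(points, medoids):
    """Sum over all points of the distance to the nearest medoid (by index)."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)

# Hypothetical toy data and an assumed initial medoid set (indices into points).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
          (10.0, 10.0), (11.0, 10.0), (5.0, 5.0)]
baseline_td = total_deviation(points, [5, 1, 2])
```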

Then I calculate the TD when swapping M3 with each of x0, x2, x7, x9, x8, x6, x4 in turn, and I stop as soon as the current TD is larger than the previous TD within the loop. I set the outcome of this process aside for comparison.

I repeat the last process for M1 and M5 and set the outcomes aside for comparison.

Finally, I end up with four TD values: one from the initial medoids and three from testing each initial medoid against the non-medoid data points. I then pick the configuration with the smallest TD.
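The procedure described in the steps above could be written roughly like this. It is only a sketch of those steps as stated, not the textbook PAM loop. Two details are assumptions: the first candidate for each medoid has no "previous TD" to compare against, and the outcome set aside for each medoid is the last TD seen before the rise. The names, toy points, and index lists are hypothetical.

```python
import math

def total_deviation(points, medoids):
    """Sum over all points of the distance to the nearest medoid (by index)."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)

def described_swap(points, medoids, candidates):
    """One pass: baseline TD plus one early-stopped scan per initial medoid."""
    outcomes = [(total_deviation(points, medoids), list(medoids))]  # baseline
    for pos in range(len(medoids)):
        prev = None  # assumption: no previous TD before the first candidate
        for c in candidates:  # candidates are walked in their sorted order
            trial = list(medoids)
            trial[pos] = c  # swap against the ORIGINAL medoid set
            td = total_deviation(points, trial)
            if prev is not None and td > prev[0]:
                break  # TD rose relative to the previous candidate: stop
            prev = (td, trial)
        if prev is not None:
            outcomes.append(prev)  # set this medoid's outcome aside
    return min(outcomes, key=lambda o: o[0])  # smallest TD wins

# Hypothetical toy data; indices assumed already sorted by distance sum.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
          (10.0, 10.0), (11.0, 10.0), (5.0, 5.0)]
best_td, best_medoids = described_swap(points, [5, 1, 2], [0, 3, 4])
```

Note that each scan swaps against the original medoid set, so at most one medoid changes in the final result, and the early stop can skip a candidate that would have given a lower TD later in the list.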

Is this the correct way to do the SWAP stage?
