One of the most important components of a clustering algorithm is the similarity measure used to determine how close two patterns are to one another. K-means clustering groups data vectors into a predefined number of clusters, using Euclidean distance as the similarity measure. Data vectors within a cluster have small Euclidean distances from one another and are associated with one centroid vector, which represents the “midpoint” of that cluster. The centroid vector is the mean of the data vectors that belong to the corresponding cluster.
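As a rough illustration of the description above, here is a minimal pure-Python sketch of the usual assign-then-average loop. The function names (`euclidean`, `mean_vector`, `kmeans`), the toy data, and the choice of initial centroids are all assumptions made for this example; they do not come from any particular library, and real implementations typically add smarter initialization and a convergence tolerance.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(column) / n for column in zip(*vectors))

def kmeans(data, init_centroids, max_iter=100):
    """Hypothetical helper: cluster `data` starting from `init_centroids`."""
    centroids = [tuple(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each vector joins the cluster of its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: euclidean(v, centroids[j]))
            for v in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids will not move
        labels = new_labels
        # Update step: each centroid becomes the mean of its member vectors.
        for j in range(len(centroids)):
            members = [v for v, label in zip(data, labels) if label == j]
            if members:  # leave an empty cluster's centroid where it is
                centroids[j] = mean_vector(members)
    return centroids, labels

# Toy data (made up for illustration): two well-separated groups in 2-D.
data = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, labels = kmeans(data, init_centroids=[data[0], data[3]])
```

With this toy input, the first three vectors end up in one cluster and the last three in the other, and each centroid is the mean of its three members. The result of k-means in general depends on the initial centroids, so a different starting choice can give a different partition.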
