I am a bit confused about the "number of clusters" and the "number of seeds" in the k-means clustering algorithm. Could you provide an example to clarify the difference? What is the effect of changing either one?
Deciding the best number of clusters is a different problem from deciding how to set the values of the seeds.
The first problem is how to choose the "value of k" in k-means (k = number of clusters). Each additional cluster improves the quality of the clustering, but at a decreasing rate, and having too many clusters may be useless for decision makers, data comprehension, data explanation, etc.
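For example, a common rough check is the "elbow" heuristic: run k-means for several values of k and watch how the within-cluster error keeps dropping but flattens out. The sketch below only illustrates that idea with scikit-learn on synthetic data (not Weka); the dataset and the range of k are placeholders, not a recommendation.

```python
# Sketch: within-cluster error shrinks with diminishing returns as k grows.
# scikit-learn on toy data, purely for illustration (not Weka).
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=4, random_state=0)  # toy data with 4 true clusters

for k in range(1, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    # inertia_ = sum of squared distances to the nearest centroid;
    # it always decreases as k grows, but the drop flattens ("elbow") near the true k.
    print(k, round(km.inertia_, 1))
```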
The number of initial seeds (initial cluster centers) is the same as the number of clusters (at least in the original k-means). The problem of choosing the VALUES of the seeds is different from the problem of choosing the number of clusters. Normally you would use random cluster centers, but some research points to better ways of choosing them; with better seeds, k-means converges faster and the quality of the clusters is better.
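One well-known example of such "better" seeding is k-means++. As a rough illustration (again scikit-learn on toy data, not Weka, with arbitrary parameters), the sketch below contrasts purely random initial centers with k-means++ initialization:

```python
# Sketch: same k, different ways of choosing the initial seeds.
# 'random' picks k points uniformly at random; 'k-means++' spreads the
# initial centers out, which usually means fewer iterations and lower error.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=5, random_state=0)

for init in ("random", "k-means++"):
    km = KMeans(n_clusters=5, init=init, n_init=1, random_state=0).fit(X)
    print(init, "iterations:", km.n_iter_, "inertia:", round(km.inertia_, 1))
```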
I remember that there are variations of k-means mixed with hierarchical methods. In those, you use more than k seeds and later collapse (unify) some clusters, as in hierarchical clustering, until the number of clusters is reduced to k. In that kind of method, the final number of clusters is not equal to the initial number of seeds.
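I don't remember the exact variant, but the general idea of starting with more seeds than final clusters and then merging can be sketched roughly like this (scikit-learn again, purely illustrative; the numbers of seeds and clusters are arbitrary):

```python
# Rough sketch of the "more seeds than clusters" idea:
# run k-means with many seeds, then merge the resulting centroids
# hierarchically until only the desired number of clusters remains.
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=3, random_state=0)

over = KMeans(n_clusters=12, n_init=10, random_state=0).fit(X)       # 12 seeds
merge = AgglomerativeClustering(n_clusters=3).fit(over.cluster_centers_)

# map each point's fine-grained centroid to a merged (final) cluster
final_labels = merge.labels_[over.labels_]
print(np.bincount(final_labels))  # sizes of the 3 final clusters
```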
@Alexey, yes, I am using Weka on a traffic dataset to extract some interesting results and identify future trends. Thanks for your time.
The seed number (any integer) controls the randomization of your initial k points, while k represents the number of clusters. Because k-means is sensitive to the initial points, you will have to experiment with different seeds to check the stability of your clusters. K itself is user-defined and can be guided by domain knowledge and other practical factors.
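As a quick illustration of that kind of experiment (done here with scikit-learn rather than Weka, and with arbitrary seed values), you can refit with several seeds and measure how well the resulting partitions agree:

```python
# Sketch: re-run k-means with different seeds and measure how stable
# the resulting partitions are (adjusted Rand index of 1.0 = identical labelings).
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

X, _ = make_blobs(n_samples=500, centers=4, random_state=0)

seeds = (1, 7, 42, 100)
labels = [KMeans(n_clusters=4, n_init=1, random_state=s).fit_predict(X)
          for s in seeds]

reference = labels[0]
for s, lab in zip(seeds, labels):
    print("seed", s, "agreement with first run:",
          round(adjusted_rand_score(reference, lab), 3))
```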
What does the seed mean in Weka? For example, if I configure seed = 100 and use the split-dataset test option in Weka, what will happen to my dataset?