Sometimes a clustering algorithm that works well on a dataset with many features cannot be used on a dataset with few features. How does the number of features affect the choice of a suitable clustering algorithm?
Well, I wrote an algorithm based on type-2 fuzzy sets and ran it on 10 standard UCI datasets. On some datasets with about 4 features, my algorithm loses accuracy. For example, the Iris dataset has 150 samples and 4 features, and my algorithm loses accuracy there; on the other hand, the Wine dataset has 178 samples and 13 features, and the accuracy does not drop. My algorithm even achieves higher accuracy than General Type-2 FCM (GT2 FCM) on the Wine dataset.
I also added some noise to these datasets, and the results were the same.
A classic approach to testing clustering algorithms is to generate data from mixture distributions of various kinds. This will allow you to vary different parameters like dimensionality without changing the difficulty of the actual clustering.
For instance, one test for the initialization of k-means uses an even mixture of very small symmetric clusters positioned at a few corners of the unit hypercube. You can adjust the dimensionality of this data very easily and test how various algorithms treat it. Since you know the data is highly separated, you have also isolated the convergence properties from the initialization properties.
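A minimal sketch of this test, using only numpy (the generator, the plain Lloyd's k-means, and all parameter values here are my own illustrative choices, not a standard benchmark):

```python
import numpy as np

def make_corner_clusters(dim, n_clusters, n_per_cluster, scale=0.01, seed=0):
    """Even mixture of small symmetric Gaussian clusters placed at
    distinct corners of the unit hypercube (dim <= 20 here)."""
    rng = np.random.default_rng(seed)
    # pick n_clusters distinct corner indices, decode them to 0/1 vectors
    corners = rng.choice(2 ** dim, size=n_clusters, replace=False)
    centers = np.array([[(c >> b) & 1 for b in range(dim)] for c in corners],
                       dtype=float)
    X = np.vstack([centers[k] + scale * rng.standard_normal((n_per_cluster, dim))
                   for k in range(n_clusters)])
    y = np.repeat(np.arange(n_clusters), n_per_cluster)
    return X, y

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm with random-point initialization."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    # final assignment against the last centers
    return np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)

X, y = make_corner_clusters(dim=8, n_clusters=4, n_per_cluster=50)
labels = kmeans(X, k=4)
```

Because the true clusters are tiny blobs separated by at least unit distance, any disagreement between `labels` and `y` (up to relabeling) points to an initialization failure rather than a convergence failure, and you can rerun the same test with `dim=4` or `dim=50` to isolate the effect of dimensionality.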
The higher the number of features, the more information is available and the more degrees of freedom you have, which means greater flexibility.
BUT more features do not necessarily provide more useful information! Your results will get worse if you add irrelevant features, because they introduce extra uncertainty and noise.
So the issue lies in how much you know about the features and how you choose the best combination of them.
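This effect is easy to see numerically. The toy setup below is my own illustration (two clusters separated on a single informative feature, with pure-noise features appended): as irrelevant dimensions are added, the gap between between-cluster and within-cluster distances shrinks, which is exactly what makes clustering harder:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
# one informative feature: two tight clusters at 0 and 5
signal = np.concatenate([rng.normal(0.0, 0.3, n),
                         rng.normal(5.0, 0.3, n)])[:, None]
y = np.repeat([0, 1], n)

def contrast(X, y):
    """Mean between-cluster distance divided by mean within-cluster distance.
    Values near 1 mean the cluster structure is washed out."""
    D = np.sqrt(((X[:, None] - X[None]) ** 2).sum(-1))
    same = y[:, None] == y[None]
    within = D[same & (D > 0)].mean()
    between = D[~same].mean()
    return between / within

for d_noise in [0, 4, 16, 64]:
    noise = rng.normal(0.0, 1.0, (2 * n, d_noise))
    X = np.hstack([signal, noise])
    print(f"{d_noise:3d} noise features -> contrast {contrast(X, y):.2f}")
```

With no noise features the contrast is large (the clusters are obvious); with 64 irrelevant features it drops close to 1, even though the informative feature is still present. This is why feature selection, or a weighting scheme that can down-weight irrelevant features, matters more as dimensionality grows.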