Hello everyone. I am trying to use k-nearest-neighbour (KNN) analysis to set minPts in the DBSCAN clustering algorithm. My dataset has only 4 variables and 935 observations. With k = 5 (number of variables + 1), DBSCAN returns 2 clusters: one of 911 observations and one of 8 observations. With a larger k, set to sqrt(number of observations) as many papers suggest, I get 909 observations in a single cluster and the rest are classified as noise points.

Both results are plausible, but their meanings are fundamentally different. How can I avoid this arbitrary choice of minPts, and hence of k?
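
For reference, here is a minimal sketch of the comparison described above, assuming the KNN analysis means the usual sorted k-distance curve (the distance from each point to its k-th nearest neighbour). The data are synthetic stand-ins, not my real dataset, and the names `k_distances`, `k_small` and `k_large` are only illustrative. Whether the point itself counts towards minPts depends on the implementation, so the mapping from k to minPts is an assumption too.

```python
import math
import random


def k_distances(points, k):
    """Return the sorted distances from each point to its k-th nearest neighbour."""
    result = []
    for i, p in enumerate(points):
        # Distances to every other point, excluding the point itself.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)


# Synthetic stand-in data (assumption): 200 observations, 4 variables.
random.seed(0)
n_obs, n_vars = 200, 4
points = [[random.gauss(0, 1) for _ in range(n_vars)] for _ in range(n_obs)]

# The two heuristics for k mentioned in the question.
k_small = n_vars + 1               # number of variables + 1
k_large = round(math.sqrt(n_obs))  # square root of the number of observations

curves = {k: k_distances(points, k) for k in (k_small, k_large)}
for k, curve in curves.items():
    # The "knee" of this curve is what is usually read off as eps for DBSCAN.
    print(f"k={k}: median k-distance={curve[len(curve) // 2]:.3f}, "
          f"max k-distance={curve[-1]:.3f}")
```

The larger k always gives a k-distance curve that is at least as high, so the eps read from its knee is larger too, which may be part of why the two settings merge or split points differently.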

Thanks!
