I have two-dimensional data, i.e. two variables (e.g. x and y) with 1500 observations each. Visual inspection clearly shows that the data are clustered according to a third dimension, z, in a binary manner: the points fall into one cluster when z is below some threshold and into another when z is above it.

I am looking for a clustering algorithm that can choose the best threshold on z according to some criterion, for example one that minimizes the within-cluster variance of x and y. I need a reproducible method so that I do not "just choose a threshold by eye".
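To make "best threshold" concrete, here is a minimal sketch of one possible criterion, not an established method for this data: scan every candidate cut on z and keep the one that minimizes the pooled within-group sum of squares of (x, y). The function name `best_z_threshold`, the `min_size` parameter, and the synthetic data (a true cut at z = 4) are all assumptions made for illustration.

```python
import numpy as np

def best_z_threshold(x, y, z, min_size=2):
    """Return (threshold, cost) for the split z <= t vs. z > t that minimizes
    the pooled within-group sum of squares of the (x, y) points."""
    pts = np.column_stack([x, y]).astype(float)
    pts -= pts.mean(axis=0)          # centering improves numerical stability
    z = np.asarray(z, dtype=float)
    order = np.argsort(z, kind="stable")
    pts, zs = pts[order], z[order]
    n = len(zs)

    # Prefix sums give each group's sum of squares in O(1) per candidate split.
    csum = np.cumsum(pts, axis=0)
    csq = np.cumsum((pts ** 2).sum(axis=1))
    total_sum, total_sq = csum[-1], csq[-1]

    best_t, best_cost = None, np.inf
    for k in range(min_size, n - min_size + 1):   # k = size of the low-z group
        if zs[k - 1] == zs[k]:
            continue                               # cannot cut between tied z values
        low_ss = csq[k - 1] - (csum[k - 1] ** 2).sum() / k
        high_sum = total_sum - csum[k - 1]
        high_ss = (total_sq - csq[k - 1]) - (high_sum ** 2).sum() / (n - k)
        cost = low_ss + high_ss
        if cost < best_cost:
            best_t, best_cost = 0.5 * (zs[k - 1] + zs[k]), cost
    return best_t, best_cost

# Synthetic stand-in for the real data (assumed): two blobs separated at z = 4.
rng = np.random.default_rng(0)
z = rng.uniform(0, 10, 1500)
centre = np.where(z < 4, 0.0, 5.0)
x = centre + rng.normal(0, 0.5, 1500)
y = centre + rng.normal(0, 0.5, 1500)

t, cost = best_z_threshold(x, y, z)
print(f"estimated threshold: {t:.3f}")
```

The threshold is reported as the midpoint between the two neighbouring z values at the best cut. This is essentially a one-level regression tree on z with (x, y) as the response, so a different criterion (e.g. a likelihood-based one) could be substituted for the sum of squares without changing the scan.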

See examples below:
