31 December 2015

Hierarchical clustering is a well-known algorithm for grouping data points into clusters. Unlike K-means, it does not require the number of clusters to be specified a priori; instead, a cut-off value can be used to obtain the clusters at a particular level of the hierarchy. The algorithm is either bottom-up (agglomerative) or top-down (divisive).
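To make the idea of a cut-off concrete, here is a minimal sketch of the bottom-up variant in plain Python. It assumes single linkage and Euclidean distance, which are illustrative choices rather than anything fixed by the algorithm, and the names `agglomerative` and `cut_distance` are hypothetical. It is a naive version meant only to show the merge loop, not something suited to a large dataset.

```python
from math import dist  # Euclidean distance, Python 3.8+


def agglomerative(points, cut_distance):
    """Naive bottom-up clustering (single linkage, an assumed choice).

    Merging stops once the closest pair of clusters is farther apart
    than cut_distance, so no cluster count is given in advance.
    """
    # Start with every point in its own cluster.
    clusters = [[p] for p in points]

    def linkage(a, b):
        # Single linkage: distance between the two closest members.
        return min(dist(p, q) for p in a for q in b)

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > cut_distance:
            break  # the cut-off has been reached
        # Merge cluster j into cluster i (j > i, so index i stays valid).
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (20.0, 20.0)]
clusters = agglomerative(points, cut_distance=2.0)
print(len(clusters))  # number of clusters found at this cut-off
```

With these five points and a cut-off of 2.0, the two nearby pairs are merged and the distant point stays alone. Raising or lowering `cut_distance` changes how many clusters come out, which is the role the cut-off plays in place of k.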

Recently I needed to cluster a large dataset using the hierarchical agglomerative clustering (HAC) algorithm, so I checked Mahout's list of clustering algorithms. Mahout offers only a limited set of clustering algorithms, and all of them require the number of clusters k to be known in advance. I also tried the HAC implementation in WEKA, but it ran for two days with a 2 GB heap without producing any results. Any advice on this? Thanks
