After forming clusters of bank customers with algorithms such as k-means and decision trees, how can we validate the clusters produced by our method?
I recently published a paper on clustering in banking using the SOM (self-organizing map) methodology, following the predetermined procedure provided by Viscovery, the software we used. We provided a logical validation of the clusters. It would be even better if you could have the clusters validated by experts from the field and present that validation in the paper (our paper does not include this).
Bach, M. P., Juković, S., Dumičić, K., & Šarlija, N. (2014). Business Client Segmentation in Banking Using Self-Organizing Maps. South East European Journal of Economics and Business, 8(2), 32-41.
To evaluate the result of a clustering algorithm, we distinguish three kinds of techniques:
-External Index: Based on previous knowledge about the data, it is used to measure the extent to which cluster labels match externally supplied class labels.
-Internal Index: Based on the information intrinsic to the data alone, it is used to measure the goodness of a clustering structure without respect to external information.
-Relative Index: Used to compare two different clusterings or clusters.
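As a minimal sketch of the first two kinds of index, the snippet below computes one external index (adjusted Rand index) and one internal index (silhouette coefficient) with scikit-learn. The data are synthetic blobs, not bank customers, and `true_labels` is a hypothetical stand-in for externally supplied class labels; in a real segmentation such labels are often unavailable, in which case only the internal index applies.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score

# Synthetic stand-in for customer features (assumption, for illustration only).
# true_labels plays the role of externally supplied class labels.
X, true_labels = make_blobs(n_samples=300, centers=3, random_state=0)

# Cluster the data with k-means.
cluster_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# External index: agreement between cluster labels and the external labels.
external = adjusted_rand_score(true_labels, cluster_labels)

# Internal index: uses only the data and the cluster assignment.
internal = silhouette_score(X, cluster_labels)

print(f"adjusted Rand index (external): {external:.3f}")
print(f"silhouette coefficient (internal): {internal:.3f}")
```

Higher values are better for both scores; the adjusted Rand index is close to 0 for a random labelling and 1 for a perfect match.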
Many useful indexes are available, and the right choice depends on the goal of your clustering. Examples include graph-based cohesion, prototype-based cohesion, graph-based separation and cohesion, the sum of squared errors (SSE), the between-group sum of squares (SSB), the Davies-Bouldin index, the Dunn index, and so on. After that, you can ask experts to give their opinion of the clusters.
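To make SSE and SSB concrete, here is a rough sketch that computes both by hand with NumPy and obtains the Davies-Bouldin index from scikit-learn. The data are again synthetic (an assumption, not real customer data). SSE measures cohesion (spread of points around their own cluster centroid), SSB measures separation (spread of centroids around the overall mean), and the two add up to the total sum of squares.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import davies_bouldin_score

# Synthetic stand-in for customer features (assumption, for illustration only).
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
labels = km.labels_

overall_mean = X.mean(axis=0)
sse = 0.0  # within-cluster sum of squared errors (cohesion)
ssb = 0.0  # between-group sum of squares (separation)
for k in np.unique(labels):
    members = X[labels == k]
    centroid = members.mean(axis=0)
    sse += ((members - centroid) ** 2).sum()
    ssb += len(members) * ((centroid - overall_mean) ** 2).sum()

# Total sum of squares; equals SSE + SSB for any partition of the data.
tss = ((X - overall_mean) ** 2).sum()

# Davies-Bouldin index: lower values indicate better-separated clusters.
db = davies_bouldin_score(X, labels)

print(f"SSE={sse:.1f}  SSB={ssb:.1f}  TSS={tss:.1f}  Davies-Bouldin={db:.3f}")
```

For a fixed dataset, a lower SSE (and therefore a higher SSB) indicates tighter, better-separated clusters, but SSE always decreases as the number of clusters grows, so it is best compared across solutions with the same number of clusters.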
There are several methods for evaluating clustering performance and, unfortunately but rather predictably, no consensus on which is best :-) For an excellent, concise introduction to a few of these methods, including brief explanations, mathematical background and references, see http://scikit-learn.org/stable/modules/clustering.html#clustering-performance-evaluation. If you are a Python user, you can also use the scikit-learn code described there to achieve your goal, or at least to get started.
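As one possible way to get started, the sketch below compares k-means solutions for several values of k using two scikit-learn metrics, the silhouette coefficient and the Calinski-Harabasz index. The synthetic data, the range of k, and the choice of silhouette as the selection criterion are all assumptions made for illustration; the scores only suggest candidates, which should still be checked by domain experts.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import calinski_harabasz_score, silhouette_score

# Synthetic stand-in for customer features (assumption, for illustration only).
X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# Score one k-means solution per candidate number of clusters.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = (silhouette_score(X, labels), calinski_harabasz_score(X, labels))

# Pick the k with the highest silhouette coefficient (higher is better).
best_k = max(scores, key=lambda k: scores[k][0])

for k, (sil, ch) in scores.items():
    print(f"k={k}: silhouette={sil:.3f}, Calinski-Harabasz={ch:.1f}")
print(f"best k by silhouette: {best_k}")
```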