How to validate the clusters formed for the Banking Customers Data?

Mirjana Pejic Bach

Dear Mayur, here is a link on clustering validation in general:

http://web.itu.edu.tr/sgunduz/courses/verimaden/paper/validity_survey.pdf

I recently published a paper on clustering in banking using the SOM (self-organizing map) methodology. We used a predetermined methodology provided by Viscovery (the software we used); however, we also provided a logical validation of the clusters. It would be even better if you could have the clusters validated by experts from the field and present that validation in the paper (our paper does not include this).

Here are the links:

https://www.researchgate.net/publication/276089828_Business_Client_Segmentation_in_Banking_Using_Self-Organizing_Maps?ev=prf_pub

Bach, M. P., Juković, S., Dumičić, K., & Šarlija, N. (2014). Business Client Segmentation in Banking Using Self-Organizing Maps. South East European Journal of Economics and Business, 8(2), 32-41.

Mohamed Ben Mzoughia

To evaluate the result of a clustering algorithm, we distinguish three kinds of techniques:

- External index: based on previous knowledge about the data, it measures the extent to which cluster labels match externally supplied class labels.

- Internal index: based on information intrinsic to the data alone, it measures the goodness of a clustering structure without reference to external information.

- Relative index: used to compare two different clusterings or clusters.
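As a small illustration of an external index, here is a minimal sketch of the plain Rand index in pure Python. It assumes you have some externally supplied class labels (for example, segments assigned by bank staff) to compare against; the function name `rand_index` and the toy labels are made up for this example.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of point pairs on which the two labelings agree.

    A pair "agrees" when both labelings put the two points in the same
    group, or both put them in different groups.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same points")
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = 0
    for i, j in pairs:
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true == same_pred:
            agree += 1
    return agree / len(pairs)


# Hypothetical example: expert-assigned segments vs. cluster ids.
expert_segments = ["retail", "retail", "retail", "corporate", "corporate"]
cluster_ids = [0, 0, 1, 1, 1]
print(rand_index(expert_segments, cluster_ids))
```

The index only looks at pairs, so the actual label names do not matter: a clustering that matches the external classes under any renaming scores 1.0.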

Behzad Maleki Vishkaei

Many useful indexes are available, and the right choice depends on the goal of your clustering. Examples include graph-based cohesion, prototype-based cohesion, graph-based separation and cohesion, the sum of squared errors (SSE), the between-group sum of squares (SSB), the Davies-Bouldin index, the Dunn index, and so on. After that, you can ask experts for their opinions on the clusters.
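To make two of these measures concrete, here is a minimal pure-Python sketch of SSE (within-cluster cohesion) and SSB (between-cluster separation). It assumes numeric features, Euclidean distance, and cluster centroids as prototypes; the helper names and the toy points are illustrative only.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse_ssb(points, labels):
    """Return (SSE, SSB) for a hard clustering given as one label per point."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    overall = centroid(points)
    sse = 0.0
    ssb = 0.0
    for members in clusters.values():
        c = centroid(members)
        # Cohesion: squared distances of members to their own centroid.
        sse += sum(sq_dist(p, c) for p in members)
        # Separation: size-weighted squared distance of centroid to overall mean.
        ssb += len(members) * sq_dist(c, overall)
    return sse, ssb


# Toy data standing in for two scaled customer features (assumption).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = [0, 0, 0, 1, 1, 1]
sse, ssb = sse_ssb(points, labels)
print("SSE:", sse, "SSB:", ssb)
```

For a fixed data set, SSE + SSB equals the total sum of squares around the overall mean, so a lower SSE automatically means a higher SSB. Both also improve mechanically as the number of clusters grows, which is why they are usually compared across candidate values of k rather than read in isolation.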

Oleksii Tyshchenko

There are some popular clustering validity measures:

Partition Coefficient (PC)

Classification Entropy (CE)

Partition Index (SC)

Separation Index (S)

Xie and Beni's Index (XB)

Dunn's Index (DI)

Alternative Dunn Index (ADI).

You should try one of them.
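As a rough sketch of the first two measures: PC and CE are defined on a fuzzy membership matrix (such as the one produced by fuzzy c-means), so the code below assumes a list of rows `U`, one row per sample and one column per cluster, with each row summing to 1. The matrix values are invented for illustration.

```python
import math


def partition_coefficient(U):
    """PC = (1/N) * sum of squared memberships; 1.0 means a crisp partition."""
    n = len(U)
    return sum(u ** 2 for row in U for u in row) / n


def classification_entropy(U):
    """CE = -(1/N) * sum of u * ln(u); 0 means a crisp partition."""
    n = len(U)
    # Terms with u == 0 contribute nothing (limit of u * ln(u) as u -> 0).
    return -sum(u * math.log(u) for row in U for u in row if u > 0) / n


# Hypothetical memberships of four customers in two fuzzy clusters.
U = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.2, 0.8],
    [0.5, 0.5],
]
print("PC:", partition_coefficient(U))
print("CE:", classification_entropy(U))
```

With c clusters, PC ranges from 1/c (completely fuzzy) to 1 (crisp), and CE ranges from 0 (crisp) to ln(c). Neither uses the data itself, only the memberships, so it is worth pairing them with a geometry-aware measure such as Xie and Beni's index or Dunn's index.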

Christos Nicolaou

There are several methods for evaluating clustering performance and, unfortunately but rather to be expected, no consensus on which is best :-) For an excellent, concise introduction to a few of these methods, including brief explanations, mathematical background and references, see http://scikit-learn.org/stable/modules/clustering.html#clustering-performance-evaluation. If you are a Python user, you can also use the scikit-learn code described there to achieve your goal, or at least to get started.
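A minimal getting-started sketch with scikit-learn might look like the following. It assumes scikit-learn is installed and uses synthetic blobs from `make_blobs` as a stand-in for scaled customer features, with k-means as the clustering method purely for illustration; substitute your own feature matrix and algorithm.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

# Synthetic stand-in for scaled customer features (assumption, not real data).
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=0,
)

results = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    results[k] = {
        "silhouette": silhouette_score(X, labels),  # higher is better
        "davies_bouldin": davies_bouldin_score(X, labels),  # lower is better
        "calinski_harabasz": calinski_harabasz_score(X, labels),  # higher is better
    }

# Pick the k with the highest silhouette score among the candidates.
best_k = max(results, key=lambda k: results[k]["silhouette"])
for k, scores in results.items():
    print(k, {name: round(float(v), 3) for name, v in scores.items()})
print("k with the highest silhouette:", best_k)
```

All three scores are internal measures, so they need no ground-truth labels. If you do have known classes for some customers, the same module also offers external measures such as `adjusted_rand_score`.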

Nabila Nisha

https://www.researchgate.net/publication/287162366_Transaction_Banking_Services_The_Case_of_Bangladesh

Book Transaction Banking Services: The Case of Bangladesh