How can we use unsupervised clustering models for classification tasks?

Daniel Urda

Dear Negar,

Unsupervised models are used when the outcome (or class label) of each sample is not available in your data. If you want to use your method to perform a classification task, you need those labels in order to assess how good the method is. If this is the case, i.e., class labels are available, I recommend that you test and compare your method with other well-known supervised machine learning models.

Negar Ahmadi

Behzad Rouhanizadeh, thanks a lot.

Daniel Urda, thank you so much. Yes, class labels are available: the dataset contains EEG recordings from patients with different brain disorders. I built this model for another clustering task and it worked really well, so I wanted to try it again (with some modification or post-processing to convert it into a classification model) for the classification of brain disorders.

Elena Mikhalkova

You can run your clustering method on the data with labels removed and then check its efficiency by counting how many samples labeled with the same class ended up in the same clusters. The catch is that you cannot use precision, recall, and the other metrics that you usually use to check the efficiency of classification. The most common metrics for clustering evaluation are Rand, Jaccard, and B-cubed. In paragraph "5.3 Evaluating clusters" of this preprint I suggest using the F-measure: Preprint A Linguistic Model of Classifying Community Pages in a Socia...

You can see in the formula how different it is from the F-measure used for the analysis of classification. Note that you will not be able to compare the efficiency of your clustering method with that of classification methods.

But if by classification you don't mean a Machine Learning method, but just that you want to use your clusters as a basis for terminological research - for example, that these clusters relate to some expert-defined classes of disorders, then you need to compare your own clustering method to other clustering (sic!) methods, not ML classification ones.
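As a rough illustration of the pair-counting metrics mentioned above, here is a minimal sketch of the Rand and Jaccard indices in plain Python. The function names and the toy labels are hypothetical and chosen only for this example. Both indices look at every pair of samples and ask whether the pair shares a class and whether it shares a cluster.

```python
from itertools import combinations

def pair_counts(true_labels, cluster_ids):
    """Count sample pairs by whether they share a class and/or a cluster."""
    same_both = same_class_only = same_cluster_only = neither = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_class = true_labels[i] == true_labels[j]
        same_cluster = cluster_ids[i] == cluster_ids[j]
        if same_class and same_cluster:
            same_both += 1
        elif same_class:
            same_class_only += 1
        elif same_cluster:
            same_cluster_only += 1
        else:
            neither += 1
    return same_both, same_class_only, same_cluster_only, neither

def rand_index(true_labels, cluster_ids):
    """Fraction of pairs on which the classes and the clusters agree."""
    a, b, c, d = pair_counts(true_labels, cluster_ids)
    return (a + d) / (a + b + c + d)

def jaccard_index(true_labels, cluster_ids):
    """Like the Rand index, but ignoring pairs separated in both partitions."""
    a, b, c, _ = pair_counts(true_labels, cluster_ids)
    return a / (a + b + c) if (a + b + c) else 1.0

# Hypothetical toy data: two classes, three clusters.
true_labels = ["A", "A", "A", "B", "B", "B"]
cluster_ids = [0, 0, 1, 1, 2, 2]

print(rand_index(true_labels, cluster_ids))
print(jaccard_index(true_labels, cluster_ids))
```

Neither index depends on how the clusters are numbered, which is why they can be used without first matching clusters to classes.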

Elena Mikhalkova, thanks a lot, Elena.

Emilio Rodigues

Hello. Unsupervised clustering methods create groups of instances that have similarities. If you do not have the classes associated with the data set, you can use clustering methods to find related instances. A specialist can then verify the groups and define labels (classes) for them.

After that, you can use supervised methods to learn from your new labeled data set. Good luck!

Chuanxing Geng

I hope that our work (Preprint Collective Decision for Open Set Recognition (Previous title...) can give you some inspiration.

It just occurred to me that if your clustering method requires you to specify the number of clusters in the output, then you can (sic!) compare its result to a classifier. A classifier assigns every item to a class in a given set of classes. If your clustering method always returns the same number of clusters as there are classes in the classifier, then for each class you can check which cluster contains the largest number of similar items (items belonging to that class). This will be the so-called "best cluster". Then you can compare your best clusters to the classes returned by the classifier.

Also, if your clustering method does not yet require the number of clusters to be specified, it may be a good idea to introduce this feature; it would move your method closer to classification. IMHO.
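One way to make the "best cluster" comparison concrete is to map each cluster to the class that is most frequent inside it and then score the clusters as if they were classifier predictions. The sketch below assumes this majority-vote mapping, which is a simplification of the idea above, and uses hypothetical function names and toy data.

```python
from collections import Counter, defaultdict

def majority_class_map(true_labels, cluster_ids):
    """Map each cluster id to the class label that occurs most often in it."""
    per_cluster = defaultdict(Counter)
    for label, cid in zip(true_labels, cluster_ids):
        per_cluster[cid][label] += 1
    return {cid: counts.most_common(1)[0][0] for cid, counts in per_cluster.items()}

def clusters_as_predictions(true_labels, cluster_ids):
    """Turn cluster ids into class predictions via the majority mapping."""
    mapping = majority_class_map(true_labels, cluster_ids)
    return [mapping[cid] for cid in cluster_ids]

def accuracy(true_labels, predicted):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(true_labels, predicted)) / len(true_labels)

# Hypothetical toy data: two classes and two clusters.
true_labels = ["A", "A", "A", "B", "B", "B"]
cluster_ids = [1, 1, 0, 0, 0, 1]

predicted = clusters_as_predictions(true_labels, cluster_ids)
print(predicted, accuracy(true_labels, predicted))
```

Because the mapping is fitted on the same labels it is scored against, the resulting accuracy is optimistic. For a fairer comparison with a classifier, the mapping could be fitted on a training split and applied to held-out data.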

Ajit kumar Roy

In clustering, the main goal is to group the data points in a data set into disjoint sets. The first clustering algorithm to implement is k-means, which is the most widely used clustering algorithm. To scale up k-means, one can learn about the general MapReduce framework for parallelizing and distributing computations, and then how the iterations of k-means can use this framework.
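To show how a k-means iteration splits into a map step and a reduce step, here is a single-process sketch in plain Python. It only mimics the structure of a MapReduce job and does not use any distributed framework. The function names, the fixed initial centroids, and the toy points are assumptions made for this example.

```python
from collections import defaultdict

def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def map_assign(point, centroids):
    """Map step: emit (index of the nearest centroid, point)."""
    nearest = min(range(len(centroids)),
                  key=lambda k: squared_distance(point, centroids[k]))
    return nearest, point

def reduce_mean(points):
    """Reduce step: average all points that were assigned to one centroid."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, centroids, max_iter=100):
    """Repeat map and reduce until the centroids stop changing."""
    for _ in range(max_iter):
        groups = defaultdict(list)
        for point in points:
            k, p = map_assign(point, centroids)
            groups[k].append(p)
        # An empty cluster keeps its previous centroid.
        new_centroids = [reduce_mean(groups[k]) if groups[k] else centroids[k]
                         for k in range(len(centroids))]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids

# Hypothetical toy data with two obvious groups.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
print(kmeans(points, [(0, 0), (10, 0)]))
```

In a real distributed setting the map calls would run in parallel over partitions of the data, and the reduce step would combine partial sums and counts per centroid instead of collecting every point.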

Million Meshesha

For supervised learning we need a labeled data set. If one is not available, unsupervised learning algorithms can be run to label the unlabeled data automatically. Once the data is labeled using clustering algorithms, supervised learning algorithms can be applied. To link the two tasks, a simple script can be written that feeds the output of clustering in as the input for the classification task.
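A linking script of this kind could look roughly like the sketch below. It assumes the clustering step has already produced one cluster id per sample, turns those ids into pseudo-labels, and uses a small k-nearest-neighbour classifier as a stand-in for whichever supervised method is preferred. All names and data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Step 1 (assumed to be done elsewhere): clustering output, one id per sample.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
cluster_ids = [0, 0, 0, 1, 1, 1]

# Step 2: the cluster ids become pseudo-labels for supervised training.
pseudo_labels = [f"cluster_{cid}" for cid in cluster_ids]

# Step 3: the classifier trained on the pseudo-labels handles new samples.
print(knn_predict(points, pseudo_labels, (0.3, 0.2)))
print(knn_predict(points, pseudo_labels, (4.8, 5.0)))
```

The classifier can only be as good as the pseudo-labels, so it is worth having an expert inspect the clusters before treating them as classes.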

Yuriy P. Zaychenko

It's very simple. If you have an efficient, reliable clustering algorithm, apply it to the whole data set and split the data into clusters; each cluster then represents a separate class. After that, train your classifier using a special training algorithm.

Amir Hossein Poorjam

An efficient approach to classifying the clusters is to use a multinomial Naive Bayes classifier.

In this framework, the feature vectors for training the classifier are the frequencies with which certain events (clusters) have been generated by a multinomial (p_1, ..., p_n), where p_i is the probability that event i occurs. A feature vector x = (x_1, ..., x_n) is a histogram, with x_i counting the number of times event i was observed.

This way, you train the classifier with some observations and let the trained classifier decide about the rest of the unlabeled data.
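A minimal sketch of such a classifier is given below, assuming each sample is a histogram of event (cluster) counts and using Laplace smoothing with a parameter `alpha`. The smoothing, the function names, and the toy histograms are assumptions added for this example and are not part of the description above.

```python
import math
from collections import defaultdict

def train_multinomial_nb(histograms, labels, alpha=1.0):
    """Estimate log priors and smoothed log event probabilities per class."""
    n_events = len(histograms[0])
    totals = defaultdict(lambda: [0.0] * n_events)
    class_counts = defaultdict(int)
    for x, y in zip(histograms, labels):
        class_counts[y] += 1
        for i, count in enumerate(x):
            totals[y][i] += count
    model = {}
    for y, event_totals in totals.items():
        denom = sum(event_totals) + alpha * n_events
        log_p = [math.log((t + alpha) / denom) for t in event_totals]
        log_prior = math.log(class_counts[y] / len(labels))
        model[y] = (log_prior, log_p)
    return model

def predict(model, x):
    """Return the class maximizing log prior + sum_i x_i * log p_i."""
    def score(y):
        log_prior, log_p = model[y]
        return log_prior + sum(c * lp for c, lp in zip(x, log_p))
    return max(model, key=score)

# Hypothetical toy data: histograms over three events, two classes.
histograms = [[5, 1, 0], [4, 2, 0], [0, 1, 5], [1, 0, 6]]
labels = ["A", "A", "B", "B"]
model = train_multinomial_nb(histograms, labels)

print(predict(model, [3, 1, 0]))
print(predict(model, [0, 0, 4]))
```

Working in log space avoids numerical underflow when the histograms contain large counts, and the smoothing keeps an event that was never seen in a class from forcing that class's probability to zero.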

Swastik Roy

Take a look at this blog post, I think it covers an interesting perspective on the question asked. - https://medium.com/datadriveninvestor/can-all-classification-problems-be-solved-by-unsupervised-clustering-3a9f3e1f72c0

Let me know if you are in agreement with the author.

Erik Cuevas

Given the cluster prototypes, a new data point can be classified according to the minimal distance, i.e., assigned to the class of the nearest prototype.
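The nearest-prototype rule can be sketched in a few lines, assuming Euclidean distance and one prototype (for example, a cluster centroid) per class. The prototype values and the function name are hypothetical.

```python
import math

def nearest_prototype(sample, prototypes):
    """Return the label whose prototype is closest to `sample`."""
    return min(prototypes, key=lambda label: math.dist(sample, prototypes[label]))

# Hypothetical prototypes, e.g. centroids produced by a clustering step.
prototypes = {"A": (0.0, 1.0), "B": (10.0, 1.0)}

print(nearest_prototype((2.0, 3.0), prototypes))
print(nearest_prototype((8.0, 0.0), prototypes))
```

Another distance measure can be swapped in for `math.dist` if Euclidean distance does not suit the features.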