I have a dataset that consists of one text column (this is an NLP problem), four categorical feature columns, and one target column with almost 90,000 categories. The dataset also has millions of rows, and I have to use PySpark. I have researched multi-label classification and become familiar with approaches such as Binary Relevance, Classifier Chains, and Label Powerset. My company wants me to do multi-label classification in order to predict the 90,000 target categories.
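
For concreteness, here is a minimal sketch of the feature side in PySpark (all column names are placeholders for my actual schema, and it assumes Spark 3.x ML):

```python
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import (Tokenizer, HashingTF, IDF,
                                StringIndexer, OneHotEncoder, VectorAssembler)

spark = SparkSession.builder.appName("multilabel-features").getOrCreate()

# Placeholder column names; the real dataset has one text column
# and four categorical feature columns.
text_col = "text"
cat_cols = ["cat1", "cat2", "cat3", "cat4"]

# Text column -> TF-IDF vector.
tokenizer = Tokenizer(inputCol=text_col, outputCol="tokens")
tf = HashingTF(inputCol="tokens", outputCol="tf", numFeatures=1 << 18)
idf = IDF(inputCol="tf", outputCol="text_features")

# Categorical columns -> index -> one-hot vectors.
indexers = [StringIndexer(inputCol=c, outputCol=c + "_idx", handleInvalid="keep")
            for c in cat_cols]
encoder = OneHotEncoder(inputCols=[c + "_idx" for c in cat_cols],
                        outputCols=[c + "_ohe" for c in cat_cols])

# Combine everything into a single feature vector.
assembler = VectorAssembler(inputCols=["text_features"] + [c + "_ohe" for c in cat_cols],
                            outputCol="features")

pipeline = Pipeline(stages=[tokenizer, tf, idf] + indexers + [encoder, assembler])
# model = pipeline.fit(df)          # df: DataFrame with the text and categorical columns
# features = model.transform(df)    # adds the assembled "features" column
```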

However, the multi-label classification examples I have seen only have four or five target categories. To me, predicting 90,000 target categories with methods like Binary Relevance, Classifier Chains, or Label Powerset seems impractical, if not impossible, once you consider the maintenance of 90,000 ML models, or of one model that tries to predict thousands of target categories at once.
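
To illustrate the scaling problem: Binary Relevance trains one binary classifier per label, so in PySpark it comes down to a loop like the sketch below, which with ~90,000 labels would produce ~90,000 fitted models. (The label names are hypothetical, and I assume a `features` DataFrame like the one from the pipeline above, with one 0/1 indicator column per label.)

```python
from pyspark.ml.classification import LogisticRegression
from pyspark.sql import functions as F

# Binary Relevance: one independent binary classifier per label.
# Shown for three placeholder labels; the real problem has ~90,000.
labels = ["label_a", "label_b", "label_c"]

models = {}
for label in labels:
    # Binary target: 1.0 if the row carries this label, else 0.0.
    train = features.withColumn("y", F.col(label).cast("double"))
    lr = LogisticRegression(featuresCol="features", labelCol="y")
    models[label] = lr.fit(train)  # one fitted model per label to store and maintain
```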

What is a suitable number of categories for multi-label classification?

How should I approach this problem?
