How to handle test set labels which are not in training set in Multi Class Text Classification?

Daqing Chen

Would suggest:

1. Have a look at the number of samples in each class;

2. Identify outlier classes, i.e., any class that has only a few samples, for example fewer than 1% of the total;

3. Depending on the context of the data, remove the outlier classes or merge them into a single class, say, class "other";

4. Randomly select samples from each class, so that both the training and test data sets contain samples from the same classes;

5. Deal with any remaining class imbalance.
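Steps 1 to 3 can be sketched roughly as follows. This is only a minimal illustration using the Python standard library, not code from the original poster: the helper names `build_label_map` and `apply_label_map`, the 1% threshold, and the toy labels are all assumptions made for the example. The mapping is built from the training labels only, and any test label that was never seen in training falls back to "other" as well.

```python
from collections import Counter

def build_label_map(train_labels, min_fraction=0.01, other="other"):
    """Map each training label to itself, or to `other` if the class is rare."""
    counts = Counter(train_labels)          # step 1: samples per class
    total = len(train_labels)
    return {
        # steps 2-3: classes below the threshold are merged into `other`
        label: (label if count / total >= min_fraction else other)
        for label, count in counts.items()
    }

def apply_label_map(labels, label_map, other="other"):
    """Relabel a list; labels never seen in training also become `other`."""
    return [label_map.get(label, other) for label in labels]

# Toy data (hypothetical): "weather" is rare, "finance" appears only at test time.
train_labels = ["sports"] * 120 + ["politics"] * 79 + ["weather"] * 1
test_labels = ["sports", "weather", "finance"]

label_map = build_label_map(train_labels)
train_merged = apply_label_map(train_labels, label_map)
test_merged = apply_label_map(test_labels, label_map)

print(Counter(train_merged))
print(test_merged)
```

Note that the classifier can only predict "other" if at least some training samples end up in that class; if none do, removing the affected test samples may be the more sensible option.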

In addition, you may not need one-hot encoding, depending on which model you use; decision trees, for example, can work without it.

You can also consider dimensionality reduction, i.e., doing the classification in a low-dimensional space.

Hope this makes sense and helps.

Niloy Sikder

A classifier cannot classify samples of a class if no samples of that class are present in the training set. It is like trying to recognize a certain species of bird without having seen birds of that species before and noted the distinctive features to look for when categorizing a new bird.

A good approach to your problem is to build a customized training set. Instead of randomly splitting the whole set of samples into train and test sets, randomly split the samples of each class separately (preferably 70%-30%), and then construct the train and test sets by joining the per-class subsets. This ensures that no class in the test set is unknown to the classifier. Hope this helps.
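The per-class split described above could look something like this. It is a standard-library sketch under stated assumptions: the function name `stratified_split`, the fixed seed, and the toy documents are hypothetical, and the rule of always keeping at least one sample of each class in the training set is an addition for very small classes, not part of the original suggestion.

```python
import random
from collections import defaultdict

def stratified_split(samples, labels, test_fraction=0.3, seed=0):
    """Split each class separately, then join the per-class subsets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    train, test = [], []
    for label, items in by_class.items():
        rng.shuffle(items)
        # Roughly 30% of the class goes to the test set, but at least
        # one sample of every class always stays in the training set.
        n_test = min(int(round(len(items) * test_fraction)), len(items) - 1)
        test += [(s, label) for s in items[:n_test]]
        train += [(s, label) for s in items[n_test:]]

    rng.shuffle(train)
    rng.shuffle(test)
    return train, test

# Toy data (hypothetical): classes "a" and "b" have 10 documents, "c" has 3.
samples = [f"doc{i}" for i in range(23)]
labels = ["a"] * 10 + ["b"] * 10 + ["c"] * 3

train, test = stratified_split(samples, labels)
train_classes = {label for _, label in train}
test_classes = {label for _, label in test}
print(len(train), len(test))
print(test_classes <= train_classes)  # every test class is known from training
```

If scikit-learn is available, its `train_test_split` function accepts a `stratify` argument that serves the same purpose.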

Azhar Imran

Hello Kalyan

1. Create a separate file (.txt) for the labels and mark them accordingly.

2. The code in the attached file may help you figure out your question.

Regards