Any benefit to balancing nearly-balanced classes?

Luigi Borzí

Dear Raz Malka,

my suggestion is to perform class balancing only when necessary.

Most classification algorithms are largely unaffected by a small class imbalance.

Moreover, to perform class balancing you should either:

- undersample the most represented class, thus losing information

- oversample the less represented class (either directly or using some algorithm), thus promoting overfitting
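To make the two options concrete, here is a minimal sketch of plain random undersampling and random oversampling. It assumes a 60:40 toy dataset; the function names and the data are hypothetical and only for illustration, not taken from any particular library.

```python
import random
from collections import Counter


def group_by_class(samples, labels):
    """Collect the samples of each class into a dict: label -> list of samples."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    return by_class


def random_undersample(samples, labels, seed=0):
    """Keep a random subset of each class, sized to the smallest class.

    Samples of the larger classes are discarded, so information is lost.
    """
    rng = random.Random(seed)
    by_class = group_by_class(samples, labels)
    target = min(len(xs) for xs in by_class.values())
    return [(x, y) for y, xs in by_class.items() for x in rng.sample(xs, target)]


def random_oversample(samples, labels, seed=0):
    """Duplicate random samples of each class until it matches the largest class.

    No new information is created: the extra rows are exact copies,
    which is why a model can overfit to them.
    """
    rng = random.Random(seed)
    by_class = group_by_class(samples, labels)
    target = max(len(xs) for xs in by_class.values())
    out = []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out.extend((x, y) for x in xs + extra)
    return out


# Hypothetical toy data: 60 samples of class 0 and 40 samples of class 1.
samples = list(range(100))
labels = [0] * 60 + [1] * 40

under = random_undersample(samples, labels)
over = random_oversample(samples, labels)

under_counts = Counter(y for _, y in under)  # 40 samples per class
over_counts = Counter(y for _, y in over)    # 60 samples per class
print(under_counts, over_counts)
```

In this sketch the undersampled set drops 20 majority samples, while the oversampled set contains 20 duplicated minority samples and no new distinct ones.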

Alternatively, you may slightly modify the cost function by assigning each error a weight that is inversely proportional to the size of its class.
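A rough sketch of the weighting idea, assuming binary labels and a log-loss cost function. The weight formula N / (K * n_c) is one common convention for "inversely proportional to class size" (N samples, K classes, n_c samples in class c); the function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by N / (K * n_c), so smaller classes get larger weights."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * n_c) for c, n_c in counts.items()}


def weighted_log_loss(labels, prob_class_1, weights):
    """Binary log-loss where each sample's error is scaled by its class weight."""
    total, weight_sum = 0.0, 0.0
    for y, p in zip(labels, prob_class_1):
        p_true = p if y == 1 else 1.0 - p  # probability given to the true class
        total += weights[y] * -math.log(p_true)
        weight_sum += weights[y]
    return total / weight_sum


# Hypothetical 60:40 dataset and a model that always predicts 0.5.
labels = [0] * 60 + [1] * 40
weights = inverse_frequency_weights(labels)
loss = weighted_log_loss(labels, [0.5] * len(labels), weights)
print(weights, loss)
```

For a 60:40 split the weights come out close to 1 (about 0.83 for the majority class and 1.25 for the minority class), which illustrates why the correction changes little when the classes are nearly balanced.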

Mohamed Elhadad

Hi Raz Malka,

I think this is not a necessary step; real-life data is unbalanced by nature. Work with the data as it is!

Qamar Ul Islam

Dear Raz Malka,

These articles might be useful; have a look:

1. https://towardsdatascience.com/balancing-is-unbalancing-5f517936f626

2. https://www.r-bloggers.com/2020/06/why-balancing-your-data-set-is-important/

Kind Regards

Sayan Surya Shaw

When the imbalance ratio is close to 1:1, e.g., a 55:45 or 60:40 ratio of majority to minority class, you may not need to balance the dataset with either oversampling or undersampling. This degree of imbalance is trivial in real-life datasets.

However, if the minority-class examples are very important for correct predictions (e.g., disease datasets) and you don't want minority-class cases to be missed because of the imbalance, since that may have severe consequences, you can perform undersampling or oversampling to obtain more accurate results.

Regards