Why is scaling important for the linear SVM classification?

07 July 2013

When performing linear SVM classification, it is often helpful to normalize the training data, for example by subtracting the mean and dividing by the standard deviation, and then to scale the test data with the mean and standard deviation of the training data. Why does this process change the classification performance so dramatically?
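For concreteness, the procedure described in the question could be sketched as follows. This is only a minimal illustration in plain Python; the helper names `fit_scaler` and `transform` and the toy data are assumptions made up for the example, not part of any particular SVM library.

```python
from statistics import mean, pstdev

def fit_scaler(rows):
    """Return per-feature means and standard deviations of the given rows."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Constant features have zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds

def transform(rows, means, stds):
    """Subtract the given means and divide by the given standard deviations."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

# Hypothetical toy data: the second feature has a much larger numeric range.
X_train = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 5000.0]]
X_test = [[2.0, 2000.0], [4.0, 7000.0]]

# Statistics are estimated from the training data only ...
means, stds = fit_scaler(X_train)
X_train_scaled = transform(X_train, means, stds)
# ... and the same statistics are reused for the test data.
X_test_scaled = transform(X_test, means, stds)
```

After this step both features of the training data have zero mean and unit spread, and the test data lives in the same coordinate system as the data the classifier was fitted on.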

Qinghua Xu

As I understand it, the main advantage of scaling is to prevent attributes with larger numeric ranges from dominating those with smaller numeric ranges. Another advantage is that it avoids numerical difficulties during the kernel calculation. However, I am not quite clear why the test set needs to be scaled with the mean and standard deviation of the training set instead of its own. In some cases, the latter seems to perform equally well or even better when the two classes of samples are well balanced in the test set.
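One possible way to see the difference between the two scaling choices is the small sketch below. It uses a single made-up feature and two made-up test batches (all values are assumptions for illustration) and checks where the same raw sample ends up under each choice.

```python
from statistics import mean, pstdev

def standardize(values, mu, sigma):
    """Scale values with an explicitly supplied mean and standard deviation."""
    return [(v - mu) / sigma for v in values]

# Hypothetical single-feature training data.
train = [10.0, 12.0, 14.0, 16.0]
mu_train, sigma_train = mean(train), pstdev(train)

sample = 13.0
batch_a = [sample, 9.0, 11.0]   # test batch that happens to contain low values
batch_b = [sample, 15.0, 17.0]  # test batch that happens to contain high values

# Training statistics: the sample maps to the same value whatever batch it is in.
with_train = standardize([sample], mu_train, sigma_train)[0]

# Each batch's own statistics: the same raw sample maps to different values.
own_a = standardize(batch_a, mean(batch_a), pstdev(batch_a))[0]
own_b = standardize(batch_b, mean(batch_b), pstdev(batch_b))[0]

print(with_train, own_a, own_b)
```

In this toy case the sample sits at 0 with the training statistics, but lands on opposite sides of 0 depending on which batch it arrives in, so a decision boundary learned in the training coordinates could label the same sample differently. When the test set has roughly the same distribution as the training set, its own mean and standard deviation are close to the training ones, which would be consistent with the two approaches performing similarly in that situation.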
