I know that in data mining, how data is preprocessed before it is fed into a classifier is very important. I wonder whether the choice of normalization also depends on the type of kernel chosen. Thanks.
Which kernels go well with which normalization depends, as usual, on the data at hand. However, if you use a kernel such as the Gaussian (RBF) kernel, normalization can be very important indeed. Since the RBF kernel is defined as k(x,y) = exp(-gamma * ||x-y||^2), or similar variants thereof, each dimension of the feature vector is treated equally, since every dimension contributes in the same way to the squared Euclidean distance ||x-y||^2 = sum_i (x_i - y_i)^2.
Thus, the degree of variation of a feature (in terms of the typical absolute difference |x_i - y_i|) can have a huge impact on the result. Just imagine the first feature being uninformative for the given problem, but highly variable. An informative feature with only a small amount of variation might easily be suppressed by the first.
In such cases, methods such as dimension-wise re-weighting or principal component projections might tremendously improve the classification result; the sketch below illustrates the effect with a simple standardization step.
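A minimal sketch of this failure mode, assuming scikit-learn and a made-up toy dataset: one feature carries the class signal at a small scale, another is pure noise at a huge scale. Under the RBF kernel, the noisy feature dominates the distance until each dimension is standardized.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, n)
informative = y + 0.3 * rng.standard_normal(n)  # class signal, small scale
noise = 1000.0 * rng.standard_normal(n)         # uninformative, huge scale
X = np.column_stack([noise, informative])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# RBF-SVM on raw features: the distance is dominated by the noise dimension
acc_raw = SVC(kernel="rbf").fit(X_tr, y_tr).score(X_te, y_te)

# Same model after per-dimension centering and scaling
scaler = StandardScaler().fit(X_tr)
acc_std = SVC(kernel="rbf").fit(scaler.transform(X_tr), y_tr).score(
    scaler.transform(X_te), y_te)

print(acc_raw, acc_std)  # expect roughly chance level vs. close to 1.0
```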
Also note that such dimension-wise normalization is not that important for the linear kernel, as the learning algorithm implicitly scales the dimensions on its own through the learned weights.
A good practice, especially before applying kernels such as the RBF kernel, is data standardization, which includes data centering and scaling. This family of kernels assumes that the data is centered around zero. In general, you can achieve this by subtracting the mean of each feature. In addition, the features can be scaled to have variance 1.
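For illustration, here is a short sketch of the standardization step itself in plain NumPy (the array X is a made-up example); in practice the mean and standard deviation should be estimated on the training set only and then applied to the test set.

```python
import numpy as np

X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

mu = X.mean(axis=0)       # per-feature mean, used for centering
sigma = X.std(axis=0)     # per-feature standard deviation, used for scaling
X_std = (X - mu) / sigma  # each feature now has mean 0 and variance 1

print(X_std.mean(axis=0))  # ~[0. 0.]
print(X_std.std(axis=0))   # [1. 1.]
```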
I am probably missing something, but I'd rather say that the RBF kernel does not benefit from the mean-centering step, since the mean is a constant offset to each feature. This offset is effectively cancelled out when computing differences, i.e. ||(x+m)-(y+m)||^2 = ||x-y||^2. Hence, no benefit, but no harm done either.
As pointed out before, however, scaling might be a sensible thing to do.
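A quick numeric check of the translation-invariance claim, using scikit-learn's rbf_kernel on made-up data: adding the same constant offset m to all points leaves the kernel matrix unchanged.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 3))
Y = rng.standard_normal((4, 3))
m = np.array([10.0, -3.0, 100.0])  # arbitrary constant offset per feature

K = rbf_kernel(X, Y, gamma=0.5)
K_shifted = rbf_kernel(X + m, Y + m, gamma=0.5)

print(np.allclose(K, K_shifted))  # True: ||(x+m)-(y+m)||^2 = ||x-y||^2
```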
Indeed, centering is not needed for distance-based kernels, which are translation invariant. Thanks to Michael Kemmler for the correction. I find centering meaningful for higher-degree polynomial kernels. However, I am not sure that centering around zero is always the best choice; I think it depends on the data distribution, but the mean should probably still be close to zero.
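By contrast, the polynomial kernel k(x,y) = (gamma * x.y + c)^d depends on inner products rather than differences, so it is not translation invariant and centering does change it. A small check with scikit-learn on made-up data:

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

rng = np.random.default_rng(2)
X = rng.standard_normal((5, 3)) + 50.0  # data far from the origin
X_centered = X - X.mean(axis=0)         # the same data centered around zero

K = polynomial_kernel(X, degree=3, coef0=1.0)
K_centered = polynomial_kernel(X_centered, degree=3, coef0=1.0)

print(np.allclose(K, K_centered))  # False: the offset changes the kernel values
```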