I am working on k-NN and SVM to classify data into four categories. k-NN (with k = 1) outperforms SVM (RBF kernel). Please guide me: is it fine to get such results? 10-fold cross-validation is being used.
There is no specific reason as such. The performance of both classifiers depends first on the dataset chosen: datasets of different dimensionality will yield different results for each. For example, in many cases kNN performs better on low-dimensional data, while SVM does better in high-dimensional spaces. Secondly, performance also depends on the choice of k in kNN and on the choice of hyperplane and kernel function in SVM. Even when kNN gives better accuracy, SVM is often more trusted and is considered a better real-time classifier. For example, if we have a fixed dataset divided into training and testing, kNN may perform better; but if we have to build a prediction model and test it on new samples that were not part of the dataset we trained on, SVM will often perform better, because it learns a generalising decision function rather than memorising the training data. On mostly static data, kNN tends to perform well.
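For concreteness, here is a minimal sketch (not from any of the answers) of the setup described in the question, assuming scikit-learn; the synthetic four-class dataset and the SVM hyperparameters are placeholders for the real data and tuned values:

```python
# A minimal sketch of the setup in the question: 1-NN vs. an RBF-kernel SVM
# under the same 10-fold cross-validation. The synthetic four-class dataset
# and the SVM hyperparameters are placeholders for real data / tuned values.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=10, n_informative=6,
                           n_classes=4, random_state=0)

models = {
    "1-NN": KNeighborsClassifier(n_neighbors=1),
    # Feature scaling matters for the RBF kernel; C and gamma should be tuned.
    "SVM (RBF)": make_pipeline(StandardScaler(),
                               SVC(kernel="rbf", C=1.0, gamma="scale")),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10)
    print(f"{name}: mean accuracy {scores.mean():.3f} (+/- {scores.std():.3f})")
```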
The comparison of the different classifiers actually provides you with some information about the topology of the distribution of your class data.
The fact that KNN works better than SVM indicates that your data set is not easily separable using the decision surfaces that you have let SVM use; i.e. the basic SVM uses linear hyperplanes to separate classes, and if you provide a different kernel, that will change the shape of the decision manifold that can be used. KNN can generate a highly convoluted decision boundary, as it is driven by the raw training data itself. For example, think of how a Voronoi diagram can separate multiple regions with a non-convex boundary made up of piecewise-linear hyperplanes; KNN ultimately behaves in a similar way. SVM uses a highly restricted parametric approximation of the decision boundary, which is an excellent trade-off of classification performance against data storage space/processing speed.
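To make this concrete, here is a small sketch (assuming scikit-learn and the synthetic make_moons data as a stand-in for your own): on classes that are not linearly separable, 1-NN and an RBF-kernel SVM can follow the curved boundary while a linear SVM cannot.

```python
# Sketch: classes that are not linearly separable (two interleaved half-moons).
# 1-NN and an RBF-kernel SVM can track the curved boundary; a linear SVM cannot.
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)

for name, clf in [("1-NN", KNeighborsClassifier(n_neighbors=1)),
                  ("linear SVM", SVC(kernel="linear")),
                  ("RBF SVM", SVC(kernel="rbf", gamma="scale"))]:
    score = cross_val_score(clf, X, y, cv=10).mean()
    print(f"{name}: {score:.3f}")
```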
If KNN can provide good results, then it suggests that your classes are quite separable; if KNN failed, then it would indicate that the metric vector you have chosen does not produce separable classes.
Have you tried looking at scatter plots of your data? Using scatter plots you can try different transforms of your input data and see by eye if it seems to make the classes more linearly separable. Fundamentally, if you can find a set of transforms of the input data (e.g. try raising each metric element to a power, taking logs etc.) that makes the classes more linearly separable, then many more linear (or close to linear) classification techniques will work well and you can trade classification performance against processing speed far more easily.
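As one illustration of this (a sketch with synthetic data, not the poster's method): if the true boundary is multiplicative in the raw features, a log transform makes it linear, and a linear classifier's cross-validated score improves accordingly.

```python
# Sketch: the true boundary x1 * x2 = 10 is curved in the raw feature space,
# but becomes the straight line log(x1) + log(x2) = log(10) after a log
# transform, so a linear SVM improves markedly on the transformed data.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.uniform(0.1, 10.0, size=(500, 2))   # strictly positive features
y = (X[:, 0] * X[:, 1] > 10).astype(int)

raw = cross_val_score(LinearSVC(max_iter=10_000), X, y, cv=10).mean()
logged = cross_val_score(LinearSVC(max_iter=10_000), np.log(X), y, cv=10).mean()
print(f"raw features: {raw:.3f}, log-transformed: {logged:.3f}")
```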
The hard work with any classification problem is choosing your metric vector components and how they are pre-transformed, rather than choosing the specific classification algorithm you are going to apply. Ultimately for every classification task, there exists a transform which will project your input data down into one dimension and simple thresholds can be used to split the classes; the hard problem is finding that projection equation! Many people forget that methods like KNN and SVM are simply tools we use to 'cheat' by approximating the ideal projection equation, and SVM/KNN etc. are not the real solutions to a classification problem.
A simple reason could be that the discriminating functions/kernels may miss some points, or may assign some regions of the space to an inappropriate class. Since kNN is based directly on vector distances, such errors tend to be fewer.
Usually kNN with k = 1 gives better results. The error on the training data will always be 0 for k = 1, but using k = 1 results in a large variance and possibly in large errors on test data. It may also be that your data cannot be separated by decision hyperplanes.
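A quick sketch of this claim, assuming scikit-learn and synthetic data: with k = 1 every training point is its own nearest neighbour, so training accuracy is 1.0, while cross-validated accuracy can peak at a larger k once the labels are noisy.

```python
# Sketch: k = 1 gives perfect training accuracy (each point is its own
# nearest neighbour), but cross-validated accuracy can peak at larger k
# when labels are noisy (flip_y adds label noise here).
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=10, n_informative=4,
                           flip_y=0.1, random_state=0)

for k in (1, 5, 15):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    train_acc = clf.score(X, y)
    cv_acc = cross_val_score(clf, X, y, cv=10).mean()
    print(f"k={k:2d}  train accuracy {train_acc:.3f}  10-fold CV {cv_acc:.3f}")
```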
They cannot be compared directly; it always depends on the distribution of the dataset.
If we look at the literature, different results have been obtained. Finally, it can be concluded that the right choice of classifier, correctly tuned parameters, and appropriately selected features of a dataset are the criteria for achieving a good classification result (see the tuning sketch after the reference below).
kNN and SVM Classification for EEG: A Review, https://link.springer.com/chapter/10.1007%2F978-981-15-2317-5_47
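On the tuning point above, a minimal sketch assuming scikit-learn (the grids and the synthetic data are placeholders): tune k for kNN and (C, gamma) for the RBF SVM under the same 10-fold CV before comparing the two.

```python
# Sketch: tune k for kNN and (C, gamma) for the RBF SVM under the same
# 10-fold CV before comparing them; grids and data here are placeholders.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=10, n_informative=6,
                           n_classes=4, random_state=0)

knn_search = GridSearchCV(KNeighborsClassifier(),
                          {"n_neighbors": [1, 3, 5, 9, 15]}, cv=10).fit(X, y)
svm_search = GridSearchCV(SVC(kernel="rbf"),
                          {"C": [0.1, 1, 10, 100],
                           "gamma": [0.01, 0.1, 1]}, cv=10).fit(X, y)

print("kNN:", knn_search.best_params_, f"{knn_search.best_score_:.3f}")
print("SVM:", svm_search.best_params_, f"{svm_search.best_score_:.3f}")
```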