Koczkodaij and I are answering for different types of data and problems, because Anumol's question is not clear.
If his question concerns discrimination on a continuous and/or binomial scale, my new theory of discriminant analysis is the best among SVM, MP-based LDFs, logistic regression, and Fisher's LDF.
His answer "Without any doubt, the best method is AUC of ROC. I used it in my recent papers " is true. I search the minimum error rate of logistic regression on ROC because theoretical discriminant hyperplane does not suggest the minimum error rate. However, my Revised IP-OLDF looks for the minimum Number of Misclassification (MNM) directly.
I have discriminated many datasets with eight LDFs, including SVM, and compared them by the mean error rate in the validation samples.
In many cases, Revised IP-OLDF is the best and logistic regression is the second best.
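For readers who want to reproduce this kind of comparison, here is a minimal sketch using scikit-learn. Revised IP-OLDF is not available in public libraries, so only three of the eight LDFs have stand-ins here, and a built-in dataset replaces the original data.

```python
# Sketch: compare several linear classifiers by mean error rate over
# cross-validation folds, in the spirit of the comparison described above.
from sklearn.datasets import load_breast_cancer
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = load_breast_cancer(return_X_y=True)
models = {
    "logistic regression": make_pipeline(StandardScaler(), LogisticRegression()),
    "Fisher's LDF": LinearDiscriminantAnalysis(),
    "linear SVM": make_pipeline(StandardScaler(), LinearSVC()),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=10)  # 10-fold validation accuracy
    print(f"{name}: mean error rate = {1 - acc.mean():.4f}")
```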
Anumol, you should explain your problem and describe your data in more detail.
The misclassification error refers to the number of individuals that we know belong to one category but that the method classifies into a different category. Normally we do not try to model misclassification; we try to minimize it. The best method depends on the problem and the data type you have, and it will be the one that allows you to minimize the misclassification error.
I developed Revised IP-OLDF based on the minimum number of misclassifications (MNM) using integer programming. I believe no LDF other than Revised IP-OLDF minimizes the error rate directly.
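As an illustration of the idea (not Shinmura's exact model), here is a generic big-M integer-programming formulation of the MNM criterion, sketched with the PuLP library on toy data; a binary variable e_i switches on exactly when case i is allowed to fall on the wrong side of the hyperplane.

```python
# Sketch: minimize the number of misclassifications (MNM) with a
# big-M mixed-integer program; a simplified stand-in for Revised IP-OLDF.
import pulp

X = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0], [4.0, 2.5]]  # toy feature matrix
y = [1, 1, -1, -1]                                     # class labels +1 / -1
n, p = len(X), len(X[0])
M = 100.0                       # big-M constant (assumed large enough)

prob = pulp.LpProblem("MNM", pulp.LpMinimize)
w = [pulp.LpVariable(f"w{j}") for j in range(p)]       # hyperplane coefficients
b = pulp.LpVariable("b")                               # intercept
e = [pulp.LpVariable(f"e{i}", cat="Binary") for i in range(n)]

prob += pulp.lpSum(e)           # objective: the number of misclassified cases
for i in range(n):
    # case i lies on the correct side of the hyperplane unless e_i = 1
    prob += y[i] * (pulp.lpSum(w[j] * X[i][j] for j in range(p)) + b) >= 1 - M * e[i]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("MNM =", int(pulp.value(prob.objective)))
```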
Minimizing the misclassification error comes down to the objective function you use. As you know, there are different types of misclassification error, and they may or may not be symmetric; I mean that sometimes misclassification in one class can be more important than in the other.
Here are some objective functions that are frequently used in the literature; you can find more in several books.
The most used are lift and ROC curves, but in my experience we need to analyze each problem and decide which one is the most convenient.
Suppose you are using a classification method for a medical purpose and you are testing a new treatment: if it is not used when the person is sick there is no severe damage, but using it when the person is not sick can create undesirable secondary effects.
In this case we need to concentrate on minimizing the FPs, and the false positive rate (FPR) may be used as the objective function.
If the case is the contrary, the sensitivity (TPR) may be used instead. If both cases are symmetric, meaning they have the same cost, you can use, for example, the precision or the lift.
See the attached file for definitions of FP, TP, FPR, etc.
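For readers without the attachment, here is a minimal sketch of those standard definitions in Python (assuming 1 = positive, 0 = negative, and nonzero denominators on the toy data):

```python
# Sketch: standard confusion-matrix quantities mentioned above,
# computed from true and predicted labels (1 = positive, 0 = negative).
def confusion_metrics(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    prevalence = (tp + fn) / len(y_true)
    return {
        "FPR": fp / (fp + tn),                # false positive rate
        "sensitivity (TPR)": tp / (tp + fn),  # recall of the positive class
        "precision": tp / (tp + fp),
        "lift": (tp / (tp + fp)) / prevalence,
        "error rate": (fp + fn) / len(y_true),
    }

print(confusion_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0]))
```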
The choice of algorithm (IP-OLDF, decision trees, neural networks, etc.) is something that depends on the individuals, the variable types, the balance of the different classes present in your population, the sampling method if you are not using the whole population, etc., but with any algorithm you need to try to minimize an objective function.
In particular, the class-balance problem can significantly affect the results of any of the previous algorithms and give you a good misclassification error with a very bad model. You must take care of these aspects.
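A tiny illustration of that warning: with a 95/5 class imbalance, a trivial model that always predicts the majority class already reaches a 5% error rate while detecting nothing.

```python
# Sketch: under class imbalance, a low misclassification error can hide
# a useless model that never finds the minority class.
y_true = [0] * 95 + [1] * 5        # imbalanced labels: 95 negatives, 5 positives
y_pred = [0] * 100                 # trivial majority-class "model"

error_rate = sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)
sensitivity = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred)) / 5

print(f"error rate  = {error_rate:.2f}")   # 0.05, looks good
print(f"sensitivity = {sensitivity:.2f}")  # 0.00, the model detects nothing
```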
Misclassification may occur because a property (feature) that is not suitable for classification was selected. When all classes, groups, or categories of a variable have the same error rate or probability of being misclassified, the misclassification is said to be non-differential. The SVM algorithm can be used to analyze misclassification.
What I do in practice, instead of looking for the best method, which does not exist (I totally agree with Koczkodaij; such a universal method does not exist), is to use methods like bagging and random forests to avoid sampling problems, or stacking to improve the results of each particular method while minimizing the misclassification error.
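As a sketch of that workflow, here are bagging, a random forest, and stacking compared by cross-validated error rate in scikit-learn (a built-in dataset stands in for a real problem):

```python
# Sketch: ensemble methods mentioned above, compared by mean error rate.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import (BaggingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
ensembles = {
    "bagging": BaggingClassifier(n_estimators=50, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "stacking": StackingClassifier(
        estimators=[("svm", SVC(probability=True)),
                    ("lr", LogisticRegression(max_iter=5000))],
        final_estimator=LogisticRegression()),
}
for name, model in ensembles.items():
    acc = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean error rate = {1 - acc.mean():.4f}")
```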
Classification error means that your classifier is not able to identify the correct class of your test tuple. These errors are normally called FPs and FNs: an FP means a negative result declared as positive, and an FN the reverse. To improve a parametric classifier, we can use a genetic algorithm (GA) to tune its parameters. We can also search for the right classifier (model selection) using the WEKA tool.
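As one possible illustration of GA-based tuning (a deliberately simplified evolutionary search: mutation only, no crossover), here is a sketch that tunes the C parameter of an SVM by cross-validated accuracy using scikit-learn:

```python
# Sketch: a tiny evolutionary search that tunes an SVM's C parameter,
# a simplified stand-in for the GA-based tuning mentioned above.
import random
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
random.seed(0)

def fitness(log_c):
    # fitness = 3-fold cross-validated accuracy for C = 10 ** log_c
    return cross_val_score(SVC(C=10.0 ** log_c), X, y, cv=3).mean()

# individuals are log10(C) values drawn from [-3, 3]
population = [random.uniform(-3, 3) for _ in range(8)]
for generation in range(5):
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[:4]                                     # keep the best half
    children = [p + random.gauss(0, 0.3) for p in parents]   # Gaussian mutation
    population = parents + children

best = max(population, key=fitness)
print(f"best C = {10.0 ** best:.4g}, CV accuracy = {fitness(best):.4f}")
```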
I developed the optimal discriminant function based on the minimum number of misclassifications (MNM). If you read my papers, you will find a new view of discrimination and/or classification. You can download my papers from RG.