We used the SVM_light package for binary classification, and we are now interested in applying the best available software for classifying our data. Your help will be highly appreciated.
It depends on your application. There are lots of classification algorithms implemented in WEKA. You can use WEKA to test different algorithms on your data first; then you can either look for a more efficient standalone implementation of the best-performing algorithm or simply keep using WEKA.
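The same quick-comparison workflow is easy to script outside WEKA as well. Here is a minimal Python/scikit-learn sketch of the idea (an illustration of my own, not part of the original answer): run several candidate classifiers through the same cross-validation and compare mean accuracy. The synthetic data is a placeholder for your own `X`, `y`.

```python
# Quick comparison of several classifiers via 5-fold cross-validation.
# Synthetic data is used here as a stand-in for your own dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

candidates = {
    "SVM (RBF)": SVC(kernel="rbf"),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes": GaussianNB(),
}
for name, clf in candidates.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```

Whichever algorithm wins this screening is the one worth reimplementing or tuning in a faster standalone package.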
We frequently use WEKA in our group. Our experience is that it implements many algorithms, but its performance is not comparable to standalone packages. For example, the SVM implemented in WEKA cannot handle large datasets, whereas SVM_light is fast and scales well. Similarly, the ANN implementation in SNNS is much better than the one in WEKA. We are looking for a recent technique that improves on the established ones; SVM, ANN, HMM, and KNN were all developed a long time ago.
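On the scalability point: for very large datasets, linear classifiers trained by stochastic gradient descent are a common workaround, since they stream the data in mini-batches instead of loading it all at once. A minimal sketch with scikit-learn's `SGDClassifier` (my own illustration, assuming synthetic streamed batches; `loss="hinge"` makes it an approximate linear SVM):

```python
# Training a linear SVM on streamed data with stochastic gradient descent.
# partial_fit consumes one mini-batch at a time, so memory stays bounded.
import numpy as np
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier(loss="hinge", alpha=1e-4, random_state=0)
classes = np.array([0, 1])  # must be declared up front for partial_fit

rng = np.random.default_rng(0)
for _ in range(100):  # 100 synthetic mini-batches of 1000 samples each
    X_batch = rng.normal(size=(1000, 50))
    y_batch = (X_batch[:, 0] + X_batch[:, 1] > 0).astype(int)
    clf.partial_fit(X_batch, y_batch, classes=classes)

print("accuracy on last batch:", clf.score(X_batch, y_batch))
```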
SVM tools are wonderful for binary classification. If you want multi-class classification, I can suggest some of the latest algorithms, but for the binary case SVM is best.
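For completeness: a binary SVM can still be used for multi-class problems through the standard one-vs-rest (or one-vs-one) decomposition, which trains one binary SVM per class. A short sketch of the one-vs-rest route (my own illustration on the classic Iris data, not from the original answer):

```python
# Multi-class classification with binary SVMs via one-vs-rest:
# one binary SVM is trained per class, and the most confident one wins.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # three classes
ovr = OneVsRestClassifier(SVC(kernel="rbf"))
print("mean CV accuracy:", cross_val_score(ovr, X, y, cv=5).mean())
```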
In order to adequately respond to your question, I would like to make a distinction between "software" and "algorithms". If your request is aimed at the former, then the answer is mainly shaped by practical considerations, such as who will eventually be using the software (programmers vs. domain experts vs. lay users). Packages such as R and WEKA offer an excellent solution; yet more efficiency can be gained by moving towards compiled software (based on languages such as C++ or Fortran), or software developed by leading experts in the field (e.g., large companies that develop classification software for optimizing processes in finance or medical decision making).
However, if your request is about choosing among the cacophony of available algorithms, I can say the following. In terms of classification, support vector machines (SVMs) have gained a lot of popularity over the past decades. Unfortunately, SVMs and many other machine learning approaches are not very transparent (a good example is a neural network, where it is not clear how an individual prediction is reached, since it is based on highly non-linear combinations of predictors), so that good aggregate performance cannot always be explained at the level of an individual prediction. In particular, if a question is posed to a black box and answered accurately most of the time, it becomes very useful to understand how those predictions were arrived at.

In addition, the gathering of knowledge often benefits from making uncertainty explicit. Bayesian statistics (which also underlie ensemble techniques such as model averaging) have therefore become very popular, although their implementation may be hampered by challenges such as computational cost. Although many algorithms are possible within a Bayesian framework, (logistic) regression remains very effective for classification.

Finally, I would like to add that addressing heterogeneity (i.e., differences between the underlying populations of the datasets used for model derivation and for model deployment) has received limited attention in machine learning, and is currently being integrated into the field of (medical) prediction research. This issue may not be directly related to your original question, but it is useful to keep in mind when performance is lower than expected.
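To make the transparency point concrete, here is a minimal logistic regression sketch (my own illustration on synthetic data): the fitted coefficients show how each predictor shifts the log-odds, and `predict_proba` makes the uncertainty of each individual prediction explicit, which is exactly what a black-box classifier hides.

```python
# Logistic regression as a transparent classifier: coefficients expose the
# per-predictor contribution, and predict_proba quantifies uncertainty.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
model = LogisticRegression().fit(X, y)

print("coefficients (log-odds per predictor):", model.coef_[0])
print("intercept:", model.intercept_[0])
print("P(class=1) for first sample:", model.predict_proba(X[:1])[0, 1])
```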
LibLinear (http://www.csie.ntu.edu.tw/~cjlin/liblinear/) may be efficient. It handles large datasets and can in some cases provide performance superior to kernel SVM packages.
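A convenient way to try it without leaving Python: scikit-learn's `LinearSVC` is backed by the LIBLINEAR library. A minimal sketch (synthetic data stands in for a real large dataset):

```python
# LinearSVC is scikit-learn's wrapper around the LIBLINEAR solver.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=50000, n_features=200, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# dual=False is the recommended setting when n_samples > n_features
clf = LinearSVC(C=1.0, dual=False).fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```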
To select the most appropriate classification method, you have to get a feel for the degree of non-linearity in your data and its underlying dimensionality. The answer to your question will depend a lot on the characteristics and size of your data, on the tendency of individual classification methods to get stuck in local minima, and on the heuristics and other techniques used to escape them. Even when two toolboxes use the same underlying algorithm, they may apply different criteria to detect and escape local minima. In my own work, I have used HMMs and an implementation of the EM algorithm to reduce the dimensionality of sets of binary data vectors.
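One standard way to do EM-based dimensionality reduction on binary vectors is a Bernoulli mixture model: each D-dimensional binary vector is summarized by its K-dimensional vector of cluster responsibilities. The sketch below is my own minimal NumPy implementation of that idea under my own assumptions, not the poster's actual code.

```python
# EM for a Bernoulli mixture model: binary vectors in {0,1}^D are compressed
# to K responsibility values (one per mixture component).
import numpy as np

def bernoulli_mixture_em(X, K, n_iter=50, seed=0, eps=1e-9):
    rng = np.random.default_rng(seed)
    N, D = X.shape
    pi = np.full(K, 1.0 / K)                   # mixing weights
    mu = rng.uniform(0.25, 0.75, size=(K, D))  # Bernoulli means per component
    for _ in range(n_iter):
        # E-step: log responsibilities, normalized stably in log space
        log_p = (X @ np.log(mu + eps).T
                 + (1 - X) @ np.log(1 - mu + eps).T
                 + np.log(pi + eps))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights and means from the responsibilities
        Nk = resp.sum(axis=0)
        pi = Nk / N
        mu = (resp.T @ X) / (Nk[:, None] + eps)
    return pi, mu, resp  # resp is the reduced K-dimensional representation

# Usage: 200 random 30-bit vectors compressed to K=3 responsibilities each.
rng = np.random.default_rng(1)
X = (rng.random((200, 30)) < 0.3).astype(float)
pi, mu, resp = bernoulli_mixture_em(X, K=3)
print(resp.shape)  # (200, 3); each row sums to 1
```

Because EM only finds a local optimum, in practice one would rerun this from several random seeds and keep the solution with the highest likelihood, which is exactly the local-minima issue raised above.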