I am working in the data mining area, with a focus on knowledge acquisition. Can we implement and try out our own algorithm in the WEKA tool? How feasible is this, and how well accepted is it?
Recently I have been working on several classification algorithms, and I strongly recommend the Java Machine Learning library, Java-ML (http://java-ml.sourceforge.net/). It is rich and already integrated with WEKA. Java-ML is well documented, so you can use it from your own source code. In my opinion WEKA is more tool oriented (you often use WEKA as an application to evaluate data mining problems).
As far as I know you can adapt your code to WEKA, but you can also find plenty of documentation and source code for almost every algorithm. You can use Java or MATLAB. MATLAB has a very rich collection of third-party source code, but you can trust it.
In 2011 my team and I conducted a survey of data miners (N=1,319). So, if you are looking for answers to which data mining algorithms are most frequently used by data miners, which software tools are used most frequently, or software satisfaction ratings, then I suggest looking at the study's highlights (http://rexeranalytics.com/Data-Miner-Survey-Results-2011.html). For more detail, just email me ([email protected]), and I will send you the full 37-page summary report. We make it freely available to everyone who requests it.
If you are looking for information about which algorithms people have been using to achieve the best predictive results, I have a different answer:
1) Ensemble models won the $1 Million Netflix Prize (and were involved in each of the top-placing entries). Anthony Goldbloom, CEO of Kaggle, presented at Predictive Analytics World in Boston in the fall of 2012 (www.predictiveanalyticsworld.com), and he highlighted that he has seen ensemble models help Kaggle competitors quickly achieve good results in the competitions. In my own experience we have also achieved good results with ensemble models.
2) Goldbloom also went on to discuss which algorithms the winning Kaggle teams have used, and he said that Random Forests, Gradient Boosting, and Linear / Logistic Regression are frequently used by the winning teams. In terms of software, R is one of the most frequently used tools among Kaggle competitors.
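To make the ensemble idea in 1) and 2) a bit more concrete, here is a minimal sketch in plain Java of the simplest kind of ensemble, a majority vote over several base classifiers. This is only an illustration and not what the Netflix Prize or Kaggle winners actually ran (those entries used more elaborate blending). The class name MajorityVoteEnsemble, the three threshold rules, and the "yes"/"no" labels are all made up for the example; in practice each base model would be a trained classifier, for instance one built with WEKA or Java-ML.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical example class: combines base classifiers by simple majority vote.
public class MajorityVoteEnsemble {

    // Each base model maps a feature vector to a class label.
    private final List<Function<double[], String>> models;

    public MajorityVoteEnsemble(List<Function<double[], String>> models) {
        if (models.isEmpty()) {
            throw new IllegalArgumentException("need at least one base model");
        }
        this.models = models;
    }

    // Returns the label predicted by the largest number of base models.
    // On a tie, the tied label that received its first vote earliest wins.
    public String predict(double[] features) {
        // LinkedHashMap keeps labels in the order they were first voted for.
        Map<String, Integer> votes = new LinkedHashMap<>();
        for (Function<double[], String> model : models) {
            votes.merge(model.apply(features), 1, Integer::sum);
        }
        String best = null;
        int bestCount = -1;
        for (Map.Entry<String, Integer> entry : votes.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Three toy rules standing in for trained classifiers (assumption).
        List<Function<double[], String>> models = List.of(
                x -> x[0] > 0.5 ? "yes" : "no",
                x -> x[1] > 0.5 ? "yes" : "no",
                x -> x[0] + x[1] > 1.2 ? "yes" : "no");

        MajorityVoteEnsemble ensemble = new MajorityVoteEnsemble(models);
        System.out.println(ensemble.predict(new double[] {0.9, 0.7}));
        System.out.println(ensemble.predict(new double[] {0.9, 0.2}));
    }
}
```

The point of the vote is that a single base model can be wrong on a given record without changing the final answer, as long as most of the others get it right. Random Forests apply the same principle to many decision trees.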
I hope this is useful information for you. Good luck!