Metaheuristic algorithms are usually used in the field of optimization. Could someone explain what role metaheuristic algorithms play in data classification and how they work there?
I am not an expert in data classification. Just to be sure, by data classification you mean: given a set of data, assign each element to a group. E.g., given a set of leukemia patients, classify them by degree of severity, say from 0 (no leukemia) to 10 (quite serious).
So if we are on the same page, here goes my answer, which is based more on my own academic background than on experience in field work. I have studied optimization and artificial neural networks, and I have read some works about data classification throughout my academic life.
To the best of my knowledge, every data classification algorithm is based on an optimization algorithm, directly or indirectly. For instance, suppose you apply a multilayer perceptron to classify a data set. Training this artificial neural network is an optimization problem: whether you use back-propagation or any variation of it, you must minimize an error function in the training step. A metaheuristic makes it possible to apply the same idea to generic cases, since it only needs to evaluate the error function and does not require its gradient.
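To make the point about minimizing an error function concrete, here is a minimal sketch in Python. It assumes a plain linear classifier instead of a multilayer perceptron, and it uses stochastic hill climbing as the simplest stand-in for a metaheuristic. The toy data, the function names, and the parameter values are assumptions made up for illustration. The cost is the misclassification rate, which has no useful gradient, so back-propagation could not minimize it directly.

```python
import random

def predict(weights, x):
    """Linear classifier: class 1 if w . x + b > 0, else class 0."""
    *w, b = weights
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def error(weights, data):
    """Cost function: fraction of misclassified training samples."""
    return sum(predict(weights, x) != y for x, y in data) / len(data)

def train_hill_climbing(data, n_features, iterations=2000, step=0.5, seed=0):
    """Stochastic hill climbing: perturb the weights at random and keep
    the candidate whenever it does not increase the training error."""
    rng = random.Random(seed)
    best = [rng.uniform(-1, 1) for _ in range(n_features + 1)]
    best_err = error(best, data)
    for _ in range(iterations):
        candidate = [w + rng.gauss(0, step) for w in best]
        cand_err = error(candidate, data)
        if cand_err <= best_err:  # accepting ties lets the search cross plateaus
            best, best_err = candidate, cand_err
    return best, best_err

# Hypothetical toy data: two Gaussian clusters in 2-D, one per class.
rng = random.Random(1)
data = [((rng.gauss(0, 0.5), rng.gauss(0, 0.5)), 0) for _ in range(30)]
data += [((rng.gauss(3, 0.5), rng.gauss(3, 0.5)), 1) for _ in range(30)]

weights, err = train_hill_climbing(data, n_features=2)
print(f"training error: {err:.2f}")
```

The same loop works for any model whose parameters fit in a list. Swapping hill climbing for simulated annealing or a genetic algorithm changes only how candidates are generated and accepted, not the role of the cost function.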
Other algorithms may use "indirect" optimization, as in unsupervised learning or even self-organization.
So, to the best of my knowledge, metaheuristic algorithms play a central role in data classification, mainly in "supervised" classification, which is based on prior training. This central role comes from the fact that most classification algorithms apply a "cost" function to measure the error, that is, a distance between the model's output and the data.
An objective function needs to be formulated in such a way that minimizing it classifies the variables on which the objective function depends. The following articles may be useful:
Metaheuristic algorithms have various applications in classification. Examples follow:
1) Feature set partitioning generalizes the task of feature selection by partitioning the feature set into subsets of features that are collectively useful, rather than by finding a single useful subset of features.
Example: Rokach, L. (2008). Genetic algorithm-based feature set partitioning for classification problems. Pattern Recognition, 41(5), 1676-1700.
2) Mining classification rules from large databases
Example: Dehuri, S., Patnaik, S., Ghosh, A., & Mall, R. (2008). Application of elitist multi-objective genetic algorithm for classification rule generation. Applied Soft Computing, 8(1), 477-487.
3) Metaheuristics such as GAs can be used to optimize classification models obtained with other data mining techniques (decision trees, ANNs, SVMs, ...).
Example: Nasiri, J. A., Naghibzadeh, M., Yazdi, H. S., & Naghibzadeh, B. (2009, November). ECG arrhythmia classification with support vector machines and genetic algorithm. In Computer Modeling and Simulation, 2009. EMS'09. Third UKSim European Symposium on (pp. 187-192). IEEE.
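As a rough illustration of how a GA can wrap around a classifier, here is a minimal Python sketch. It is not the method of any of the papers cited above. It assumes a k-nearest-neighbour classifier as the model, plain feature selection (the simpler special case that feature set partitioning generalizes) encoded as a bit mask, and k as the only tuned model parameter. The synthetic data, all function names, and all parameter values are assumptions chosen for illustration.

```python
import random

def make_data(n, rng):
    """Hypothetical data: 2 informative features followed by 4 noise features."""
    data = []
    for _ in range(n):
        y = rng.randrange(2)
        informative = [rng.gauss(2.0 * y, 1.0) for _ in range(2)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
        data.append((informative + noise, y))
    return data

def knn_predict(train, x, mask, k):
    """k-NN majority vote using only the features switched on in mask."""
    def dist(a):
        return sum((ai - xi) ** 2 for ai, xi, m in zip(a, x, mask) if m)
    nearest = sorted(train, key=lambda sample: dist(sample[0]))[:k]
    votes = [y for _, y in nearest]
    return max(sorted(set(votes)), key=votes.count)

def fitness(chrom, train, valid):
    """Validation accuracy of the k-NN model encoded by the chromosome."""
    mask, k = chrom
    if not any(mask):  # a model that uses no features is useless
        return 0.0
    hits = sum(knn_predict(train, x, mask, k) == y for x, y in valid)
    return hits / len(valid)

def crossover(a, b, rng):
    """Uniform crossover: each gene comes from one parent at random."""
    mask = tuple(rng.choice(pair) for pair in zip(a[0], b[0]))
    return mask, rng.choice((a[1], b[1]))

def mutate(chrom, rng, rate, k_choices):
    """Flip each mask bit with probability rate; occasionally resample k."""
    mask, k = chrom
    mask = tuple(1 - bit if rng.random() < rate else bit for bit in mask)
    if rng.random() < rate:
        k = rng.choice(k_choices)
    return mask, k

def genetic_search(train, valid, n_features, k_choices=(1, 3, 5, 7),
                   pop_size=12, generations=10, rate=0.15, seed=0):
    rng = random.Random(seed)
    cache = {}

    def score(chrom):  # memoize, since fitness evaluation is the costly part
        if chrom not in cache:
            cache[chrom] = fitness(chrom, train, valid)
        return cache[chrom]

    pop = [(tuple(rng.randrange(2) for _ in range(n_features)),
            rng.choice(k_choices)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = pop[:2]  # elitism: the two best survive unchanged
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=score)  # tournament selection
            b = max(rng.sample(pop, 3), key=score)
            nxt.append(mutate(crossover(a, b, rng), rng, rate, k_choices))
        pop = nxt
    best = max(pop, key=score)
    return best, score(best)

rng = random.Random(42)
train, valid = make_data(60, rng), make_data(30, rng)
(best_mask, best_k), accuracy = genetic_search(train, valid, n_features=6)
print("feature mask:", best_mask, "k:", best_k, "validation accuracy:", accuracy)
```

Because the fitness is measured on the validation set, the reported accuracy is optimistic; a separate test set would be needed for an honest estimate. Replacing k-NN with an SVM or a neural network changes only the `fitness` function, which is what makes the GA wrapper generic.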