Text classification (TC) is a popular data mining technique used to extract valuable information from large volumes of data. It is the process of assigning documents to a predefined set of classes according to criteria defined in advance. TC has been applied in areas such as document organization, automated document indexing, spam filtering, text filtering, and word sense disambiguation. Reducing the dimensionality of the data is needed to generate a low-dimensional representation from high-dimensional data accurately and efficiently. Dimension reduction is therefore an important step in data mining: it makes massively high-dimensional data tractable by eliminating unnecessary attributes. One dimension reduction technique is feature selection (FS). FS has two broad approaches: wrapper and filter. In the wrapper approach, a subset of features is selected according to the accuracy of a classifier, whereas in the filter approach, a subset of features is selected (filtered) using a feature scoring metric.
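To make the filter/wrapper distinction concrete, here is a minimal Python sketch on a tiny made-up corpus. Everything in it is an assumption for illustration: the toy documents, the choice of chi-square as the scoring metric for the filter, and the simple term-overlap voting classifier with greedy forward selection for the wrapper. It is not the method of any of the papers listed below.

```python
from collections import Counter

# Hypothetical toy corpus, for illustration only.
DOCS = [
    ("win cash prize now", "spam"),
    ("cheap prize win offer", "spam"),
    ("cash offer now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project meeting notes now", "ham"),
    ("agenda for project review", "ham"),
]

def tokenize(text):
    return set(text.split())

def vocabulary(docs):
    return sorted({w for text, _ in docs for w in tokenize(text)})

# Filter approach: score each term independently of any classifier.
def chi2_score(term, docs, positive="spam"):
    """Chi-square statistic of the 2x2 term-presence / class table."""
    n = len(docs)
    a = sum(1 for t, y in docs if term in tokenize(t) and y == positive)
    b = sum(1 for t, y in docs if term in tokenize(t) and y != positive)
    c = sum(1 for t, y in docs if term not in tokenize(t) and y == positive)
    d = n - a - b - c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def filter_select(docs, k):
    """Keep the k terms with the highest chi-square score."""
    return sorted(vocabulary(docs), key=lambda w: -chi2_score(w, docs))[:k]

# Wrapper approach: score a feature subset by classifier accuracy.
def loo_accuracy(features, docs):
    """Leave-one-out accuracy of a toy overlap-voting classifier."""
    feats = set(features)
    correct = 0
    for i, (text, label) in enumerate(docs):
        train = docs[:i] + docs[i + 1:]
        votes = Counter()
        for t, y in train:
            # Each training doc votes for its class once per shared selected term.
            votes[y] += len(tokenize(t) & tokenize(text) & feats)
        # Ties resolve to the alphabetically first class label.
        pred = max(sorted(votes), key=lambda y: votes[y])
        correct += pred == label
    return correct / len(docs)

def wrapper_select(docs, k):
    """Greedy forward selection: add the term that most improves accuracy."""
    vocab = vocabulary(docs)
    selected, best = [], loo_accuracy([], docs)
    while len(selected) < min(k, len(vocab)):
        cand = max((w for w in vocab if w not in selected),
                   key=lambda w: loo_accuracy(selected + [w], docs))
        score = loo_accuracy(selected + [cand], docs)
        if score <= best:  # stop when no candidate improves accuracy
            break
        selected.append(cand)
        best = score
    return selected, best

if __name__ == "__main__":
    print("filter :", filter_select(DOCS, 4))
    print("wrapper:", wrapper_select(DOCS, 4))
```

The filter scores each term once, so it is cheap but ignores how the terms interact. The wrapper re-evaluates the classifier for every candidate subset, which is far more expensive but tailors the subset to the classifier. That search cost is one reason wrapper methods are often paired with metaheuristic search.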

Papers:

M. A. Basir, Y. Yusof, and M. Saifullah, “Optimization of attribute selection model using bio-inspired algorithms,” J. ICT, vol. 18, no. 1, pp. 35–55, 2019.

H. Naji, W. Ashour, and M. Al Hanjouri, “Text Classification for Arabic Words Using BPSO/REP-Tree,” Int. J. Comput. Linguist. Res., vol. 9, no. 1, 2018.

Y. Wang, L. Feng, and J. Zhu, “Novel artificial bee colony based feature selection method for filtering redundant information,” Appl. Intell., vol. 48, no. 4, pp. 868–885, 2018.

P. Shunmugapriya and S. Kanmani, “A hybrid algorithm using ant and bee colony optimization for feature selection and classification (AC-ABC Hybrid),” Swarm Evol. Comput., vol. 36, pp. 27–36, 2017.
