Well, I guess if your data is images, you would rather do feature extraction first and then, if necessary (i.e., the vector of extracted features is still large and probably redundant), perform feature selection.
Some very popular features include, for example, SIFT or SURF, but there are many others. I think you have to make this choice based on which salient characteristics will allow you to differentiate the classes of textures you have in mind.
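For instance, extracting SIFT descriptors could look like the minimal sketch below, assuming OpenCV >= 4.4 (where SIFT is patent-free); the filename is a placeholder:

```python
# Minimal SIFT extraction sketch; "texture.png" is a placeholder path.
import cv2

img = cv2.imread("texture.png", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)
# Each descriptor is a 128-dimensional vector; this raw set is usually
# large and redundant, which is where feature selection comes in.
print(len(keypoints), descriptors.shape)  # e.g. (N, 128)
```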
I am sorry, but I am not sure I got it right. Please correct me if I am wrong.
Do you extract 5 features from every image and then classify the images based on these 5 features?
If so, your feature space already has low dimensionality, so I guess you can just train a classifier on these 5 features and see whether it works well.
Another thing you can do (given that you have only five features) is to exhaustively search all possible feature subsets, of which there are only 32 (2^5), so the search should not be too time consuming.
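As a minimal sketch of that exhaustive search (scikit-learn, k-NN, and 5-fold cross-validation are my illustrative choices; X and y stand in for your 5-feature data):

```python
# Exhaustively evaluate all 31 non-empty subsets of 5 features.
from itertools import combinations
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X = np.random.rand(100, 5)          # placeholder data: 100 samples, 5 features
y = np.random.randint(0, 2, 100)    # placeholder binary labels

best_score, best_subset = -np.inf, None
for k in range(1, 6):
    for subset in combinations(range(5), k):
        score = cross_val_score(KNeighborsClassifier(),
                                X[:, list(subset)], y, cv=5).mean()
        if score > best_score:
            best_score, best_subset = score, subset
print(best_subset, best_score)      # best-performing feature subset
```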
You can also compute co-occurrence distributions (e.g. GLCM) on the texture and derive more features from them, just to increase the dimensionality of your feature vectors. Then you can use the WEKA data mining tool to find the most important features. I do not know if it is applicable in your case, but generally, if you have several images representing the same object, you can apply PCA to these images and use the most relevant principal components as "selected features".
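A minimal sketch of the GLCM idea, assuming scikit-image (the distances, angles, and property names are illustrative choices):

```python
# Derive co-occurrence (GLCM) texture features with scikit-image.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

patch = np.random.randint(0, 256, (64, 64), dtype=np.uint8)  # placeholder texture
glcm = graycomatrix(patch, distances=[1, 2], angles=[0, np.pi / 2],
                    levels=256, symmetric=True, normed=True)
features = [graycoprops(glcm, prop).ravel()
            for prop in ("contrast", "homogeneity", "energy", "correlation")]
feature_vector = np.concatenate(features)  # append these to your existing features
print(feature_vector.shape)
```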
You could try PLOFS: it is a fast wrapper approach with a subset evaluation criterion that satisfies monotonicity.
Jiang Li, Jianhua Yao, Ronald M. Summers, Nicholas Petrick, Michael T. Manry, and Amy K. Hara, "An Efficient Feature Selection Algorithm for Computer-Aided Polyp Detection," special issue of the International Journal on Artificial Intelligence Tools (IJAIT), vol. 15, no. 6, December 2006, pp. 893-915.
Jiang Li, Michael T. Manry, Pramod Narasimha, and Changhua Yu, "Feature Selection Using a Piecewise Linear Network," IEEE Transactions on Neural Networks, vol. 17, no. 5, September 2006, pp. 1101-1115.
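As far as I know, PLOFS itself is not available in common libraries; as a rough stand-in that illustrates the general wrapper idea (not the PLOFS algorithm), here is a forward selection sketch with scikit-learn's SequentialFeatureSelector:

```python
# Generic wrapper-style forward feature selection (NOT PLOFS itself).
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)   # stand-in dataset
selector = SequentialFeatureSelector(LogisticRegression(max_iter=1000),
                                     n_features_to_select=2,
                                     direction="forward")
selector.fit(X, y)
print(selector.get_support())       # boolean mask of the selected features
```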
Feature extraction is highly subjective in nature: it all depends on what type of problem you are trying to handle. There is no generic feature extraction scheme that works in all cases.
If you are handling images, you extract appropriate features, and if the feature dimension is high, you can then try feature selection, or feature transformation using PCA, where you will get high-quality discriminant features.
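A minimal sketch of PCA-based reduction with scikit-learn (the 95% explained-variance threshold is an illustrative choice, not a recommendation from this thread):

```python
# Reduce a high-dimensional feature matrix with PCA.
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(200, 500)        # placeholder: 200 images, 500 features each
pca = PCA(n_components=0.95)        # keep components explaining 95% of variance
X_reduced = pca.fit_transform(X)
print(X.shape, "->", X_reduced.shape)
```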
It's difficult to say which feature extraction and selection method is good; it is highly subjective in nature and very specific to the application at hand. Coming back to your question regarding symbolic data handling: that may not always be true.
Before feature extraction or feature selection, feature definition is an important step, and it actually determines the core of the solution. If you are starting from the point after that step, i.e., the features are pre-fixed, Marco's comments above are a good starting point.
Feature extraction and feature selection are two techniques tied to hand-crafted features. From my experience, if you have a wide matrix (more features than data points), lasso/LARS might be a good choice. If you have a tall matrix (more data points than features), on the other hand, the PLOFS algorithm mentioned above might be used. However, if you have a large data set in that case, you might try deep learning on the raw data to automatically extract features and replace the feature engineering.
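For the wide-matrix case, a minimal lasso/LARS sketch with scikit-learn (the alpha value is an illustrative assumption you would normally tune by cross-validation):

```python
# Lasso/LARS selection for a wide matrix (more features than samples).
import numpy as np
from sklearn.linear_model import LassoLars

X = np.random.rand(50, 200)           # placeholder: 50 samples, 200 features
y = np.random.rand(50)
model = LassoLars(alpha=0.01).fit(X, y)
selected = np.flatnonzero(model.coef_)  # lasso drives most coefficients to exactly 0
print(len(selected), "features kept of", X.shape[1])
```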
With the increasing research on deep learning, I would suggest trying convolutional neural networks; then you don't need to worry about features or how to select them.
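A minimal sketch of that idea, assuming PyTorch; the layer sizes, input size, and number of texture classes are illustrative assumptions, not values from this thread:

```python
# A tiny CNN that learns its own features from raw texture patches,
# replacing the hand-crafted extraction/selection steps.
import torch
import torch.nn as nn

class TextureCNN(nn.Module):
    def __init__(self, num_classes=4):   # assumed number of texture classes
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)  # for 64x64 inputs

    def forward(self, x):
        x = self.features(x)             # learned feature maps replace SIFT/GLCM etc.
        return self.classifier(x.flatten(1))

model = TextureCNN()
dummy = torch.randn(8, 1, 64, 64)        # batch of 8 grayscale 64x64 patches
print(model(dummy).shape)                # -> torch.Size([8, 4])
```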
Thanks for your answer. I would like to create my own dataset from scratch. Do you have any idea how I can create these data in order to train my machine learning model?
You can use the SURF algorithm (Speeded Up Robust Features) for feature extraction and feature selection. The main advantages of this algorithm are that it is rotation invariant and faster to compute, because it deals with only 64-dimensional descriptor vectors.
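A minimal sketch; note that SURF is patented and lives in the non-free module of opencv-contrib-python, so this only runs if your OpenCV build enables it. The filename is a placeholder:

```python
# SURF keypoints and 64-dimensional descriptors (requires opencv-contrib
# built with the non-free modules enabled).
import cv2

img = cv2.imread("texture.png", cv2.IMREAD_GRAYSCALE)
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
surf.setExtended(False)               # keep the standard 64-dim descriptors
keypoints, descriptors = surf.detectAndCompute(img, None)
print(descriptors.shape)              # (N, 64)
```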
PCA is generally only good if the feature space is not terribly large, because its computational cost grows quickly with the number of features. It will work alongside other feature extraction methods to further refine the features that are best for your particular problem, though. However, if you want a faster way, you can use a classification method with an L1-norm penalty. This will naturally shrink some coefficients to 0, so the corresponding features effectively drop out. However, with a data set that has many parameters, this still might not reduce the parameters enough for your particular problem.
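A minimal sketch of L1-penalised classification as implicit feature selection, using scikit-learn's LogisticRegression (the solver and C value are illustrative assumptions):

```python
# L1-penalised classifier: the penalty zeroes out weak features.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.random.rand(300, 100)          # placeholder data
y = np.random.randint(0, 2, 300)
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
kept = np.flatnonzero(clf.coef_)      # indices of features with non-zero weight
print(len(kept), "features with non-zero weights")
```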
There are many feature selection and transformation techniques that are frequently used these days, such as Independent Component Analysis (ICA), Principal Component Analysis (PCA), and Linear Discriminant Analysis (LDA). You can perform them in MATLAB.
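The thread mentions MATLAB; as a Python equivalent (my assumption, not the poster's setup), here is a minimal sketch of ICA and LDA with scikit-learn. Note that LDA is supervised and uses the class labels, while ICA does not:

```python
# ICA (unsupervised) vs. LDA (supervised) feature transformation.
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = np.random.rand(150, 20)           # placeholder data
y = np.random.randint(0, 3, 150)      # placeholder labels, 3 classes

X_ica = FastICA(n_components=5).fit_transform(X)          # independent components
X_lda = LinearDiscriminantAnalysis().fit_transform(X, y)  # at most n_classes-1 dims
print(X_ica.shape, X_lda.shape)
```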