A simple solution is to use PCA or LDA if the problem is dimensionality reduction. If the problem is feature subset selection (FSS), i.e., selecting a subset of the original feature set, then
Principal Feature Analysis (unsupervised): venom.cs.utsa.edu/dmz/techrep/2007/CS-TR-2007-011.pdf
and SVM Recursive Feature Elimination, SVM-RFE (supervised): www.eecis.udel.edu/~yuy/report0531.pdf
are two good approaches. In particular, SVM-RFE has a sounder theoretical basis than decision-tree, genetic-algorithm, or heuristic-search-based FSS methods.
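For reference, a minimal SVM-RFE sketch with scikit-learn; the synthetic dataset and the number of features to keep are illustrative assumptions, not taken from the reports above:

```python
# SVM-RFE sketch: recursively drop the features whose linear-SVM
# weights have the smallest magnitude.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# Synthetic data; only 5 of the 50 features are informative (assumption).
X, y = make_classification(n_samples=200, n_features=50, n_informative=5,
                           random_state=0)

selector = RFE(SVC(kernel="linear"), n_features_to_select=5, step=1)
selector.fit(X, y)
print("Selected feature indices:", selector.get_support(indices=True))
```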
You can use PCA or LDA to compress the features. But after compression, you have to determine the appropriate number of principal components (PC scores) to retain.
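For example, one common heuristic is to retain enough components to explain a fixed fraction of the total variance. A minimal sketch with scikit-learn, where the digits dataset and the 95% threshold are illustrative assumptions:

```python
# Pick the number of PCs from the cumulative explained variance.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)
pca = PCA().fit(X)

cumvar = np.cumsum(pca.explained_variance_ratio_)
n_components = int(np.searchsorted(cumvar, 0.95) + 1)
print(f"{n_components} components explain >= 95% of the variance")

# Equivalently, PCA(n_components=0.95) selects this count automatically.
```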
Another way is to use a feature selection procedure. You may use a "filter" or a "wrapper" technique. My suggestion is to apply a wrapper technique like Genetic Algorithms (GA).
If your problem is feature subset selection and your data is not noisy at all, then a wrapper with a k-nearest-neighbour classifier (KNNC) might be a good choice. But wrappers are slow; inside a wrapper you can use any learning algorithm that takes less time to build and test a model. If there are no time constraints, you can try other combinations as well.
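As a minimal wrapper sketch, here is greedy forward selection with a KNN classifier scored by cross-validation (the wine dataset and the subset size of 5 are illustrative assumptions; a GA or PSO wrapper would replace the greedy search with a population-based one):

```python
# Wrapper feature selection: each candidate subset is evaluated by
# cross-validating the KNN classifier itself.
from sklearn.datasets import load_wine
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

X, y = load_wine(return_X_y=True)

knn = KNeighborsClassifier(n_neighbors=5)
sfs = SequentialFeatureSelector(knn, n_features_to_select=5, cv=5)
sfs.fit(X, y)
print("Selected feature indices:", sfs.get_support(indices=True))
```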
This is a good example of the curse of dimensionality. If your aim is to map the original data onto a new space where the features are uncorrelated, then PCA (or ICA, CCA, etc.) can help. We used PCA in [1] and [2]. By using PCA, more than 99% reduction in the dimension of the feature vectors was achieved in [1], and more than 98% in [2]. Try the Unscrambler software, a complete multivariate analysis and experimental design package equipped with powerful methods including PCA, MCR, PLS-R, 3-Way PLS Regression, K-Means Clustering, and SIMCA Classification. You can visualize the PCs easily.
[1] Omid, M., A. Mahmoudi and M. H. Omid (2010). Development of pistachio sorting system using PCA assisted artificial neural networks of impact acoustics. Expert Systems with Applications 37(10): 7205–7212.
[2] Omid, M., A. Mahmoudi and M. H. Omid (2009). An intelligent system for sorting pistachio nut varieties. Expert Systems with Applications 36(9): 11528–11535.
Although random forest is a good choice for projecting high-dimensional data onto a lower-dimensional subspace (since the projection is random, you get lower variance and better accuracy), it may backfire if your feature set is sparse, which is the case for most high-dimensional data: the random feature selection may pick useless or uninformative features and thereby degrade the classifier.
The best choice could be a weighted-feature method, where features that carry more information are assigned more weight; then, when selecting features, you work with a set of informative features rather than useless ones and apply random forest to it, which I guess should give you better results. Best of luck.
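One simple way to instantiate this idea is to rank the features by an information measure and keep only the top-ranked ones before training the forest. A sketch assuming mutual information as the weight and an arbitrary cutoff of k = 20:

```python
# Filter by mutual information first, then train a random forest on
# the reduced, more informative feature set.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score

# Synthetic sparse-style problem: 200 features, only 10 informative.
X, y = make_classification(n_samples=500, n_features=200, n_informative=10,
                           random_state=0)

X_reduced = SelectKBest(mutual_info_classif, k=20).fit_transform(X, y)
rf = RandomForestClassifier(n_estimators=200, random_state=0)
print("CV accuracy:", cross_val_score(rf, X_reduced, y, cv=5).mean())
```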
You may also use feature transformation and feature selection methods together as a hybrid solution. For example, you may first perform transformation (e.g., PCA, LDA, etc.) and then apply a feature selection method (either wrapper or filter) to further decrease the feature dimension.
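A minimal sketch of such a hybrid, assuming PCA followed by a univariate filter (the component counts are arbitrary):

```python
# Hybrid reduction: transform with PCA, then keep only the most
# class-discriminative components with a univariate filter.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)

hybrid = make_pipeline(PCA(n_components=30), SelectKBest(f_classif, k=10))
X_small = hybrid.fit_transform(X, y)
print("Final shape:", X_small.shape)  # (n_samples, 10)
```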
Dear Gunal, I think I had better first use feature selection and then perform a feature extraction method, because the data set is noisy and feature extraction methods like PCA use all of the features.
The data is very high dimensional and is stored as a sparse matrix.
Thanks for your reply. I don't know about SIFT, but I'm intrigued by it. I'd appreciate it if you could give me a good reference for the theoretical concepts of SIFT, along with your source code. I'll try it on my data set and let you know the results.
I think you should give us more information about your dataset, the goal of the classification, and the limitations of your project.
If you are working on images, you have many choices and methods to perform this task.
If you are working on numerical data and looking to extract or select features from such databases, you should apply data analysis methods.
My MS thesis relates to feature selection for improving the performance of machine learning algorithms like KNN. I suggest you try feature selection with evolutionary algorithms such as GA, ICA (imperialist competitive algorithm), and PSO, if you are interested in optimization algorithms; otherwise, study methods like PCA (Principal Component Analysis).
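For illustration, here is a minimal, self-contained GA wrapper sketch for feature subset selection with a KNN fitness function; the population size, mutation rate, and generation count are arbitrary assumptions that a real study would tune:

```python
# Tiny GA wrapper: individuals are binary feature masks, fitness is
# the cross-validated accuracy of KNN on the selected subset.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = load_wine(return_X_y=True)
n_features = X.shape[1]

def fitness(mask):
    if not mask.any():
        return 0.0
    knn = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(knn, X[:, mask], y, cv=3).mean()

pop = rng.random((20, n_features)) < 0.5           # random initial masks
for generation in range(30):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[::-1][:10]]   # truncation selection
    children = []
    for _ in range(len(pop)):
        a, b = parents[rng.integers(10, size=2)]
        cut = rng.integers(1, n_features)          # one-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        flip = rng.random(n_features) < 0.05       # bit-flip mutation
        children.append(np.where(flip, ~child, child))
    pop = np.array(children)

best = max(pop, key=fitness)
print("Selected features:", np.flatnonzero(best))
print("CV accuracy:", fitness(best))
```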
I developed clustering-based techniques, e.g., "clustering based model order selection of input-output models". You can find the related paper at: http://www.abonyilab.com/clustering