Can you recommend a metaheuristic to train the weights of a neural network used for classification? Are there any criteria that can help in selecting such a method?
There are several methods (sometimes called metaheuristics) that can be applied to optimize the weights of neural networks, but their usefulness and efficiency depend on several aspects:
1. What type of neural network do you have in mind? A multilayer perceptron (MLP)? A radial basis function (RBF) network? A recurrent MLP structure? A deep convolutional network? A restricted Boltzmann machine? Something else?
2. You mention a classification task, so I assume you have a supervised learning problem and, I hope, labeled teacher data. Given that you have teacher data, is there a special reason in your application not to use a gradient-descent-based algorithm?
3. Do you need to find the global optimum, or is a suitable local optimum acceptable with respect to your application?
Since the choice of neural network depends on the nature and dimension of the input patterns, the number of classes (at the output of the network), and the goal, and since the choice of training algorithm depends on the network type, the amount of training data, and the optimization goal, it would help to get an idea of the application you have in mind.
In my 20 years of neural network research, I have found no network-learning combination that works universally for all types of problems. It is necessary to take a closer look at the problem description before proposing an adequate method.
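To make the gradient-descent baseline mentioned in question 2 concrete: when labeled teacher data is available, plain backpropagation is usually the first thing to try before reaching for a metaheuristic. A minimal NumPy sketch (hypothetical toy data and a 2-5-1 MLP, chosen for illustration only):

```python
import numpy as np

# Hypothetical toy data: class 1 when the two features share a sign.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float).reshape(-1, 1)

# 2-5-1 MLP: tanh hidden layer, sigmoid output.
W1 = rng.normal(0, 0.5, (2, 5)); b1 = np.zeros(5)
W2 = rng.normal(0, 0.5, (5, 1)); b2 = np.zeros(1)

lr = 0.5
for _ in range(2000):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    # Backward pass for mean cross-entropy loss.
    dz2 = (p - y) / len(X)          # gradient at the pre-sigmoid output
    dW2 = h.T @ dz2; db2 = dz2.sum(0)
    dz1 = (dz2 @ W2.T) * (1 - h ** 2)   # tanh derivative
    dW1 = X.T @ dz1; db1 = dz1.sum(0)
    # Gradient-descent update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

acc = np.mean((p > 0.5) == (y > 0.5))
print(f"training accuracy: {acc:.2f}")
```

Backpropagation converges to a local optimum, which connects directly to question 3: a metaheuristic is mainly worth the extra cost when such local optima are not good enough.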
Generally, the criterion for selecting a method is performance. Low computational cost, measurement accuracy, and ease of implementation can each serve as such a criterion.
Many metaheuristic methods have been applied to NN training in the literature. Recent options include the Bat Algorithm and the Firefly Algorithm; you can also find implementations of Artificial Bee Colony and Particle Swarm Optimization applied to NN training.
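As an illustration of how these population-based methods are applied to weight training, here is a sketch of standard global-best PSO optimizing the flattened weight vector of a small MLP. The data, network size, and PSO parameters are all hypothetical choices for demonstration, not a recommendation:

```python
import numpy as np

# Hypothetical toy data: class 1 when the two features share a sign.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)

N_IN, N_HID, N_OUT = 2, 5, 1
DIM = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT  # weights + biases

def forward(w, X):
    """Decode a flat weight vector into the 2-5-1 MLP and run it."""
    i = 0
    W1 = w[i:i + N_IN * N_HID].reshape(N_IN, N_HID); i += N_IN * N_HID
    b1 = w[i:i + N_HID]; i += N_HID
    W2 = w[i:i + N_HID * N_OUT].reshape(N_HID, N_OUT); i += N_HID * N_OUT
    b2 = w[i:]
    h = np.tanh(X @ W1 + b1)
    return (1.0 / (1.0 + np.exp(-(h @ W2 + b2)))).ravel()

def loss(w):
    """Mean squared classification error, used as the fitness function."""
    return np.mean((forward(w, X) - y) ** 2)

# Global-best PSO with a common convergent parameter set.
N_PART, ITERS = 30, 200
INERTIA, C1, C2 = 0.7, 1.5, 1.5
pos = rng.uniform(-1, 1, (N_PART, DIM))
vel = np.zeros((N_PART, DIM))
pbest = pos.copy()
pbest_f = np.array([loss(p) for p in pos])
gbest = pbest[pbest_f.argmin()].copy()

for _ in range(ITERS):
    r1, r2 = rng.random((N_PART, DIM)), rng.random((N_PART, DIM))
    vel = INERTIA * vel + C1 * r1 * (pbest - pos) + C2 * r2 * (gbest - pos)
    pos = pos + vel
    f = np.array([loss(p) for p in pos])
    improved = f < pbest_f
    pbest[improved], pbest_f[improved] = pos[improved], f[improved]
    gbest = pbest[pbest_f.argmin()].copy()

acc = np.mean((forward(gbest, X) > 0.5) == (y > 0.5))
print(f"training accuracy: {acc:.2f}")
```

The same flatten-evaluate-update structure carries over to ABC, Bat, Firefly, and the other swarm methods: only the rule for generating new candidate weight vectors changes.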
Training neural networks is a complex task that is important for supervised learning. Several metaheuristic optimization techniques have been applied to make the training process more effective. The Cuckoo Search (CS) algorithm is a recently developed metaheuristic suitable for solving such optimization problems, as is Guaranteed Convergence Particle Swarm Optimization (GCPSO), a PSO variant. In the reported benchmarks, CS proved superior to both PSO and GCPSO. Take a look at:
There are many algorithms you can use to optimize the weights of an ANN, but some cannot minimize the error accurately. If your ANN has more than about 70 weights, consider hybrid or improved evolutionary algorithms. Which of the algorithms cited in the literature to use depends on your problem; you have to read the algorithms' papers to find the best one for your case.
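The weight count grows quickly with network size, which is why the dimensionality of the search space matters when picking an algorithm. A small helper (hypothetical, for illustration) shows how fast a fully connected MLP passes the ~70-parameter range mentioned above:

```python
def mlp_weight_count(layer_sizes):
    """Number of weights plus biases a metaheuristic must optimize
    for a fully connected MLP with the given layer sizes."""
    return sum(a * b + b for a, b in zip(layer_sizes, layer_sizes[1:]))

# A tiny 2-5-1 network stays small...
print(mlp_weight_count([2, 5, 1]))    # 2*5+5 + 5*1+1 = 21
# ...but even a modest 10-8-3 network already exceeds 70 parameters.
print(mlp_weight_count([10, 8, 3]))   # 10*8+8 + 8*3+3 = 115
```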
Thank you everyone for your great answers. This is really very helpful.
Dear Nils,
I will try to answer your questions as much as I can, since my research is at a very early stage.
The application I am targeting is classification of BCI (EEG) signals for emotion detection. The NN to be used is an MLP. I am not yet familiar with the data, but I expect it to have a large number of input features and a small number of output classes. And yes, there should be a training data set available. So far, though, I haven't seen gradient descent applied to training the NN in similar applications; I have seen PSO and other swarm intelligence and evolutionary algorithms, but I am not sure which one is best. As for whether I can be satisfied with a local optimum, the answer is yes.
I hope my answers help clarify the problem, and thanks again for your contribution to my enquiry.
I have used the Artificial Bee Colony algorithm to optimize ANN weights for classifying Boolean functions, cancer classification, and time-series prediction tasks. I found the ABC algorithm more efficient than other supervised learning algorithms.