Applying dropout to kernels in a CNN means that, during training, some of the kernels are randomly switched off on each forward pass, temporarily reducing the effective capacity of the model. This can help prevent overfitting by limiting the model's complexity and making it less likely to memorize the training data.
Applying dropout to feature maps in a CNN means that, during training, some of the activations in the feature maps are randomly set to zero. This can help prevent overfitting by stopping the model from relying too heavily on any individual activation and from memorizing fine-grained details of the training data.
In general, applying dropout to feature maps tends to improve the generalization performance of a CNN more reliably, and it is also simpler and cheaper to implement than applying dropout to kernels. However, the best approach depends on the specific problem and the architecture of the CNN, and may require experimentation to determine the optimal configuration.
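For concreteness, here is a minimal PyTorch sketch of the two options discussed above: channel-wise dropout (`nn.Dropout2d`), which zeroes whole feature maps and so roughly corresponds to switching off a kernel's output, versus ordinary element-wise dropout (`nn.Dropout`) on the feature-map activations. The layer sizes and dropout rate are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

# Illustrative conv block; sizes and dropout rate are arbitrary assumptions.
class SmallConvBlock(nn.Module):
    def __init__(self, channel_wise: bool = True, p: float = 0.25):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.act = nn.ReLU()
        # Dropout2d zeroes entire feature maps (channels), i.e. the
        # "switch off a kernel's output" view; Dropout zeroes individual
        # activations within the feature maps.
        self.drop = nn.Dropout2d(p) if channel_wise else nn.Dropout(p)

    def forward(self, x):
        return self.drop(self.act(self.conv(x)))

x = torch.randn(8, 3, 32, 32)           # batch of 8 RGB 32x32 images
block = SmallConvBlock(channel_wise=True)
block.train()                            # dropout is only active in training mode
print(block(x).shape)                    # torch.Size([8, 16, 32, 32])
```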
Thanks for your valuable answer. But in the normal case, we drop a neuron that has weights. The feature map, which is the output of the kernel, has no weights; it is the kernel that carries the weights, if you think in terms of ordinary neural-network neurons. So is it a good approach to apply dropout to the feature map?
Yes, you are correct. In the traditional dropout technique, the idea is to randomly drop out (i.e., set to zero) some of the activations in the hidden layer with a certain probability (e.g., 0.5) during each training iteration. This has the effect of reducing the representational capacity of the network and preventing overfitting, as the network cannot rely on any single activation to make predictions.
Applying dropout to the feature map in a Convolutional Neural Network (CNN) is equivalent to dropping out some of the activations in the hidden layer, as you mentioned. This can help prevent overfitting, since the network will not be able to rely on any single feature to make its predictions.
So, to answer your question, it is a valid approach to use the feature map for dropout in a CNN. However, as I mentioned earlier, the best choice of regularization technique depends on the specific problem and architecture, and it may be necessary to experiment with different methods to find the best solution.
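To make the equivalence concrete, here is a small sketch that applies dropout directly to a feature-map tensor, exactly as one would to a hidden-layer activation vector. The tensor shape and dropout rate are arbitrary, and the scaling shown is the standard inverted-dropout behaviour.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
feature_map = torch.randn(1, 4, 5, 5)          # (batch, channels, H, W)

# Treat the feature-map activations like hidden-layer activations and drop them.
dropped = F.dropout(feature_map, p=0.5, training=True)

# Roughly half the activations are zeroed; the survivors are scaled by
# 1/(1-p) so the expected activation magnitude is unchanged at test time.
print((dropped == 0).float().mean())                            # ~0.5
print(dropped[dropped != 0][0] / feature_map[dropped != 0][0])  # 2.0
```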
I don't believe that dropout placed before another convolutional layer makes sense. Dropout does not really ignore a feature in the feature map; it sets it to 0, which is a perfectly valid activation value. This can introduce patterns in the feature map that do not exist without dropout. When another convolution is applied on top, its kernels have to learn to cope with random patterns that they will never see at deployment time.
Dropout on top of the *last* convolutional layer is surely useful, but not in between convolutional layers. This is equivalent to applying dropout after flattening, in front of the first fully-connected layer, which is how dropout was originally designed to be used.
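As a sketch of that placement (with arbitrary layer sizes and class count), dropout would appear only after the last convolutional layer and the flatten step, immediately before the fully-connected classifier:

```python
import torch
import torch.nn as nn

# Hypothetical model illustrating the placement described above:
# no dropout between convolutions, dropout only before the classifier.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),   # no dropout between convs
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Dropout(0.5),                              # dropout after flattening
    nn.Linear(32, 10),                            # 10 classes, arbitrary choice
)

x = torch.randn(4, 3, 32, 32)
print(model(x).shape)   # torch.Size([4, 10])
```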
However, I have to admit that even experts in deep learning (and I give a lecture on that topic) have little real insight into how these beasts learn their jobs, so I agree with Imrus Salehin that the best approach is simply to try it out.