I am currently working with gradient descent with momentum (GDM) in a neural network that predicts the color properties of a polymer. The results I obtain are non-monotonic. What are the reasons for non-monotonicity in GDM?
I suppose your question is about non-monotonic behaviour of the error as training proceeds.
Stochastic gradient descent is intrinsically non-monotonic, but only on small scales.
If you see strong non-monotonicity on long time scales (error going down, then strongly going up, then down again, etc.), it might just be that your momentum term is too high.
Check your implementation with a zero momentum term, then add a small momentum term to see whether it speeds up training and/or improves performance.
Choose the value of this momentum term based on the best performance on a validation set, as in the sketch below.
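A minimal sketch of that procedure, assuming a synthetic least-squares problem, a plain linear model, and a hand-written SGD loop stand in for your actual polymer-color network:

```
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples, 5 features, split into train / validation sets.
X = rng.normal(size=(200, 5))
true_w = rng.normal(size=5)
y = X @ true_w + 0.1 * rng.normal(size=200)
X_tr, y_tr = X[:150], y[:150]
X_va, y_va = X[150:], y[150:]

def train(momentum, lr=0.05, epochs=50, batch=16):
    """SGD with heavy-ball momentum on a linear least-squares model."""
    w = np.zeros(5)
    v = np.zeros(5)                        # velocity: running average of gradients
    for _ in range(epochs):
        order = rng.permutation(len(X_tr))
        for start in range(0, len(X_tr), batch):
            b = order[start:start + batch]
            grad = 2.0 * X_tr[b].T @ (X_tr[b] @ w - y_tr[b]) / len(b)
            v = momentum * v + grad        # accumulate past gradients
            w -= lr * v
    return np.mean((X_va @ w - y_va) ** 2)  # validation MSE

# Step 1: sanity-check the implementation with zero momentum.
# Step 2: add small momentum values and keep whichever does best on validation.
for m in (0.0, 0.3, 0.6, 0.9):
    print(f"momentum={m:.1f}  validation MSE={train(m):.4f}")
```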
Momentum in SGD smooths the updates by combining the gradients from the last several mini-batches (a running average). The update direction becomes something like g_t + 0.8*g_{t-1} + (0.8)^2*g_{t-2} + ..., where g_t is the gradient from the current mini-batch and 0.8 is the momentum coefficient. Plain SGD would use just the last gradient.
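A worked toy example (the gradient vectors and the 0.8 coefficient below are made up for illustration) showing that the recursive velocity update used in implementations unrolls into exactly that weighted sum:

```
import numpy as np

mu = 0.8                                   # momentum coefficient
grads = [np.array([1.0, -2.0]),            # toy mini-batch gradients g_1, g_2, g_3
         np.array([0.5, 0.5]),
         np.array([-1.0, 1.0])]

# Recursive form, as implemented in SGD with momentum: v_t = mu * v_{t-1} + g_t.
v = np.zeros(2)
for g in grads:
    v = mu * v + g

# Unrolled form: exponentially weighted sum g_t + mu*g_{t-1} + mu^2*g_{t-2} + ...
unrolled = sum(mu ** k * g for k, g in enumerate(reversed(grads)))

print(v, unrolled)   # both print the same vector
```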
A reasonable momentum value will improve convergence speed. It may also lead to better results, since momentum can help escape shallow local minima.
In general, you have to be careful with SGD. A high learning rate, especially combined with high momentum, will result in unstable convergence and may even cause your solution to diverge completely. What counts as "high" depends on your network. All of this assumes that your implementation of SGD is correct.
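As a toy illustration, assuming a one-dimensional quadratic loss f(w) = 0.5 * w**2 so the instability is easy to see:

```
def run(lr, momentum, steps=100):
    """Heavy-ball SGD on the toy loss f(w) = 0.5 * w**2 (its gradient is w)."""
    w, v = 1.0, 0.0
    for _ in range(steps):
        grad = w
        v = momentum * v + grad
        w -= lr * v
    return abs(w)

print("moderate lr:", run(lr=0.1, momentum=0.9))   # shrinks toward 0
print("high lr:    ", run(lr=4.0, momentum=0.9))   # oscillates and blows up
```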