Hello everyone,
I am working on a pursuit problem in which one point mass tries to catch another. The dynamics are correct and implemented with a PPO agent using MATLAB's Reinforcement Learning Toolbox. I have normalised the reward and the observations. The reward is 2*exp(-0.005*(e_x^2+e_z^2)), where e_x and e_z are the position errors in the x and z directions.
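For clarity, here is a sketch of that reward in MATLAB (the variable names target_x, agent_x, etc. are illustrative, not from my actual code):

```matlab
% Illustrative reward computation; peaks at 2 when both errors are zero
e_x = target_x - agent_x;              % position error in x
e_z = target_z - agent_z;              % position error in z
reward = 2*exp(-0.005*(e_x^2 + e_z^2));
```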
I am a little confused about tuning the hyperparameters. Currently I am using:
"ExperienceHorizon",100,...
"ClipFactor",0.1,...
"EntropyLossWeight",0.01,...
"MiniBatchSize",75,...
"NumEpoch",5,...
"AdvantageEstimateMethod","gae",...
"GAEFactor",0.95,...
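In context, these are passed to rlPPOAgentOptions roughly like this (a sketch; the variable name opts and the surrounding call are just how I set it up):

```matlab
% PPO agent options (Reinforcement Learning Toolbox)
opts = rlPPOAgentOptions( ...
    "ExperienceHorizon",100, ...
    "ClipFactor",0.1, ...
    "EntropyLossWeight",0.01, ...
    "MiniBatchSize",75, ...
    "NumEpoch",5, ...
    "AdvantageEstimateMethod","gae", ...
    "GAEFactor",0.95);
```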
Each episode lasts 10 seconds of simulated time (100 steps). The agent does not seem to learn even after 20,000 iterations.
I would be grateful if someone could point me in the right direction. Thanks in advance.