I'm training an agent to accomplish a reaching task. The agent controls a multi-joint robotic arm and has to reach for a target. So far, I've had some success with vanilla policy gradient but, to my surprise, I can't get it to work with actor-critic.

I'm wondering how I can find out what makes it fail. I've tried various reward functions, but none of them was robust enough, so I'd like to monitor the agent more closely during training. What values do you think might give some insight?

I've thought of:

  • Critic/Value loss
  • Actor/Policy mean loss and variance
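To make the question concrete, here is a minimal sketch of the kind of per-update diagnostics I have in mind, assuming a Gaussian policy over joint torques. The function and variable names are just placeholders, not code from my actual agent:

```python
import numpy as np

def actor_critic_diagnostics(returns, values, advantages, log_stds):
    """Compute a few training diagnostics for an actor-critic agent.

    returns    : empirical (discounted) returns for each state in the batch
    values     : critic predictions V(s) for the same states
    advantages : advantage estimates used in the policy update
    log_stds   : log standard deviations of the Gaussian policy (one per action dim)
    """
    returns, values, advantages = map(np.asarray, (returns, values, advantages))

    # Critic quality: fraction of the variance in the returns the critic explains.
    # ~1.0 means the critic fits the returns well; <= 0 means it is useless or diverging.
    var_returns = np.var(returns)
    explained_var = np.nan if var_returns == 0 else 1.0 - np.var(returns - values) / var_returns

    # Critic loss (MSE between predictions and returns).
    value_loss = np.mean((returns - values) ** 2)

    # Advantage statistics: if the mean drifts far from 0 or the std explodes,
    # the policy gradient is likely dominated by a few transitions.
    adv_mean, adv_std = advantages.mean(), advantages.std()

    # Entropy of a diagonal Gaussian policy: entropy collapsing early usually
    # means premature convergence to a (possibly bad) deterministic policy.
    log_stds = np.asarray(log_stds)
    entropy = np.sum(log_stds + 0.5 * np.log(2 * np.pi * np.e))

    return dict(explained_variance=explained_var,
                value_loss=value_loss,
                advantage_mean=adv_mean,
                advantage_std=adv_std,
                policy_entropy=entropy)


if __name__ == "__main__":
    # Dummy batch just to show the call; in practice these come from the rollout buffer.
    rng = np.random.default_rng(0)
    returns = rng.normal(size=256)
    values = returns + rng.normal(scale=0.5, size=256)   # a critic that fits reasonably well
    advantages = returns - values
    log_stds = np.log([0.3, 0.3, 0.3])                   # e.g. 3 joint torques
    print(actor_critic_diagnostics(returns, values, advantages, log_stds))
```

Are these the right quantities to track, and are there others (gradient norms, KL between successive policies, etc.) that are more informative?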

Thanks!
