How can we make the training loss decrease in reinforcement learning?

Fabrice Noreils

Hi Chen

If you want people to help you here, you first need to describe clearly the problem you are working on:

What is your problem?

How did you model it? (I guess you are implementing a DQN algorithm.)

- state input vector

- output vector (usually you work with the Q-value, not V)

- what kind of neural net

- target for policy evaluation (Monte Carlo, temporal difference with or without n-step bootstrapping...)

- ε-greedy exploration or not? What exploration versus exploitation ratio?

- replay buffer

- batch size?

- what is your loss function

- and finally your optimizer...

Then I think you can find somebody out there who will be able to help you; otherwise it is impossible to answer your question.
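To make a few of the items in the checklist concrete, here is a minimal sketch of ε-greedy action selection, a one-step temporal-difference target, and a squared TD loss. It assumes a DQN-style setup with a discrete action space; the function names and the default `gamma` are illustrative choices, not details taken from the original question.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step TD target: r + gamma * max_a' Q(s', a').

    No bootstrapping is done when the transition ends the episode.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def squared_td_loss(q_sa, target):
    """Half squared TD error for a single transition."""
    return 0.5 * (q_sa - target) ** 2


# Hypothetical numbers for one transition.
action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.1)
target = td_target(reward=1.0, next_q_values=[0.0, 2.0], done=False)
loss = squared_td_loss(q_sa=1.5, target=target)
```

Because the target itself depends on the network's own estimates, a loss of this kind does not have to decrease steadily the way a supervised loss usually does, which is one reason the details above matter when diagnosing it.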

Regards

Wang Chunpeng

I have run into this before: when the weight coefficient of the L2 regularization is too large, the loss function goes up.
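One way to check for this is to log the data term and the L2 penalty separately, so it is visible which part of the total is growing. The sketch below is only an illustration: `loss_breakdown`, the weights, and the coefficients are hypothetical and not taken from any particular framework.

```python
def loss_breakdown(data_loss, weights, l2_coeff):
    """Return (total, data term, penalty term) for an L2-regularized loss."""
    penalty = l2_coeff * sum(w * w for w in weights)
    return data_loss + penalty, data_loss, penalty


weights = [0.5, -1.0, 2.0]  # hypothetical weights; sum of squares is 5.25
for coeff in (1e-4, 1e-1, 1.0):
    total, data, penalty = loss_breakdown(0.2, weights, coeff)
    print(f"l2_coeff={coeff:g} total={total:.4f} data={data:.4f} penalty={penalty:.4f}")
```

If the penalty term dominates the total, lowering the coefficient is a reasonable first thing to try.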

Chen Ziheng

Yes, it combines DQN and PG. The training data is updated by the policy, and we sample it from the data buffer to update the policy. Thank you.
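For readers following the thread, the data flow described here (the policy writes transitions into a buffer, and updates are computed from random samples of it) can be sketched as below. This is an assumed, generic replay buffer; the class name, capacity, and transition layout are illustrative and are not the poster's actual implementation.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity, seed=None):
        # The oldest transitions are dropped once capacity is reached.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Uniformly sample up to batch_size transitions without replacement."""
        batch_size = min(batch_size, len(self.buffer))
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Hypothetical usage: the acting policy fills the buffer, training samples from it.
buf = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    buf.add(state=step, action=0, reward=1.0, next_state=step + 1, done=False)
batch = buf.sample(2)
```

Samples drawn this way were generated by earlier versions of the policy, which is worth keeping in mind when the same data feeds a policy-gradient update.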

Erik Cuevas

Combine the policies depending on the approaches used.