How best to implement Q-learning?

Chinedu Pascal Ezenkwu

Thanks, but my question has not been addressed by your post.

Vishnu Raj

For your example, you want the agent to try staying in one cell for some time, to find out whether the waiting produces any result. I think you can add the amount of wait time as another dimension of the state: if you want to allow a maximum wait of W steps in each cell, you can use a state space of size 7x7xWx5, but after the W-th wait the number of available options should be limited to 4 actions. Note that this state space cannot capture an arbitrary wait time: you need a cap on the maximum wait time!
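To make the idea concrete, here is a minimal tabular sketch in Python. It is only an illustration under several assumptions that are not in the post: the 5 actions are taken to be four moves plus "wait", the cap is set to W = 3, moving (even into a wall) resets the wait counter, and the helper names `allowed_actions`, `step_state` and `q_update` are hypothetical. The wait axis is given W + 1 entries so the counter can run from 0 to W, whereas the post writes the size simply as W. Rewards depend on your environment and are not modelled.

```python
import numpy as np

GRID, W = 7, 3                  # 7x7 grid from the post; W = 3 is an arbitrary cap
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # assumed: up, down, left, right
WAIT = 4                        # assumed index of the "wait" action
N_ACTIONS = 5

# Q[row, col, waits_so_far, action]; the wait axis holds counts 0..W
Q = np.zeros((GRID, GRID, W + 1, N_ACTIONS))

def allowed_actions(state):
    """All 5 actions, or only the 4 moves once the wait cap is reached."""
    _, _, waits = state
    return list(range(N_ACTIONS)) if waits < W else list(range(4))

def step_state(state, action):
    """Return the next state only; the reward is left to the environment."""
    row, col, waits = state
    if action == WAIT:
        return (row, col, waits + 1)
    d_row, d_col = MOVES[action]
    row = min(max(row + d_row, 0), GRID - 1)  # clip at the grid border
    col = min(max(col + d_col, 0), GRID - 1)
    return (row, col, 0)        # any move resets the wait counter

def q_update(state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One Q-learning update, maximising over the allowed actions only."""
    best_next = max(Q[next_state + (a,)] for a in allowed_actions(next_state))
    target = reward + gamma * best_next
    Q[state + (action,)] += alpha * (target - Q[state + (action,)])
```

When choosing an action (for example epsilon-greedy), the agent would likewise pick only from `allowed_actions(state)`, so the "wait" entry is never selected in states where the counter has reached W.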

Vishnu Raj, thanks for your intelligent response.