I'm trying to write a program that plays a game similar to the Atari games. Here are the details:

In this game, objects fall at different angles and in different directions, and the agent's goal is to intercept them using a turret. Each state consists of the feedback the environment returns after an action: the turret's angle, the objects' locations, the interceptions' locations, and the total score. The agent has four possible actions to choose from.
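To make the state description concrete, here is a minimal sketch of how such feedback could be flattened into a fixed-length vector for a network. It is not taken from the linked code: `encode_state`, `MAX_OBJECTS`, `MAX_INTERCEPTORS`, and the zero-padding scheme are all assumptions made for illustration.

```python
from typing import List, Tuple

MAX_OBJECTS = 3        # assumed cap on tracked falling objects
MAX_INTERCEPTORS = 2   # assumed cap on tracked interceptions

Point = Tuple[float, float]


def _pad(points: List[Point], size: int) -> List[float]:
    """Flatten up to `size` (x, y) pairs and zero-pad to a fixed length."""
    flat = [coord for point in points[:size] for coord in point]
    return flat + [0.0] * (2 * size - len(flat))


def encode_state(angle: float,
                 objects: List[Point],
                 interceptors: List[Point],
                 score: float) -> List[float]:
    """Build one fixed-length input vector from the environment feedback."""
    return ([angle]
            + _pad(objects, MAX_OBJECTS)
            + _pad(interceptors, MAX_INTERCEPTORS)
            + [score])


# One falling object, no interceptions in flight yet.
state = encode_state(45.0, [(10.0, 200.0)], [], 0.0)
print(len(state))  # 1 + 2*3 + 2*2 + 1 = 12
```

A fixed length matters because the network's input layer has a fixed size, while the number of objects on screen can change from step to step.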

I have decided to implement the program using deep Q-learning, with loss and gradient descent calculations. I'm using TensorFlow with Python 3.6.7.
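For reference, here is a minimal sketch of the kind of update Q-learning with a squared loss and gradient descent performs. It assumes a linear Q-function in plain Python instead of a deep network in TensorFlow, and the names (`q_values`, `td_update`), the constants, and the toy transition are hypothetical. In this sketch the state is the input to the Q-function, while the reward appears only in the training target.

```python
GAMMA = 0.99           # assumed discount factor
LEARNING_RATE = 0.01   # assumed step size
NUM_ACTIONS = 4        # the game has four actions
STATE_SIZE = 3         # toy state size for this sketch


def q_values(weights, state):
    """Linear Q-function: one dot product per action."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def td_update(weights, state, action, reward, next_state, done):
    """One gradient-descent step on 0.5 * (Q(s, a) - target)^2.

    Returns the loss measured before the step. The target is treated
    as a constant, so only Q(s, a) is differentiated.
    """
    # The reward enters here, in the target, not as a network input.
    target = reward
    if not done:
        target += GAMMA * max(q_values(weights, next_state))
    # The state enters here, as the input to the Q-function.
    prediction = q_values(weights, state)[action]
    error = prediction - target
    # Gradient of the loss w.r.t. the chosen action's weights is error * state.
    for i, s in enumerate(state):
        weights[action][i] -= LEARNING_RATE * error * s
    return 0.5 * error ** 2


weights = [[0.0] * STATE_SIZE for _ in range(NUM_ACTIONS)]
state, next_state = [1.0, 0.5, -0.5], [0.9, 0.4, -0.4]
first_loss = td_update(weights, state, 2, 1.0, next_state, False)
second_loss = td_update(weights, state, 2, 1.0, next_state, False)
print(first_loss, second_loss)
```

Only the weights of the action that was actually taken are updated, which is why the transition has to record the chosen action alongside the state, reward, and next state.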

I'm not sure I'm passing the parameters to the training step as I should. I've used some guides to write this program, and I can't wrap my head around one main issue: at which stage are parameters such as the state and the reward fed into the network? Perhaps someone has insight into this and can help me understand whether I'm doing something wrong.

Here's the full code:

https://github.com/ElinorG11/Intrcpt/blob/master/IntrcptNN%20using%20tensorflow

Currently the code won't run because of a dimension problem, which I'm trying to fix. You can see it here in the full traceback:

https://github.com/ElinorG11/Intrcpt/blob/master/FullTraceback%20of%20compilation%20problem

Until I find a way to solve this, I would very much like to understand whether I'm feeding the data into my network correctly. If I am, I'd be grateful if anyone could explain further how it works; if I'm not, could anyone point me to a further explanation of this topic? I've read a book, some articles, and guides, but most of the programs in them used either screen input or labeled data, which is different from my case.

Thank you very much for your time and attention!
