Deep Q-Network Rewards incorporation?

Zonghao zou on 19 Sep 2020
Answered: Sabiya Hussain on 29 Aug 2022
I have read through most of the current documentation on the Deep Q-Network in MATLAB, but it is still not clear to me how to construct a Deep Q-Network for my case.
I previously wrote my own code implementing simple Q-learning, for which I constructed a Q-matrix over the corresponding states and actions. I am now trying to explore how to do the same with a Deep Q-Network.
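For reference, the tabular setup described above can be sketched as follows; the grid size, learning rate, discount factor, and the placeholder transition are illustrative assumptions, not the poster's actual code:

```matlab
% Minimal sketch of a tabular Q-learning update over a Q-matrix.
% All numbers here are illustrative assumptions.
numStates  = 25;                       % e.g. a 5x5 grid, flattened
numActions = 4;                        % e.g. up/down/left/right
alpha = 0.1;  gamma = 0.9;             % learning rate and discount factor

Q = zeros(numStates, numActions);      % the Q-matrix

% One update, given an observed transition (s, a, r, sNext):
s = 1; a = 2; r = -1; sNext = 6;       % placeholder transition
Q(s, a) = Q(s, a) + alpha * (r + gamma * max(Q(sNext, :)) - Q(s, a));
```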
The overall goal is to work out the best policy for an object to move from location A to location B (assuming it is in 2-D).
I have a specific function that encodes all the necessary physical relationships and returns the corresponding reward given the current state and action (let's say it is called F).
I see in the documentation (https://www.mathworks.com/help/reinforcement-learning/ref/rldqnagent.html#d122e15363) that to create an agent I must create observation and action sets.
In my case, since I can return the specific reward per action given the current state, what should I put down as my observations? (How should I incorporate my function F into the agent?)
Also, in the documentation, I don't see anywhere that the agent takes in rewards or calculates rewards for particular actions.
Could someone help me please?
Thanks

Accepted Answer

Emmanouil Tzorakoleftherakis on 24 Sep 2020
Hello,
If you have a look at this page, it shows where the reward is incorporated in a custom MATLAB environment. As you can see, the reward is computed inside the 'step' method, which plays the same role as your F function, so you do not have to do anything different than what you are doing already - you just need to create an environment object.
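A minimal sketch of such an environment, using rlFunctionEnv from the Reinforcement Learning Toolbox: the step function wraps the poster's reward function F. The function names, the four-action set, the start/goal locations, and the assumption that F also returns the next state are all placeholders for the poster's own physics code:

```matlab
% Sketch of a custom environment whose step function calls a
% user-supplied physics/reward function F(state, action).
obsInfo = rlNumericSpec([2 1]);                  % 2-D position of the object
actInfo = rlFiniteSetSpec(1:4);                  % placeholder discrete action set

env = rlFunctionEnv(obsInfo, actInfo, @myStepFunction, @myResetFunction);

function [nextObs, reward, isDone, loggedSignals] = myStepFunction(action, loggedSignals)
    state = loggedSignals.State;
    % Assumption: F returns both the next state and the reward for (state, action)
    [nextState, reward] = F(state, action);
    loggedSignals.State = nextState;
    nextObs = nextState;
    % Episode ends when the object is close enough to location B
    isDone = norm(nextState - loggedSignals.Goal) < 0.1;
end

function [initialObs, loggedSignals] = myResetFunction()
    loggedSignals.State = [0; 0];                % location A (placeholder)
    loggedSignals.Goal  = [1; 1];                % location B (placeholder)
    initialObs = loggedSignals.State;
end
```

Once the environment exists, the reward never needs to be passed to the agent directly: train(agent, env) calls the step function on every interaction and receives the reward from it.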

More Answers (3)

Madhav Thakker on 23 Sep 2020
Edited: Madhav Thakker on 23 Sep 2020
Hi Zonghao,
I understand you want to construct a Deep Q-Network. The observationInfo tells you the behaviour for your observations. In your case, you want to move an object on a grid. The observations can be the position of the object on the grid. So, your observationInfo will be rlFiniteSetSpec.
obsInfo = rlNumericSpec([2 1])
This creates an observation specification of dimension [2 1]. If required, you can also specify upper and lower limits for your observations.
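For example, bounds can be passed as name-value arguments when the spec is created (the limits and name below are illustrative):

```matlab
% Observation spec for a 2-D position with illustrative bounds
obsInfo = rlNumericSpec([2 1], ...
    'LowerLimit', [0; 0], ...
    'UpperLimit', [10; 10]);
obsInfo.Name = 'object position';
```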
Hope this helps.
  1 comment
Zonghao zou on 24 Sep 2020
Hi Madhav,
The thing is, I don't want to specify the movement that results from choosing an action. For example, in the grid situation, when I choose "right" the object moves right: a specific action determines a specific movement. In my case, however, I have no idea where the object will move when it takes any one of the possible actions.
All results come from my governing physics equations. I have attached a graph that explains what I want to achieve.
Any help will be appreciated! Thank you



Sabiya Hussain on 29 Aug 2022
Hello there! I'm working on a project based on Q-learning, and I really need some help with a MATLAB program for a Markov decision process. It is the recycling-robot example. I need your help.
