Deep Q-Network Rewards incorporation?

Zonghao zou on 19 Sep 2020
Answered: Sabiya Hussain on 29 Aug 2022
I have read through most of the current documentation on the Deep Q-Network in MATLAB, but it is still not clear to me how to construct a Deep Q-Network for my case.
I previously wrote my own code implementing simple Q-learning, for which I constructed a Q-matrix over the corresponding states and actions. I am now trying to explore how to do the same with a Deep Q-Network.
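For reference, the tabular setup described above can be sketched as follows; the grid size, learning rate, discount factor, and the placeholder transition are illustrative assumptions, not the poster's actual code:

```matlab
% Minimal sketch of a tabular Q-learning update over a Q-matrix.
% All numbers here are illustrative assumptions.
numStates  = 25;                       % e.g. a 5x5 grid, flattened
numActions = 4;                        % e.g. up/down/left/right
alpha = 0.1;  gamma = 0.9;             % learning rate and discount factor

Q = zeros(numStates, numActions);      % the Q-matrix

% One update, given an observed transition (s, a, r, sNext):
s = 1; a = 2; r = -1; sNext = 6;       % placeholder transition
Q(s, a) = Q(s, a) + alpha * (r + gamma * max(Q(sNext, :)) - Q(s, a));
```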
The overall goal is to work out the best policy for an object to move from location A to location B (assuming it is in 2-D).
I have a specific function that encodes all the necessary physical relationships and returns the corresponding reward given the current state and action (let's say it is called F).
I see in the documentation (https://www.mathworks.com/help/reinforcement-learning/ref/rldqnagent.html#d122e15363) that to create an agent I must create observation and action sets.
In my case, since I can return the specific reward per action given the current state, what should I put down as my observations? (How should I incorporate my function F into the agent?)
Also, in the documentation, I don't see anywhere that the agent takes in rewards or calculates rewards for particular actions.
Could someone help me please?
Thanks

Accepted Answer

Emmanouil Tzorakoleftherakis on 24 Sep 2020
Hello,
If you have a look at this page, it shows where the reward is incorporated in a custom MATLAB environment. As you can see, the reward is computed inside the 'step' method, which plays the same role as your F function, so you do not have to do anything different than what you are doing already - you just need to create an environment object.
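A minimal sketch of such an environment, using rlFunctionEnv from the Reinforcement Learning Toolbox: the step function wraps the poster's reward function F. The function names, the four-action set, the start/goal locations, and the assumption that F also returns the next state are all placeholders for the poster's own physics code:

```matlab
% Sketch of a custom environment whose step function calls a
% user-supplied physics/reward function F(state, action).
obsInfo = rlNumericSpec([2 1]);                  % 2-D position of the object
actInfo = rlFiniteSetSpec(1:4);                  % placeholder discrete action set

env = rlFunctionEnv(obsInfo, actInfo, @myStepFunction, @myResetFunction);

function [nextObs, reward, isDone, loggedSignals] = myStepFunction(action, loggedSignals)
    state = loggedSignals.State;
    % Assumption: F returns both the next state and the reward for (state, action)
    [nextState, reward] = F(state, action);
    loggedSignals.State = nextState;
    nextObs = nextState;
    % Episode ends when the object is close enough to location B
    isDone = norm(nextState - loggedSignals.Goal) < 0.1;
end

function [initialObs, loggedSignals] = myResetFunction()
    loggedSignals.State = [0; 0];                % location A (placeholder)
    loggedSignals.Goal  = [1; 1];                % location B (placeholder)
    initialObs = loggedSignals.State;
end
```

Once the environment exists, the reward never needs to be passed to the agent directly: train(agent, env) calls the step function on every interaction and receives the reward from it.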

More Answers (3)

Madhav Thakker on 23 Sep 2020
Edited: Madhav Thakker on 23 Sep 2020
Hi Zonghao,
I understand you want to construct a Deep Q-Network. The observationInfo tells you the behaviour for your observations. In your case, you want to move an object on a grid. The observations can be the position of the object on the grid. So, your observationInfo will be rlFiniteSetSpec.
obsInfo = rlNumericSpec([2 1])
This creates an observation specification of dimension [2 1]. If required, you can also specify upper and lower limits for your observations.
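For example, bounds can be passed as name-value arguments when the spec is created (the limits and name below are illustrative):

```matlab
% Observation spec for a 2-D position with illustrative bounds
obsInfo = rlNumericSpec([2 1], ...
    'LowerLimit', [0; 0], ...
    'UpperLimit', [10; 10]);
obsInfo.Name = 'object position';
```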
Hope this helps.
  1 comment
Zonghao zou on 24 Sep 2020
Hi Madhav,
The thing is, I don't want to specify the movement that results from choosing an action. For example, in the grid situation, when I choose "right" the object moves right: a specific action determines a specific movement. In my case, however, I have no idea where the object will move when it takes any one of the possible actions.
All results come from my governing physics equations. I have attached a graph that explains what I want to achieve.
Any help will be appreciated! Thank you



Sabiya Hussain on 29 Aug 2022
Hello there! I'm working on a project based on Q-learning, and I really need some help with a MATLAB program for a Markov decision process. It is the recycling-robot example. I need your help.
