Reinforcement Learning does not show that training occurs?
shadi abpeikar
on 12 Mar 2021
Edited: Emmanouil Tzorakoleftherakis
on 19 Mar 2021
Hi, I have a reinforcement learning agent in a continuous state/action space. I trained it for 2000 episodes; each episode has a maximum of 10 steps and ends when the agent obtains a positive reward greater than 10 or reaches the maximum number of steps. Here is the training progress of this off-policy reinforcement learning agent. When tested on some samples, the agent visibly behaves as if training has happened. But I cannot understand why the plot does not show the usual RL training trend (starting from low rewards and rising to high rewards). I tried some of the suggestions from MathWorks answers, such as changing the OU noise, the deep neural network settings of the actor and critic, and the reward function, but the reward just fluctuates as follows. I would appreciate it if someone could help me with this.
3 comments
Emmanouil Tzorakoleftherakis
on 18 Mar 2021
It's also not clear what the question is. How did you get the plot above? The x-axis does not show all training episodes.
Answers (1)
Emmanouil Tzorakoleftherakis
on 18 Mar 2021
Thanks for the info. I think this is a scaling issue with the plot. In the Episode Manager you can uncheck "Q0" (orange line); with that line displayed, the plot's scale prevents you from seeing the training trends more closely.
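A minimal sketch of another way to check for a trend independently of the plot's scale: smooth the per-episode rewards with a trailing moving average. This is plain Python rather than MATLAB, and the names `moving_average` and `episode_rewards`, the window of 100 episodes, and the synthetic reward values are all assumptions made for illustration; they are not taken from the poster's training run.

```python
def moving_average(rewards, window):
    """Trailing mean over the last `window` episodes (fewer at the start)."""
    if window < 1:
        raise ValueError("window must be >= 1")
    averaged = []
    running_sum = 0.0
    for i, r in enumerate(rewards):
        running_sum += r
        if i >= window:
            # Drop the reward that just fell out of the window.
            running_sum -= rewards[i - window]
        averaged.append(running_sum / min(i + 1, window))
    return averaged


# Hypothetical data: rewards that fluctuate strongly from episode to
# episode while drifting upward slowly (not real training results).
episode_rewards = [(-5.0 if i % 2 else 5.0) + 0.01 * i for i in range(2000)]
smoothed = moving_average(episode_rewards, window=100)
```

If the smoothed curve stays flat over all episodes, the fluctuation is probably not hiding an upward trend; if it rises, the raw plot was likely obscuring the progress.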
2 comments
Emmanouil Tzorakoleftherakis
on 19 Mar 2021
Edited: Emmanouil Tzorakoleftherakis
on 19 Mar 2021
Well, that means that your agent is not learning anything, in which case you have to go back and see what you can change to improve training. I would recommend starting with the reward signal.