Expected reward blows up while training (DDPG agent, reinforcement learning)
I am training a DDPG agent, and after around 5000 iterations the model does not seem to converge, while the expected reward keeps increasing exponentially. What could be causing this, and how can I fix it?
Answers (1)
Emmanouil Tzorakoleftherakis
on 12 Oct 2020
Edited: Emmanouil Tzorakoleftherakis
on 12 Oct 2020
Hello,
This answer may be helpful.
I would make sure your reward signal outputs values that make sense, and also possibly simplify the critic network.
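As a rough sanity check on the reward scale, the sketch below is one way to tell whether the critic's estimates are even possible. It assumes the per-step reward is bounded in magnitude by some `r_max` and that the discount factor `gamma` is below 1, in which case no discounted return can exceed `r_max / (1 - gamma)` in magnitude. The function names, the `slack` margin, and the sample numbers are hypothetical and only for illustration; the code is plain Python and does not use any Reinforcement Learning Toolbox API.

```python
def return_bound(r_max, gamma):
    """Upper bound on |discounted return| when every reward satisfies |r| <= r_max."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must be in [0, 1)")
    # Geometric series: r_max * (1 + gamma + gamma^2 + ...) = r_max / (1 - gamma)
    return r_max / (1.0 - gamma)


def critic_is_plausible(q_estimates, r_max, gamma, slack=1.1):
    """Return True if every critic estimate stays within the bound (plus a small margin)."""
    bound = slack * return_bound(r_max, gamma)
    return all(abs(q) <= bound for q in q_estimates)


# Hypothetical numbers: rewards scaled to [-1, 1], discount factor 0.99.
bound = return_bound(r_max=1.0, gamma=0.99)
print(f"bound on |Q|: {bound:.1f}")
print(critic_is_plausible([50.0, -80.0], r_max=1.0, gamma=0.99))  # within the bound
print(critic_is_plausible([50.0, 1e4], r_max=1.0, gamma=0.99))    # far outside the bound
```

If the logged critic estimates grow far past this bound, the critic is likely diverging rather than learning the true value. In that case, rescaling the reward or using a smaller critic network, as suggested above, are reasonable first things to try.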
2 comments
Sayak Mukherjee
on 12 Oct 2020
Emmanouil Tzorakoleftherakis
on 12 Oct 2020
That's right