Reinforcement Learning Toolbox - When does the algorithm train?

I am currently using the RL-Toolbox with a DQN-Agent built into a long-running process-simulation.
The maximum step count is currently 8000 steps per episode.
Unfortunately, the documentation seems a little ambiguous to me, so here is my question:
Does the train function of the RL Toolbox train the agent at the end of an episode, or during the episode once the step count exceeds the minibatch size (like in the baseline algorithms)?
Thank you in advance.

Accepted Answer

Emmanouil Tzorakoleftherakis on 25 Sep 2019
The implementation is based on the algorithm listed here.
The weights are updated at each training time step, not at the end of the episode.
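To illustrate what "updated at each time step" means in practice, here is a minimal Python sketch of a generic DQN-style loop. It is not the toolbox's code: the function name count_updates, the dummy experiences, the buffer length, and the warm-up rule (no update until the buffer holds at least one minibatch) are all assumptions made for illustration. It only counts how many weight updates would happen per episode.

```python
import random
from collections import deque


def count_updates(max_steps_per_episode, num_episodes, minibatch_size,
                  buffer_length=10000, seed=0):
    """Count per-episode weight updates in a generic DQN-style loop.

    Hypothetical illustration only: experiences are dummy tuples and the
    'update' is just a counter, not a real gradient step.
    """
    rng = random.Random(seed)
    buffer = deque(maxlen=buffer_length)  # experience replay buffer
    updates_per_episode = []

    for _ in range(num_episodes):
        updates = 0
        for _ in range(max_steps_per_episode):
            # 1. Act in the environment and store the experience
            #    (state, action, reward, next_state), all dummies here.
            buffer.append((rng.random(), rng.randrange(2),
                           rng.random(), rng.random()))

            # 2. Assumed warm-up rule: skip the update until the buffer
            #    holds at least one minibatch of experiences.
            if len(buffer) >= minibatch_size:
                # 3. Sample a random minibatch and update the weights.
                #    This happens on every step, not at episode end.
                indices = rng.sample(range(len(buffer)), minibatch_size)
                minibatch = [buffer[i] for i in indices]
                assert len(minibatch) == minibatch_size
                updates += 1  # stand-in for one gradient step
        updates_per_episode.append(updates)

    return updates_per_episode


if __name__ == "__main__":
    print(count_updates(max_steps_per_episode=8000, num_episodes=2,
                        minibatch_size=64))
```

With 8000 steps per episode and an assumed minibatch size of 64, this sketch performs 7937 updates in the first episode (the first 63 steps only fill the buffer) and 8000 in every later episode, because the buffer carries over between episodes.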
1 Comment
Hans-Joachim Steinort on 26 Sep 2019
"For each training time step" - that was the line I was looking for (though looking into the source code led me to the same conclusion).
After double-checking the baseline-algorithms I found that they do it the same way.
Thank you for your time!



Release

R2019a
