Reinforcement learning: unable to duplicate the best reward I had during training

Question

lab on 24 Aug 2022

Direct link to this question

https://es.mathworks.com/matlabcentral/answers/1785335-reinforcement-learning-unable-to-dupilcapte-the-best-reward-i-had-during-training

Answered: Emmanouil Tzorakoleftherakis on 26 Jan 2023

I use the MATLAB Reinforcement Learning Toolbox to train an agent, and I set the following rlTrainingOptions:

op = rlTrainingOptions('StopTrainingCriteria','EpisodeReward','StopTrainingValue',100);

Training stops when the episode reward exceeds 100. However, when I simulate with the trained agent, the episode reward is much lower than 100. Does anybody know why? All other conditions are exactly the same.


Answer 1

Emmanouil Tzorakoleftherakis on 26 Jan 2023

Direct link to this answer

https://es.mathworks.com/matlabcentral/answers/1785335-reinforcement-learning-unable-to-dupilcapte-the-best-reward-i-had-during-training#answer_1156805

A single episode's reward reaching the target does not mean the agent will show exactly the same behavior once you stop training. That result may have been influenced by factors such as exploration or environment noise.

Before stopping training, you should be able to see consistently good behavior across multiple consecutive episodes (or a high average episode reward). In that case, after stopping training, the agent's behavior should be close to what you saw in training.
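As a rough sketch of that advice (not taken from the original answer), training could stop on the moving-average reward instead of a single episode's reward, and the trained agent could then be checked over several simulations. The averaging window of 20 episodes and the 10 simulation runs are assumed values chosen only for illustration, and env and agent are assumed to be the environment and agent already in your workspace.

```matlab
% Stop when the average reward over a window of episodes reaches 100,
% rather than when one (possibly lucky) episode does.
op = rlTrainingOptions( ...
    'StopTrainingCriteria','AverageReward', ...
    'StopTrainingValue',100, ...
    'ScoreAveragingWindowLength',20);   % assumed window length

trainingStats = train(agent,env,op);    % env and agent: your existing variables

% Evaluate the trained agent over several runs instead of a single one.
simOpts = rlSimulationOptions('NumSimulations',10);   % assumed number of runs
experiences = sim(env,agent,simOpts);

% Total reward of each simulated episode, then the mean across runs.
totalRewards = arrayfun(@(e) sum(e.Reward.Data), experiences);
fprintf('Mean reward over %d runs: %.2f\n', numel(totalRewards), mean(totalRewards));
```

If the mean over several simulations is still well below the training average, it may be worth checking that the simulation's MaxSteps matches the MaxStepsPerEpisode used in training, since a different episode length changes the achievable total reward.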
