Is it possible to change RL action values under certain conditions?
I want my agent to output a target value, but in certain situations (when the reward drops dramatically) I want the agent to look for a better solution by letting it change the target value. I tried using an Initial Condition block so that the target value is used at the start. However, my agent (PPO) always outputs an average value after some training episodes.
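One way to read the behavior described above is as a simple switch: keep the fixed target value by default, and pass the agent's action through only after a sharp reward drop. The following is a minimal sketch of that reading, in Python for illustration only. The function name `select_target`, its parameters, and the drop threshold are all hypothetical; none of this is Reinforcement Learning Toolbox or Simulink API.

```python
def select_target(nominal_target, agent_action, recent_rewards, drop_threshold):
    """Pick the value to apply at this step (hypothetical helper).

    nominal_target : fixed target value used by default
    agent_action   : value proposed by the agent at this step
    recent_rewards : list of rewards, oldest first
    drop_threshold : how large a one-step reward drop counts as "dramatic"
                     (an assumed tuning parameter)
    """
    # With fewer than two rewards there is no drop to measure yet,
    # so fall back to the nominal target.
    if len(recent_rewards) < 2:
        return nominal_target

    # One-step drop: positive when the latest reward is lower than the previous one.
    drop = recent_rewards[-2] - recent_rewards[-1]

    # Let the agent's action through only when the drop exceeds the threshold.
    if drop > drop_threshold:
        return agent_action
    return nominal_target


# Small reward change: the nominal target is kept.
steady = select_target(5.0, 7.0, [1.0, 0.9], drop_threshold=0.5)

# Large reward drop: the agent's action is used instead.
after_drop = select_target(5.0, 7.0, [1.0, 0.2], drop_threshold=0.5)
```

Note that with a switch like this the agent's action affects the environment only on the steps where the drop condition holds, which may be worth keeping in mind when interpreting what the agent learns to output.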
5 Comments
Emmanouil Tzorakoleftherakis
on 18 May 2021
Can you provide some more information? What do you mean by letting the agent change the target value? Isn't that what happens by default every time the agent takes an action? What is the environment architecture?
black_cat
on 18 May 2021
Emmanouil Tzorakoleftherakis
on 19 May 2021
Thanks. It's still not clear to me what you mean by "However, this results in having an output of 3 since the agent is averaging it during training". If it's best to output a 6, the agent should do so; why would it average the output? Unless you are talking about the average episode reward that you see in the Episode Manager?
Answers (0)