Can I decide the RL agent's actions?

Question

Sourabh on 2 Sep 2023

https://es.mathworks.com/matlabcentral/answers/2016096-can-i-decide-the-rl-agents-actions

Commented: Sourabh on 28 Oct 2023

I am training a PPO agent, and the issue is that it keeps searching for a better value even after getting close to a stable state.

What I mean is that I want my agent to keep applying its last action values as soon as the error reaches <= 0.05, to prevent oscillations and offset near the set point (as shown in the shared image).

My question is: can I do this in MATLAB? I know it can be done in Python. Any help would be really helpful :)

3 Comments

Sourabh on 3 Sep 2023

Actually, I saw it in an IEEE paper, and when I asked the author, he told me he was using Python.

I don't have any code with me right now, but I feel there must be a way to decide the actions of my agent.

Sourabh on 4 Sep 2023

ppo actions.jpg

Okay, I might get some code in a week or so.

But all I want is to constrain the actions of my PPO agent so that they settle after some time, rather than behaving as shown in the attached image.

Answer 1

Sam Chak on 4 Sep 2023

https://es.mathworks.com/matlabcentral/answers/2016096-can-i-decide-the-rl-agents-actions#answer_1300321

Hi @Sourabh

I believe that it has something to do with the StopTrainingCriteria and StopTrainingValue options of your rlTrainingOptions object. Is the condition "steady-state error ≤ 0.05" reflected in the training termination condition? Typically, the agent will continue to train until MaxEpisodes is reached when the stopping condition is not satisfied.

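maxepisodes  = 6000;   % upper limit on training episodes
maxsteps     = 150;    % upper limit on steps per episode
trainingOpts = rlTrainingOptions(...
    'MaxEpisodes', maxepisodes,...
    'MaxStepsPerEpisode', maxsteps,...
    'ScoreAveragingWindowLength', 5, ...      % episodes used for the average reward
    'Verbose', false,...
    'Plots', 'training-progress',...
    'StopTrainingCriteria', 'AverageReward',... % stop on the average reward ...
    'StopTrainingValue', 1500);                 % ... once it reaches this value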

Also, please note that the rewards obtained by the final agents are not necessarily the greatest achieved during the training episodes. You need to save the agents that meet the "steady-state error ≤ 0.05" condition during training by specifying the SaveAgentCriteria and SaveAgentValue properties in the rlTrainingOptions object.
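A minimal sketch of the save options mentioned above, assuming Reinforcement Learning Toolbox is installed. The threshold of 1400 and the folder name are placeholders, not values from this thread. Note that SaveAgentCriteria works on reward or step statistics rather than on the tracking error directly, so the reward function would have to reflect the "steady-state error ≤ 0.05" condition for this to select the right agents.

```matlab
% Sketch only: the threshold and folder name are assumed placeholders.
saveOpts = rlTrainingOptions(...
    'MaxEpisodes', 6000,...
    'MaxStepsPerEpisode', 150,...
    'SaveAgentCriteria', 'EpisodeReward',...  % save when an episode reward ...
    'SaveAgentValue', 1400,...                % ... equals or exceeds this value
    'SaveAgentDirectory', 'savedAgents');     % folder for the saved agents
```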

2 Comments

Sourabh on 4 Sep 2023

Then why are the DDPG and TD3 agents working fine?

It has nothing to do with the stop-training criteria. I just want my agent's outputs to settle at their previous value as soon as the error reaches 0.05 within a training episode.

Sourabh on 28 Oct 2023

https://github.com/backgom2357/Reinforcement_learning_based_PID_Tuner/blob/master/PPO/ppo_agent.py

Here is an example of a PPO agent for a PID tuner in Python.

Answer 2

Emmanouil Tzorakoleftherakis on 25 Sep 2023

https://es.mathworks.com/matlabcentral/answers/2016096-can-i-decide-the-rl-agents-actions#answer_1317932

Edited: Emmanouil Tzorakoleftherakis on 25 Sep 2023

It seems like the paper you saw uses some logic to implement the behavior you mention. You could do the same with an if statement in MATLAB.
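As a rough illustration of such an if statement, the sketch below latches the last applied action once the absolute error drops to 0.05 or less. It is not taken from the paper or from any toolbox example: the first-order plant, the gains, the noise, and all variable names are assumptions, and `uRaw` is only a stand-in for whatever action the agent produces at each step.

```matlab
% Sketch: hold the last applied action once |error| <= tol.
% The plant, gains, and noise below are assumptions for illustration only.
rng(0);                       % reproducible stand-in for policy noise
tol      = 0.05;              % error band from the question
setpoint = 1.0;               % assumed set point
y        = 0.0;               % assumed plant output
uPrev    = 0.0;               % last action actually applied
held     = false;             % latch: true once the error enters the band
kLatch   = 0;                 % step at which the latch engaged
numSteps = 100;
uLog     = zeros(numSteps, 1);

for k = 1:numSteps
    err  = setpoint - y;
    % Stand-in for the agent's (stochastic) action at this step.
    uRaw = setpoint + 2*err + 0.1*randn;

    if ~held && abs(err) <= tol
        held   = true;        % error is inside the band: start holding
        kLatch = k;
    end

    if held
        u = uPrev;            % keep applying the last action
    else
        u = uRaw;             % pass the agent's action through
    end

    y       = y + 0.1*(u - y);   % assumed first-order plant update
    uPrev   = u;
    uLog(k) = u;
end

fprintf('Latch engaged at step %d, held action %.3f\n', kLatch, uPrev);
```

In a custom environment, a check like this could sit in the step function before the action is applied to the plant; in a Simulink model, it could sit in a block between the agent and the plant. Two caveats with this sketch: the latch never releases, and the held action is not guaranteed to keep the error inside the band, so a release condition may be needed in practice.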

1 Comment

Sourabh on 26 Sep 2023

Do you mean in my script or in my environment?

Could you give an example?
