
Easy way to evaluate / compare the performance of an RL algorithm

Saurav Sthapit on 29 Jul 2020
Edited: Saurav Sthapit on 6 Aug 2020
I have an RL agent trained and would like to compare its performance against a dumb agent. I can run simout=sim(env,agent,simOpts) to evaluate the trained agent, but I would like to compare its simulation results with those of a couple of dumb agents that always take the same action or a random action. Is there an easy way to do this?
Currently, I have a separate Simulink model without the RL Agent block (replaced with a Constant block), and I log the observations and rewards using the Simulation Data Inspector.
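For context, the trained-agent evaluation above looks roughly like this (the option values are placeholders, not my actual settings):
simOpts = rlSimulationOptions('MaxSteps',500,'NumSimulations',10); % example values only
simout = sim(env,agent,simOpts); % returns experiences: Action, Observation, Reward, SimulationInfo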
Thanks
Saurav

Answers (1)

Emmanouil Tzorakoleftherakis on 3 Aug 2020
Why not use a MATLAB Fcn block and implement the dummy agent in there? If you want random/constant actions, it should be just one line.
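For example, a minimal sketch of the MATLAB Function block body, assuming a scalar action bounded by actMin/actMax (placeholder values; match them to your action space):
function action = dummyAgent(obs) %#ok<INUSD> the observation is deliberately ignored
actMin = -1; actMax = 1; % assumed action bounds
action = actMin + (actMax - actMin)*rand; % random action each step
% action = 0.5; % ...or a constant action for the fixed baseline
end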
1 comment
Saurav Sthapit on 6 Aug 2020
Edited: Saurav Sthapit on 6 Aug 2020
Thanks, that's an excellent suggestion for evaluating random actions.
However, when I do that (or use Constant blocks), I have to run the two statements below: the first to evaluate the random/dumb agent and the second to evaluate the trained agent.
logsout=sim(mdl)
simout=sim(env,agent,simOpts)
logsout and simout are not directly comparable, but logsout is a field in the simout.SimulationInfo struct.
I am wondering if this is the best approach or if there is an easier way to do this.
Also, simout contains the action, observation, and reward, but if the reward is a weighted sum of multiple rewards, I can't access the individual rewards. (Of course, I can compare logsout with simout.logsout.)
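One way to put the two runs on common ground, sketched below assuming single simulation output is enabled so sim(mdl) returns a Simulink.SimulationOutput, and that both models log a signal named 'reward' (a placeholder name):
simout = sim(env,agent,simOpts); % trained agent
agentLogs = simout.SimulationInfo.logsout; % Simulink.SimulationData.Dataset
dummyOut = sim(mdl); % dummy-agent model
dummyLogs = dummyOut.logsout;
rAgent = agentLogs.get('reward').Values; % timeseries
rDummy = dummyLogs.get('reward').Values;
plot(rAgent.Time,rAgent.Data, rDummy.Time,rDummy.Data)
legend('trained agent','dummy agent')
Logging each reward component as its own signal in both models would also expose the individual rewards this way.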


Release

R2019a
