RL designer toolbox | PPO agent | NaN output as policy

Atusa on 16 Apr 2025
Commented: praguna manvi on 22 Apr 2025
Dear all,
I'm using a PPO agent in a Simulink environment and training it with the RL Designer toolbox.
Partway through training, at a different episode from run to run, I get the following error:
Error:Block '.../agent Obj/Evaluate Policy/Execute Policy/Enabled Policy Evaluator/Policy Evaluator/Policy Process Experience Internal' outputs 'NaN' for element 1 of output port 1 at major time step 0.
My agent's action is a single numeric value bounded to [-1, 1], so I would not expect any NaN values.
I'm using an rlContinuousGaussianActor with a softplusLayer as the last layer of the standard-deviation output, and a tanhLayer followed by a scalingLayer with Scale=1 as the output layer of the action mean. I have also used the following command:
actInfo = rlNumericSpec([1 1], 'LowerLimit', -1, 'UpperLimit', 1);
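For readers who want to reproduce the setup, the actor described above could look roughly like the following. This is only a minimal sketch: the observation size (`[4 1]`), the hidden layer width (64), and all layer names are hypothetical, since the post does not state them.

```matlab
% Hypothetical observation spec; the post does not give the observation size.
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([1 1], 'LowerLimit', -1, 'UpperLimit', 1);

% Shared input path (layer width is an assumption).
commonPath = [
    featureInputLayer(prod(obsInfo.Dimension), 'Name', 'obs')
    fullyConnectedLayer(64, 'Name', 'fc1')
    reluLayer('Name', 'relu1')];

% Mean path: tanhLayer followed by scalingLayer with Scale=1, as in the post.
meanPath = [
    fullyConnectedLayer(1, 'Name', 'fcMean')
    tanhLayer('Name', 'tanhMean')
    scalingLayer('Name', 'meanOut', 'Scale', 1)];

% Standard-deviation path: softplusLayer as the last layer, as in the post.
stdPath = [
    fullyConnectedLayer(1, 'Name', 'fcStd')
    softplusLayer('Name', 'stdOut')];

% Assemble the two-output network.
net = layerGraph(commonPath);
net = addLayers(net, meanPath);
net = addLayers(net, stdPath);
net = connectLayers(net, 'relu1', 'fcMean');
net = connectLayers(net, 'relu1', 'fcStd');
net = dlnetwork(net);

actor = rlContinuousGaussianActor(net, obsInfo, actInfo, ...
    'ObservationInputNames', 'obs', ...
    'ActionMeanOutputNames', 'meanOut', ...
    'ActionStandardDeviationOutputNames', 'stdOut');
```

Calling `getAction(actor, {rand(obsInfo.Dimension)})` on a freshly created actor is a quick way to confirm that the network itself returns a finite action before any training takes place.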
I'm resetting my environment at each episode using the following command:
simEnv.ResetFcn = @(in) setVariable(in,"q",0,"Workspace",mdl);
I'm sure there's no singularity in my model, as the simulation runs with no error. The error only happens during training with the toolbox, and it doesn't always happen.
I'd appreciate it if you could help me solve this issue.
Thank you very much.

Answers (0)

Release

R2024a
