What is the best activation function to get action between 0 and 1 in DDPG network?
Sayak Mukherjee
on 13 Oct 2020
Commented: awcii
on 28 Jul 2023
I am using a DDPG agent to run a control algorithm whose inputs (the actions of the RL agent, 23 in total) vary between 0 and 1. I am defining this using rlNumericSpec:
actInfo = rlNumericSpec([numAct 1],'LowerLimit',0,'UpperLimit', 1);
Then I am using a tanhLayer in the actor network (similar to the bipedal robot example) and then using:
actorOptions = rlRepresentationOptions('Optimizer','adam','LearnRate',1e-4, 'GradientThreshold',1,'L2RegularizationFactor',1e-5);
actor = rlRepresentation(actorNetwork,env.getObservationInfo,env.getActionInfo, 'Observation',{'observation'}, 'Action',{'ActorTanh1'},actorOptions);
But I feel that the model is only picking the extreme values, i.e. mostly 0 and 1.
Would it be better to use a sigmoid function to get better action estimates?
Accepted Answer
Emmanouil Tzorakoleftherakis
on 15 Oct 2020
Hello,
With DDPG, a common pattern for the final 3 layers of the actor is a fully connected layer, a tanh layer and a scaling layer. Tanh bounds the output of that layer between -1 and 1, and the scaling layer then scales/shifts those values as needed based on the specifications of the actuator in your problem.
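As a rough sketch of that layer pattern for the [0, 1] action range in the question: tanh gives values in [-1, 1], so a scale of 0.5 and a bias of 0.5 map them to [0, 1]. The layer names 'ActorFC' and 'ActorScaling1' and the variable actorTail are illustrative assumptions, not taken from the original network, and the snippet assumes Deep Learning Toolbox and Reinforcement Learning Toolbox are installed.

```matlab
numAct = 23;   % number of actions, as stated in the question

% Final three layers of the actor: fully connected -> tanh -> scaling.
% Layer names other than 'ActorTanh1' are hypothetical.
actorTail = [
    fullyConnectedLayer(numAct,'Name','ActorFC')
    tanhLayer('Name','ActorTanh1')
    scalingLayer('Name','ActorScaling1','Scale',0.5,'Bias',0.5)
    ];

% Plain arithmetic check of the mapping: 0.5*tanh(x) + 0.5 lies in [0, 1].
x = linspace(-5,5,11);
y = 0.5*tanh(x) + 0.5;
```

With a layout like this, the action name passed when creating the actor representation would presumably be the scaling layer ('ActorScaling1' here) rather than 'ActorTanh1', since the scaling layer is now the network output.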
It seems the problem here is due to noise that is being added during training with DDPG to allow sufficient exploration (for example see step 1 here). The default noise options have a pretty high variance, so when this is added to the output of the tanh layer, it ends up outside the [0, 1] range and is being clipped. This is why you are only getting the two extremes.
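The clipping effect can be illustrated with a small numerical experiment in base MATLAB. This is only a sketch under simplifying assumptions: it uses plain Gaussian noise instead of the noise model the agent actually uses, random pre-activation values in place of a real actor, and two made-up noise levels (0.5 and 0.05) that are not the toolbox defaults.

```matlab
rng(0);                                   % reproducible random numbers
numSamples = 10000;

% Hypothetical actor outputs already mapped to [0, 1] via 0.5*tanh(z) + 0.5.
z = randn(numSamples,1);
a = 0.5*tanh(z) + 0.5;

% Clip to the action limits from the question (LowerLimit 0, UpperLimit 1).
clipToSpec = @(u) min(max(u,0),1);

% Add exploration noise at two illustrative standard deviations.
noiseLarge = 0.5;
noiseSmall = 0.05;
actLarge = clipToSpec(a + noiseLarge*randn(numSamples,1));
actSmall = clipToSpec(a + noiseSmall*randn(numSamples,1));

% Fraction of actions that end up exactly at a limit after clipping.
fracLarge = mean(actLarge == 0 | actLarge == 1);
fracSmall = mean(actSmall == 0 | actSmall == 1);
fprintf('At limits: %.3f (large noise) vs %.3f (small noise)\n', ...
    fracLarge, fracSmall);
```

The larger noise level pushes a much bigger share of actions onto the limits, which matches the behaviour described in the question.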
Try adjusting the DDPG noise options, and particularly the variance (make it smaller, e.g. <=0.1). Also, see here for some best practices when choosing noise parameters.
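A minimal sketch of what that adjustment might look like, assuming the agent options are created with rlDDPGAgentOptions and that your release exposes the noise variance as NoiseOptions.Variance (newer releases may use a different property name such as StandardDeviation, so check the documentation for your version). The variable name agentOpts and the decay rate value are illustrative assumptions.

```matlab
% Create default DDPG agent options (Reinforcement Learning Toolbox).
agentOpts = rlDDPGAgentOptions;

% Reduce the exploration noise so that noisy actions mostly stay
% inside the [0, 1] action range instead of being clipped.
agentOpts.NoiseOptions.Variance = 0.1;

% Optionally let the noise shrink over time (illustrative value).
agentOpts.NoiseOptions.VarianceDecayRate = 1e-5;

% The options would then be passed when constructing the agent, e.g.
% agent = rlDDPGAgent(actor,critic,agentOpts);
```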
Hope that helps