How to set initial estimate for mean in PPO actor critic network

Question

Jason Butler on 8 May 2024

Direct link to this question

https://es.mathworks.com/matlabcentral/answers/2116811-how-to-set-initial-estimate-for-mean-in-ppo-actor-critic-network

Answered: Aneela on 22 May 2024

I am using a PPO actor-critic network. I created the actor following this example:

https://www.mathworks.com/help/reinforcement-learning/ref/rl.function.rlcontinuousgaussianactor.html

How can I set an initial guess for the mean? Currently the actor always starts with an initial mean of zero at time zero.

Answer 1

Aneela on 22 May 2024

Direct link to this answer

https://es.mathworks.com/matlabcentral/answers/2116811-how-to-set-initial-estimate-for-mean-in-ppo-actor-critic-network#answer_1461451

Hi Jason Butler,

To set an initial guess for the mean in a PPO actor network, modify the initial weights or biases of the layers that compute the mean.

The simplest option is to put the initial guess into the bias of the “fullyConnectedLayer” in the mean path.
However, because the mean path also contains a “tanhLayer” and a “scalingLayer”, the bias does not translate directly into the output mean, so hitting a specific value exactly takes some care.
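To make that relationship concrete, here is a short derivation. It assumes the mean path from the linked example (tanh, then fully connected, then scaling, with the scaling layer's own bias left at zero). The symbols h (input to the mean path), W and b (weights and bias of the fully connected layer), s (the scale, actInfo.UpperLimit) and \mu_0 (the desired initial mean) are my own notation, not names from the toolbox.

```latex
\[
\begin{aligned}
a   &= \tanh(h) \\
z   &= W a + b \\
\mu &= s \odot z = s \odot \bigl( W \tanh(h) + b \bigr)
\end{aligned}
\]
% Choosing the bias as b = \mu_0 \oslash s (element-wise division) gives
\[
\mu = \mu_0 + s \odot W \tanh(h)
\]
```

So the bias shifts the mean to \mu_0, and the remaining term s \odot W \tanh(h) depends on the initial weights and on the observation. The initial mean is therefore only approximately \mu_0 unless that term is small.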

Assuming a desired initial mean of 5, here’s a workaround:

desiredInitialMean = 5; % Adjust this value as needed
numActions = prod(actInfo.Dimension);
% One bias element per action. Element-wise division (./) keeps this correct
% when actInfo.UpperLimit is a vector; the upper limit must be finite.
biasForDesiredMean = (desiredInitialMean ./ actInfo.UpperLimit(:)) .* ones(numActions, 1);
% Modify the meanPath definition so that fcMean starts with this bias.
% Name=Value syntax is used throughout: a Name=Value argument may not be
% followed by a quoted 'Name',Value pair in the same call.
meanPath = [
    tanhLayer(Name="tanhMean");
    fullyConnectedLayer(numActions, ...
        Bias=biasForDesiredMean, ...
        Name="fcMean");
    scalingLayer(Name="scale", ...
        Scale=actInfo.UpperLimit)
    ];
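As a rough sanity check, the arithmetic of this mean path can be reproduced in plain MATLAB, without any toolbox. This is only a sketch: the number of actions, the hidden width, the upper limit of 10, and the random stand-ins for the initial weights and the upstream activation are all assumptions made for illustration, not values from the question.

```matlab
% Plain-MATLAB imitation of the mean path: tanh -> fully connected -> scaling.
% Every number below is an illustrative assumption.
rng(0);                                    % reproducible random draw
numActions  = 3;                           % assumed number of actions
numHidden   = 16;                          % assumed width of the layer feeding the mean path
upperLimit  = 10 * ones(numActions, 1);    % assumed finite actInfo.UpperLimit
desiredMean = 5;

b = desiredMean ./ upperLimit;             % bias chosen as in the workaround
W = 0.01 * randn(numActions, numHidden);   % stand-in for small initial weights
h = randn(numHidden, 1);                   % stand-in for the upstream activation

% Mean with small random weights: close to the desired value, but not exact.
meanSmallW = upperLimit .* (W * tanh(h) + b);

% Mean with zero weights: only the scaled bias remains.
meanZeroW = upperLimit .* (zeros(size(W)) * tanh(h) + b);

disp(max(abs(meanSmallW - desiredMean)))   % deviation caused by the weights
disp(max(abs(meanZeroW  - desiredMean)))   % deviation with zero weights
```

If the starting value has to be exact, one option (my suggestion, not part of the original answer) is to also initialize the weights of fcMean to zeros or very small values through the “Weights” argument of “fullyConnectedLayer”. The trade-off is that the initial mean then barely depends on the observation until training updates those weights.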

For more information on “Bias” in the “fullyConnectedLayer”, refer to the following MathWorks documentation: https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.fullyconnectedlayer.html?s_tid=doc_ta#:~:text=Layer%20biases%2C%20specified,single%20%7C%20double
