beta distribution in PPO


Sourabh on 2 Feb 2024

Direct link to this question:

https://es.mathworks.com/matlabcentral/answers/2077451-beta-distribution-in-ppo

Commented: Kautuk Raj on 15 Feb 2024

I want to constrain the actions of my PPO agent, and I am wondering whether I can use a Beta distribution for the policy so that the sampled actions stay inside a bounded action space.

Here is the script for the networks I am using:

----------

commonPath = [
    featureInputLayer(prod(obsInfo.Dimension),Name="comPathIn")
    fullyConnectedLayer(120)
    tanhLayer
    fullyConnectedLayer(1,Name="comPathOut")
    ];

% Define mean value path
meanPath = [
    fullyConnectedLayer(64,Name="meanPathIn")
    tanhLayer
    fullyConnectedLayer(64,Name="fc_2")
    tanhLayer
    fullyConnectedLayer(prod(actInfo.Dimension))
    leakyReluLayer(0.1,Name="meanPathOut")
    ];

% Define standard deviation path
sdevPath = [
    fullyConnectedLayer(64,Name="stdPathIn")
    tanhLayer
    fullyConnectedLayer(64)
    tanhLayer
    fullyConnectedLayer(prod(actInfo.Dimension))
    softmaxLayer(Name="stdPathOut")
    ];

% Add layers to layerGraph object
actorNet = layerGraph(commonPath);
actorNet = addLayers(actorNet,meanPath);
actorNet = addLayers(actorNet,sdevPath);

% Connect paths
actorNet = connectLayers(actorNet,"comPathOut","meanPathIn/in");
actorNet = connectLayers(actorNet,"comPathOut","stdPathIn/in");

actorNetwork = dlnetwork(actorNet);

1 comment

Kautuk Raj on 15 Feb 2024

To implement a Beta distribution for the action outputs in the PPO algorithm, I think we would need to modify the network architecture to output the parameters (alpha and beta) of the Beta distribution. These parameters must be positive, so one would typically use an activation function that ensures positivity, such as the softplus function.
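The idea in this comment can be sketched on plain arrays, outside of any agent object. This is only an illustration under stated assumptions, not a drop-in actor for the toolbox PPO agent: the raw head outputs `rawAlpha` and `rawBeta`, the bounds `lo` and `hi`, and the `+1` shift on the parameters are all hypothetical choices, and `betarnd` assumes the Statistics and Machine Learning Toolbox is available.

```matlab
% Hypothetical raw outputs of two actor heads for a single observation,
% one entry per action dimension (values are made up for illustration).
rawAlpha = [0.3; -1.2];
rawBeta  = [1.5;  0.4];

% Numerically stable softplus: log(1 + exp(x)), always positive.
softplus = @(x) log1p(exp(-abs(x))) + max(x,0);

% Positive Beta parameters. The +1 shift is an optional, assumed design
% choice that keeps both parameters above 1 so the density is unimodal.
alphaParam = softplus(rawAlpha) + 1;
betaParam  = softplus(rawBeta)  + 1;

% Assumed action bounds, e.g. actInfo.LowerLimit and actInfo.UpperLimit.
lo = [-2; 0];
hi = [ 2; 1];

% Sample u ~ Beta(alphaParam, betaParam) on (0,1), then rescale to [lo, hi].
u = betarnd(alphaParam, betaParam);
action = lo + (hi - lo) .* u;

% Log-probability of the rescaled action, as needed for the PPO ratio.
% The -log(hi - lo) term is the change-of-variables correction.
logp = (alphaParam - 1).*log(u) + (betaParam - 1).*log(1 - u) ...
       - betaln(alphaParam, betaParam) - log(hi - lo);
logpTotal = sum(logp);   % action dimensions treated as independent
```

Because a Beta sample always lies in (0,1), the rescaled action cannot leave `[lo, hi]`, which is the bounded behavior asked for in the question. In a layer-based network, the positivity constraint would typically come from a `softplusLayer` at the end of each parameter path rather than from an anonymous function.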

Answers (0)

Release

R2023a
