rlPPOAgent

Create proximal policy optimization reinforcement learning agent

Description


agent = rlPPOAgent(actor,critic,opt) creates a proximal policy optimization (PPO) agent with the specified actor and critic networks, using the specified PPO agent options. For more information on PPO agents, see Proximal Policy Optimization Agents.

Examples


Create an environment interface, and obtain its observation and action specifications.

env = rlPredefinedEnv("CartPole-Discrete");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);
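
The specification objects describe what the networks below must accept and emit. As an optional check you can inspect them; the expected values noted in the comments are assumptions about this predefined cart-pole environment.

obsInfo.Dimension   % observation size, expected [4 1]
actInfo.Elements    % discrete action set, expected two force values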

Create a critic representation.

criticNetwork = [
    imageInputLayer([4 1 1],'Normalization','none','Name','state')
    fullyConnectedLayer(1,'Name','CriticFC')];
criticOpts = rlRepresentationOptions('LearnRate',8e-3,'GradientThreshold',1);
critic = rlRepresentation(criticNetwork,obsInfo,'Observation',{'state'},criticOpts);

Create an actor representation.

actorNetwork = [
    imageInputLayer([4 1 1],'Normalization','none','Name','state')
    fullyConnectedLayer(2,'Name','action')];
actorOpts = rlRepresentationOptions('LearnRate',8e-3,'GradientThreshold',1);
actor = rlRepresentation(actorNetwork,obsInfo,actInfo,...
    'Observation',{'state'},'Action',{'action'},actorOpts);

Specify agent options, and create a PPO agent using the actor, critic, and agent options.

agentOpts = rlPPOAgentOptions(...
    'ExperienceHorizon',1024, ...
    'DiscountFactor',0.95);
agent = rlPPOAgent(actor,critic,agentOpts);
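
To confirm that the agent was constructed correctly, you can request an action for a random observation using getAction, the generic agent query function. The random 4-by-1 input below is only an illustrative stand-in for a real observation.

getAction(agent,{rand(4,1)})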

Input Arguments


Actor network representation for the policy, specified as either an rlLayerRepresentation or rlDLNetworkRepresentation object created using rlRepresentation. For more information on creating actor representations, see Create Policy and Value Function Representations.

Critic network representation for estimating the state-value function, specified as either an rlLayerRepresentation or rlDLNetworkRepresentation object created using rlRepresentation. For more information on creating critic representations, see Create Policy and Value Function Representations.

Agent options, specified as an rlPPOAgentOptions object.
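
A sketch of a more fully specified options object is shown below. ExperienceHorizon and DiscountFactor appear in the example above; the remaining property names (ClipFactor, EntropyLossWeight, MiniBatchSize, NumEpoch) and all values are assumptions to check against the rlPPOAgentOptions reference page.

% Illustrative values; properties other than ExperienceHorizon and
% DiscountFactor are assumed names, not confirmed by this page.
opt = rlPPOAgentOptions(...
    'ExperienceHorizon',512, ...
    'ClipFactor',0.2, ...
    'EntropyLossWeight',0.01, ...
    'MiniBatchSize',64, ...
    'NumEpoch',3, ...
    'DiscountFactor',0.99);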

Output Arguments


PPO agent, returned as an rlPPOAgent object.
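
You can pass the returned agent to the toolbox training and simulation functions. The following is a minimal training sketch, assuming the cart-pole environment and agent created in the example above; the training option values are illustrative, not tuned.

% Illustrative training setup for the agent created above.
trainOpts = rlTrainingOptions(...
    'MaxEpisodes',500, ...
    'MaxStepsPerEpisode',500, ...
    'StopTrainingCriteria','AverageReward', ...
    'StopTrainingValue',480);
trainStats = train(agent,env,trainOpts);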

Introduced in R2019b