rlDDPGAgent

Create deep deterministic policy gradient reinforcement learning agent

Syntax

agent = rlDDPGAgent(actor,critic,opt)

Description


agent = rlDDPGAgent(actor,critic,opt) creates a DDPG agent with the specified actor and critic representations, using the specified DDPG agent options. For more information on DDPG agents, see Deep Deterministic Policy Gradient Agents.

Examples


Create a DDPG agent with actor and critic

First, create an environment with a continuous action space, and obtain its observation and action specifications.

env = rlPredefinedEnv("DoubleIntegrator-Continuous");
obsInfo = getObservationInfo(env);
numObservations = obsInfo.Dimension(1);
actInfo = getActionInfo(env);
numActions = numel(actInfo);

Create a critic representation.

% Observation and action input paths
statePath = imageInputLayer([numObservations 1 1],'Normalization','none','Name','state');
actionPath = imageInputLayer([numActions 1 1],'Normalization','none','Name','action');

% Common path: concatenate the inputs, then approximate the value with a
% quadratic layer followed by a single fully connected output
commonPath = [concatenationLayer(1,2,'Name','concat')
              quadraticLayer('Name','quadratic')
              fullyConnectedLayer(1,'Name','StateValue','BiasLearnRateFactor',0,'Bias',0)];

% Assemble the network and connect the input paths to the common path
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
criticNetwork = connectLayers(criticNetwork,'state','concat/in1');
criticNetwork = connectLayers(criticNetwork,'action','concat/in2');

% Create the critic representation from the network and the specifications
criticOpts = rlRepresentationOptions('LearnRate',5e-3,'GradientThreshold',1);
critic = rlRepresentation(criticNetwork,obsInfo,actInfo,...
    'Observation',{'state'},'Action',{'action'},criticOpts);

Create an actor representation.

% The actor network maps the observation directly to the action
actorNetwork = [
    imageInputLayer([numObservations 1 1],'Normalization','none','Name','state')
    fullyConnectedLayer(numActions,'Name','action','BiasLearnRateFactor',0,'Bias',0)];

% Create the actor representation from the network and the specifications
actorOpts = rlRepresentationOptions('LearnRate',1e-04,'GradientThreshold',1);
actor = rlRepresentation(actorNetwork,obsInfo,actInfo,...
    'Observation',{'state'},'Action',{'action'},actorOpts);

Specify agent options, and create a DDPG agent using the actor and critic representations.

agentOpts = rlDDPGAgentOptions(...
    'SampleTime',env.Ts,...
    'TargetSmoothFactor',1e-3,...
    'ExperienceBufferLength',1e6,...
    'DiscountFactor',0.99,...
    'MiniBatchSize',32);
agent = rlDDPGAgent(actor,critic,agentOpts);
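
As a quick check, you can query the new agent for an action from a random observation. This is a minimal sketch; the 2-by-1 observation size matches the double-integrator state (position and velocity).

% Query the agent for an action given a random 2-by-1 observation
act = getAction(agent,{rand(2,1)});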

Input Arguments


actor — Actor network representation, specified as an rlLayerRepresentation object created using rlRepresentation. For more information on creating actor representations, see Create Policy and Value Function Representations.

critic — Critic network representation, specified as an rlLayerRepresentation object created using rlRepresentation. For more information on creating critic representations, see Create Policy and Value Function Representations.

opt — Agent options, specified as an rlDDPGAgentOptions object.

Output Arguments


agent — DDPG agent, returned as an rlDDPGAgent object.
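
Once created, you can train the agent against the environment with train. The following is a minimal sketch; the episode and step limits are illustrative values, not taken from this page.

% Train the agent (hyperparameters shown here are illustrative)
trainOpts = rlTrainingOptions(...
    'MaxEpisodes',1000,...
    'MaxStepsPerEpisode',200);
trainStats = train(agent,env,trainOpts);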

Introduced in R2019a