
rlAgentInitializationOptions

Options for initializing reinforcement learning agents

Since R2020b

Description

Use the rlAgentInitializationOptions object to specify initialization options for an agent. To create an agent, use an agent creation function such as rlACAgent.

Creation

Description

initOpts = rlAgentInitializationOptions creates a default options object for initializing a reinforcement learning agent with default networks. Use the initialization options to specify agent initialization parameters, such as the number of units for each hidden layer of the agent networks and whether to use a recurrent neural network.

initOpts = rlAgentInitializationOptions(Name=Value) creates an initialization options object and sets its properties using one or more name-value arguments.
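
For example, a minimal sketch of both creation forms (the NumHiddenUnit property is described below):

initOpts = rlAgentInitializationOptions;                    % default options
initOpts = rlAgentInitializationOptions(NumHiddenUnit=128); % set a property at creation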


Properties


NumHiddenUnit

Number of units in each hidden fully connected layer of the agent networks, except for the fully connected layer just before the network output, specified as a positive integer. The default value is 256. The value you set also applies to any LSTM layers.

Example: 64

UseRNN

Flag to use a recurrent neural network, specified as a logical value. The default value is false.

If you set UseRNN to true, during agent creation, the software inserts a recurrent LSTM layer with the output mode set to sequence in the output path of the agent networks. For more information on LSTM, see Long Short-Term Memory Neural Networks.

Note

TRPO agents do not support recurrent networks.

Example: true
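
For example, the following sketch (with hypothetical continuous observation and action specifications, not part of this page) creates a default PPO agent that uses a recurrent network and lists the layers of its actor network; the exact layer names and ordering depend on the release.

% Hypothetical channel specifications for illustration.
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([1 1]);

% Request default networks that include a recurrent LSTM layer.
initOpts = rlAgentInitializationOptions(UseRNN=true);
agent = rlPPOAgent(obsInfo,actInfo,initOpts);

% Inspect the actor network; the layer list includes an LSTM layer.
actorNet = getModel(getActor(agent));
actorNet.Layers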

Normalization

Normalization method for the inputs of the actor and critic in the agent, based on the limits defined in the channel specification objects. Specify this property as one of the following values:

  • "none" — Do not normalize the input of the actor and critic objects.

  • "rescale-zero-one" — Normalize the input by rescaling it to the interval between 0 and 1. The normalized input Y is (UMin)./(UpperLimitLowerLimit), where U is the nonnormalized input. Note that nonnormalized input values lower than LowerLimit result in normalized values lower than 0. Similarly, nonnormalized input values higher than UpperLimit result in normalized values higher than 1. Here, UpperLimit and LowerLimit are the corresponding properties defined in the specification object of the input channel.

  • "rescale-symmetric" — Normalize the input by rescaling it to the interval between –1 and 1. The normalized input Y is 2(ULowerLimit)./(UpperLimitLowerLimit) – 1, where U is the nonnormalized input. Note that nonnormalized input values lower than LowerLimit result in normalized values lower than –1. Similarly, nonnormalized input values higher than UpperLimit result in normalized values higher than 1. Here, UpperLimit and LowerLimit are the corresponding properties defined in the specification object of the input channel.

Note

When you specify the Normalization property of rlAgentInitializationOptions, normalization is applied only to the approximator input channels corresponding to rlNumericSpec specification objects in which both the UpperLimit and LowerLimit properties are defined. After you create the agent, you can use setNormalizer to assign normalizers that use any normalization method. For more information on normalizer objects, see rlNormalizer.

Example: "rescale-symmetric"

Object Functions

rlACAgent      Actor-critic (AC) reinforcement learning agent
rlPGAgent      Policy gradient (PG) reinforcement learning agent
rlDDPGAgent    Deep deterministic policy gradient (DDPG) reinforcement learning agent
rlDQNAgent     Deep Q-network (DQN) reinforcement learning agent
rlPPOAgent     Proximal policy optimization (PPO) reinforcement learning agent
rlTD3Agent     Twin-delayed deep deterministic (TD3) policy gradient reinforcement learning agent
rlSACAgent     Soft actor-critic (SAC) reinforcement learning agent
rlTRPOAgent    Trust region policy optimization (TRPO) reinforcement learning agent

Examples


Create an agent initialization options object. Specify the number of hidden neurons for each fully connected layer, and specify that the agent uses a recurrent neural network.

agtInitOpts = rlAgentInitializationOptions(NumHiddenUnit=64,UseRNN=true)
agtInitOpts = 
  rlAgentInitializationOptions with properties:

    NumHiddenUnit: 64
           UseRNN: 1
    Normalization: "none"

You can modify the options using dot notation. For example, set the number of hidden units to 128.

agtInitOpts.NumHiddenUnit = 128
agtInitOpts = 
  rlAgentInitializationOptions with properties:

    NumHiddenUnit: 128
           UseRNN: 1
    Normalization: "none"

To create a default agent in which the inputs are normalized according to the limits defined in the channel specification objects, set the Normalization property of agtInitOpts. For example, set it to "rescale-zero-one" to normalize the inputs to the interval between zero and one.

agtInitOpts.Normalization = "rescale-zero-one"
agtInitOpts = 
  rlAgentInitializationOptions with properties:

    NumHiddenUnit: 128
           UseRNN: 1
    Normalization: "rescale-zero-one"

To create your agent, use agtInitOpts as an input argument of an agent constructor function.
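
For example, a minimal sketch (the channel specifications here are hypothetical and only illustrate the call):

% Hypothetical channel specifications for illustration.
obsInfo = rlNumericSpec([4 1],LowerLimit=-10,UpperLimit=10);
actInfo = rlFiniteSetSpec([-1 0 1]);

% Create a default AC agent using the initialization options from this example.
agent = rlACAgent(obsInfo,actInfo,agtInitOpts);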

This example shows how to create a default agent in which the inputs are normalized according to the limits defined in the channel specification objects.

Define specification objects for the action and observation channels.

obsInfo = [ 
    rlNumericSpec([3,1],LowerLimit=-10,UpperLimit=10) 
    rlNumericSpec([2,1],LowerLimit=-3,UpperLimit=[3 5]')
    ]
obsInfo=2×1 rlNumericSpec array with properties:
    LowerLimit
    UpperLimit
    Name
    Description
    Dimension
    DataType

actInfo = rlNumericSpec([2,1],LowerLimit=-1,UpperLimit=9)
actInfo = 
  rlNumericSpec with properties:

     LowerLimit: -1
     UpperLimit: 9
           Name: [0x0 string]
    Description: [0x0 string]
      Dimension: [2 1]
       DataType: "double"

Create an agent initialization options object, specifying symmetric normalization.

agtInitOpts = rlAgentInitializationOptions( ...
    Normalization="rescale-symmetric")
agtInitOpts = 
  rlAgentInitializationOptions with properties:

    NumHiddenUnit: 256
           UseRNN: 0
    Normalization: "rescale-symmetric"

Create a default PPO agent.

agent = rlPPOAgent(obsInfo,actInfo,agtInitOpts);

When the agent is created, an rlNormalizer object is applied to the input channels of both the actor and the critic. If an input channel is not continuous, or does not have finite upper and lower limits, normalization is not applied to that channel and a warning is displayed.

Extract and display the approximator objects.

actor = getActor(agent)
actor = 
  rlContinuousGaussianActor with properties:

    ObservationInfo: [2x1 rl.util.rlNumericSpec]
         ActionInfo: [1x1 rl.util.rlNumericSpec]
      Normalization: ["rescale-symmetric"    "rescale-symmetric"]
          UseDevice: "cpu"
         Learnables: {10x1 cell}
              State: {0x1 cell}

critic = getCritic(agent)
critic = 
  rlValueFunction with properties:

    ObservationInfo: [2x1 rl.util.rlNumericSpec]
      Normalization: ["rescale-symmetric"    "rescale-symmetric"]
          UseDevice: "cpu"
         Learnables: {8x1 cell}
              State: {0x1 cell}

To specify a different normalization type for some inputs, first create an rlNormalizer object. Alternatively, you can use getNormalizer to extract the array of normalizer objects from the actor or critic and then modify any of its elements using dot notation.

obs2nrmz = rlNormalizer(obsInfo(2).Dimension, ...
    Normalization="zerocenter", Mean=6)
obs2nrmz = 
  rlNormalizer with properties:

        Dimension: [2 1]
    Normalization: "zerocenter"
             Mean: 6

Then, to assign the normalizer object to the desired input channel of the actor or critic, use setNormalizer. For this example, apply obs2nrmz to the second observation channel of the actor.

actor = setNormalizer(actor,obs2nrmz,2)
actor = 
  rlContinuousGaussianActor with properties:

    ObservationInfo: [2x1 rl.util.rlNumericSpec]
         ActionInfo: [1x1 rl.util.rlNumericSpec]
      Normalization: ["rescale-symmetric"    "zerocenter"]
          UseDevice: "cpu"
         Learnables: {10x1 cell}
              State: {0x1 cell}
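
You can also extract the normalizer array from the updated actor with getNormalizer to confirm the assignment (a brief sketch; the displayed fields depend on the release).

nrmzArray = getNormalizer(actor)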

Then, to assign the new actor to the agent, use setActor.

setActor(agent,actor);

To check that the agent works, use getAction.

a = getAction(agent, { ...
    rand(obsInfo(1).Dimension) ...
    rand(obsInfo(2).Dimension) ...
    });
a{1}
ans = 2×1

    6.0517
    3.2648

Version History

Introduced in R2020b