# rlPGAgentOptions

## Description

Use an `rlPGAgentOptions` object to specify options for policy gradient (PG) agents. To create a PG agent, use `rlPGAgent`.

For more information on the different types of reinforcement learning agents, see Reinforcement Learning Agents.

## Creation

### Syntax

``opt = rlPGAgentOptions``
``opt = rlPGAgentOptions(Name,Value)``

### Description

`opt = rlPGAgentOptions` creates an `rlPGAgentOptions` object with all default settings, for use as an argument when creating a PG agent. You can modify the object properties using dot notation.


`opt = rlPGAgentOptions(Name,Value)` sets option properties using name-value pairs. For example, `rlPGAgentOptions('DiscountFactor',0.95)` creates an option set with a discount factor of `0.95`. You can specify multiple name-value pairs. Enclose each property name in quotes.
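As a minimal sketch of passing several name-value pairs at once, the following call sets two of the properties documented on this page. The specific values are illustrative only, and the code assumes Reinforcement Learning Toolbox is installed.

```matlab
% Set the discount factor and the sample time in a single call.
% Each property name is enclosed in quotes and followed by its value.
opt = rlPGAgentOptions('DiscountFactor',0.95,'SampleTime',0.5);

% Read the values back using dot notation.
disp(opt.DiscountFactor)
disp(opt.SampleTime)
```

Properties that are not named in the call keep their default values.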

## Properties


`UseBaseline`: Instruction to use a baseline for learning, specified as a logical value. When `UseBaseline` is true, you must specify a critic network as the baseline function approximator.

In general, for simpler problems with smaller actor networks, PG agents work better without a baseline.

`SampleTime`: Sample time of the agent, specified as a positive scalar.

`DiscountFactor`: Discount factor applied to future rewards during training, specified as a positive scalar less than or equal to 1.
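To illustrate what the discount factor does, the following sketch computes a discounted return by weighting the reward received k steps in the future by the discount factor raised to the power k. The reward sequence and the variable names (`gamma`, `rewards`, `G`) are hypothetical and are not part of the toolbox. The code uses only base MATLAB.

```matlab
% Discount factor (corresponds to the DiscountFactor option).
gamma = 0.9;

% Hypothetical rewards received at steps t, t+1, ..., t+4.
rewards = [1 1 1 1 1];

% Exponent for each reward: 0 for the immediate reward, 1 for the next, ...
k = 0:numel(rewards)-1;

% Discounted return: sum over k of gamma^k * reward(k).
G = sum(gamma.^k .* rewards);

fprintf('Discounted return: %.4f\n', G);
```

With a discount factor of 1, all rewards count equally. Smaller values make the agent weight near-term rewards more heavily than distant ones.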

`EntropyLossWeight`: Entropy loss weight, specified as a scalar value between `0` and `1`. A higher loss weight value promotes agent exploration by applying a penalty for being too certain about which action to take. Doing so can help the agent move out of local optima.

The entropy loss function for episode step t is:

`$H_t = E \sum_{k=1}^{M} \mu_k(S_t \mid \theta_\mu) \ln \mu_k(S_t \mid \theta_\mu)$`

Here:

• E is the entropy loss weight.

• M is the number of possible actions.

• μ_k(S_t|θ_μ) is the probability of taking action A_k when in state S_t, following the current policy with parameters θ_μ.

When gradients are computed during training, an additional gradient component is computed for minimizing this loss function.
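The following sketch evaluates the entropy loss formula above for two hypothetical action distributions, to show why minimizing it discourages overly certain policies. The probability vectors, the weight value, and the helper name `entropyLoss` are assumptions made for illustration and are not part of the toolbox. The code uses only base MATLAB and assumes all probabilities are strictly positive, because `0*log(0)` evaluates to `NaN`.

```matlab
% Entropy loss weight E (corresponds to the EntropyLossWeight option).
% The value 0.01 is illustrative only.
E = 0.01;

% Hypothetical action probabilities mu_k(S_t) for M = 3 actions.
muPeaked  = [0.7 0.2 0.1];   % policy fairly certain about the first action
muUniform = [1 1 1] / 3;     % policy equally likely to take any action

% H_t = E * sum over k of mu_k * ln(mu_k)
entropyLoss = @(mu) E * sum(mu .* log(mu));

HPeaked  = entropyLoss(muPeaked);
HUniform = entropyLoss(muUniform);

fprintf('Peaked policy:  H_t = %.5f\n', HPeaked);
fprintf('Uniform policy: H_t = %.5f\n', HUniform);
```

The uniform distribution gives the lower (more negative) loss, so the additional gradient component pushes the policy toward spreading probability across actions. A larger weight E strengthens that push.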

## Object Functions

`rlPGAgent`: Policy gradient reinforcement learning agent

## Examples


This example shows how to create and modify a PG agent options object.

Create a PG agent options object, specifying the discount factor.

`opt = rlPGAgentOptions('DiscountFactor',0.9)`
```
opt = 
  rlPGAgentOptions with properties:

          UseBaseline: 1
    EntropyLossWeight: 0
           SampleTime: 1
       DiscountFactor: 0.9000
```

You can modify options using dot notation. For example, set the agent sample time to `0.5`.

`opt.SampleTime = 0.5;`